Sitemap - 2023 - Gonzo ML
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
For Distillation, Tokens Are Not All You Need
Toward understanding the communication in sperm whales
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
[S4] Efficiently Modeling Long Sequences with Structured State Spaces
Conway’s Game of Life is Omniperiodic
[Google] Gemini: A Family of Highly Capable Multimodal Models
System 2 Attention (is something you might need too)
🪆Matryoshka Representation Learning
Mindstorms in Natural Language-Based Societies of Mind
The convolution empire strikes back
"Building Machines That Learn and Think Like People", 7 years later
Chain-of-Thought → Tree-of-Thought
Turing, “Intelligent Machinery, A Heretical Theory”, 1951
Generative Agents: Interactive Simulacra of Human Behavior
Uncovering mesa-optimization algorithms in Transformers
Textbooks Are All You Need II: phi-1.5 technical report