Inverse Scaling: When Bigger Isn't Better
The 2023 paper cataloguing tasks where larger language models do worse, complicating the assumption that scale always helps.
What the papers actually said - linked to the originals.
The 2023 Microsoft paper introducing phi-1, a 1.3B code model that beat far larger models by training on 'textbook-quality' data, launching the Phi family.
An open toolkit, dataset, and retrieval-augmented prover for AI theorem proving in the Lean proof assistant.
SDXL was a larger, higher-quality successor to Stable Diffusion using a bigger backbone, dual text encoders, and a refiner.
Huawei's Pangu-Weather, published in Nature in 2023, used 3D deep networks to beat the leading operational forecast system while running in seconds.
Study showing models use information best at the start and end of a long context, and often miss facts buried in the middle.
The Baker lab's 2023 Nature paper introduced RFdiffusion, a diffusion model that designs new proteins from scratch.
An MIT experiment found ChatGPT cut professional writing time by 40% and raised quality 18%, helping weaker writers most.
A 2023 framework where LLM agents play software roles and build programs through a chat chain across design, coding, and testing.
RetNet's retention mechanism supports parallel training and O(1)-memory recurrent inference in one architecture.
Method that frames constrained decoding as a finite-state machine, forcing model output to match a regex or grammar cheaply.
The 2023 survey systematizing where RLHF breaks down - flawed human feedback, imperfect reward models, and brittle policy optimization.
The 2023 paper that automatically generated adversarial text suffixes which jailbreak aligned LLMs and transfer to ChatGPT, Bard, and Claude.
RT-2 turned a web-trained vision-language model into a robot policy by emitting actions as text tokens, carrying web-learned semantic reasoning over to robot control.
A 2023 paper and dataset (ToolBench) that taught an open model to call over 16,000 real REST APIs.
A 2023 framework that runs an LLM agent team like a software company, encoding standard operating procedures into the agents' prompts.
A 2023 paper estimating how much language models cut the cost of producing propaganda for online influence operations.
The 2023 paper that replaced neural radiance fields with millions of 3D Gaussians for real-time, high-quality novel-view rendering.
Microsoft's 2023 framework for building LLM applications out of conversable agents that talk to each other, humans, and tools.
A 2023 report derives indicators of consciousness from neuroscience and finds no current AI system to be a strong candidate.
The 2023 paper showing you can steer a model's behavior by adding a contrast-derived vector to its activations during the forward pass, no retraining.
A Stanford BrainGate2 study decoded attempted speech from cortical electrodes, reaching usable accuracy on a 125,000-word vocabulary.
A vision-based autonomous drone trained with deep RL beat human world-champion pilots in real head-to-head races.
Method to stretch a trained model's context window far beyond its original limit using a fraction of the usual fine-tuning.
The 2023 Google paper showing AI-generated preference labels can match human ones for RLHF, with a direct variant skipping the reward model.
The 2023 paper introducing PagedAttention and vLLM, a serving system that raised LLM inference throughput 2-4x by managing the KV cache like virtual memory.
The 2023 Meta paper where a model drafts an answer, asks itself verification questions, answers them independently, then revises.
The 2023 paper showing LLMs trained on 'A is B' often fail to answer 'B is A', exposing a basic generalization gap.
Finding that keeping a few initial tokens as attention sinks lets models stream very long inputs without fine-tuning.
Method that splits a long sequence across devices and overlaps communication with compute to scale context with device count.
The 2023 Anthropic paper using sparse autoencoders to split a one-layer model's neurons into thousands of clean, single-meaning features.
A 2023 method that gives language agents Monte Carlo tree search, so they can plan, act, and reflect by exploring many paths.
The 2023 Google DeepMind paper on step-back prompting, asking a model to abstract to general principles before solving the specific problem.
A 2023 Berkeley paper that borrowed OS virtual-memory ideas to give LLM agents persistent memory beyond their context window.
A 21-institution collaboration pooled data from 22 robots, showing one policy trained across embodiments transfers between them.
RAG framework where the model decides when to retrieve and uses reflection tokens to critique passages and its own output.
NVIDIA's Eureka used GPT-4 to write reward code and refine it through evolutionary search, beating human-designed rewards on 83% of 29 RL tasks.
A follow-up to Glaze that lets artists 'poison' images so models scraping them without consent learn corrupted concepts.
A 2023 audit that traced the licenses and lineage of over 1,800 text datasets and found widespread license misattribution in AI training data.
A 2023 DeepMind paper proposes a five-level scale for AGI, ranked by both performance depth and breadth of generality.
A Google paper introduced NeuralGCM, a hybrid that pairs a physics solver with learned components for both weather forecasts and decade-long climate runs.
Stability AI's Stable Video Diffusion turned a latent image diffusion model into an open video generator via staged training.
The 2023 paper whose 'divergence attack' made ChatGPT spit out memorized training data by asking it to repeat a word forever.
DeepMind's GNoME paper, in Nature in 2023, used graph networks to discover 2.2 million new crystal structures, 380,000 of them predicted to be stable.
The 2023 Gu-Dao paper introducing Mamba, a selective state-space architecture that scales linearly with sequence length and rivals Transformers.
System and language for multi-call LLM programs, using RadixAttention to reuse KV cache and reach up to 6.4x throughput.
The 2023 OpenAI paper showing a strong model fine-tuned on a weak model's labels can outperform its weak supervisor, a toy model for superalignment.
A 2024 Anthropic paper showed that a deceptive backdoor trained into a language model could survive standard safety training.