Landmark Papers

What the papers actually said - linked to the originals.

644 entries, all primary-sourced

paper February 8, 2022

Survey of Hallucination in Natural Language Generation

The 2022 Ji et al. survey that organized the study of hallucination across NLP tasks and became a standard reference.

paper February 16, 2022

Magnetic control of tokamak plasmas through deep reinforcement learning

A 2022 Nature paper by DeepMind and EPFL used reinforcement learning to control the magnetic coils of a real tokamak and sculpt fusion plasma shapes.

paper March 4, 2022

Training Language Models to Follow Instructions with Human Feedback (InstructGPT)

The 2022 OpenAI paper behind InstructGPT, showing that outputs from a 1.3B model tuned with human feedback were preferred by labelers over those of the 175B GPT-3; the same recipe underlies ChatGPT.

paper March 8, 2022

In-context Learning and Induction Heads

The 2022 Anthropic paper identifying induction heads, an attention circuit that appears to drive in-context learning in transformers.

paper March 21, 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models

The 2022 Wang et al. method that samples many reasoning paths and takes a majority vote on the final answer.
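
As a rough illustration of the voting step only, the sketch below takes final answers that have already been parsed out of several sampled reasoning paths and returns the most frequent one. The function name `majority_answer` and the sample answers are hypothetical, and sampling from a model is left out entirely.

```python
from collections import Counter


def majority_answer(final_answers):
    """Return the most common final answer among sampled reasoning paths."""
    if not final_answers:
        raise ValueError("need at least one sampled answer")
    # most_common(1) yields the single (answer, count) pair with the top count;
    # ties are resolved in favor of the answer that was seen first.
    return Counter(final_answers).most_common(1)[0][0]


# Hypothetical final answers extracted from five sampled chains of thought.
samples = ["18", "18", "26", "18", "9"]
print(majority_answer(samples))  # prints 18
```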

paper April 4, 2022

SayCan: Grounding Language in Robotic Affordances

SayCan paired a language model's task knowledge with robot skill value functions so a robot only attempts steps it can actually do.

paper April 13, 2022

Hierarchical Text-Conditional Image Generation with CLIP Latents (DALL-E 2 / unCLIP)

The 2022 OpenAI paper behind DALL-E 2, generating images by inverting CLIP embeddings through a prior and a diffusion decoder.

paper April 29, 2022

Flamingo: a Visual Language Model for Few-Shot Learning

DeepMind's Flamingo bridged frozen vision and language models so one model handled new image tasks from a few examples.

paper May 9, 2022

NaturalSpeech: End-to-End Text to Speech with Human-Level Quality

A 2022 Microsoft TTS system that, on a standard benchmark, produced speech statistically indistinguishable from human recordings.

paper May 21, 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

The 2022 Zhou et al. paper that solves hard problems by decomposing them into simpler subproblems solved in sequence.

paper May 23, 2022

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)

Google's 2022 Imagen paper showed a frozen text-only language model is a surprisingly strong text encoder for image generation.

paper May 24, 2022

Large Language Models are Zero-Shot Reasoners (Let's think step by step)

The 2022 Kojima et al. paper showing the phrase 'Let's think step by step' triggers multi-step reasoning with zero examples.

paper May 25, 2022

Autoformalization with Large Language Models

Showed LLMs can translate natural-language math into formal statements, perfectly converting about a quarter of competition problems into formal specifications.

paper May 26, 2022

Matryoshka Representation Learning

Training method that packs coarse-to-fine detail into one embedding, so you can truncate it to shorter vectors without retraining.
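
A minimal sketch of what truncation looks like at inference time, assuming the embedding was trained so that its leading coordinates are usable on their own. The helper `truncate_embedding` and the example vector are made up for illustration, and rescaling the shortened vector to unit length is an assumption here (a common choice when comparing by cosine similarity), not something the entry above specifies.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` coordinates of `vec` and rescale to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Made-up 3-dimensional embedding, shortened to its first 2 coordinates.
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)
print(short)
```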

paper May 27, 2022

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

The 2022 paper that sped up Transformer attention by minimizing GPU memory traffic rather than approximating, enabling far longer context windows.

paper June 15, 2022

Emergent Abilities of Large Language Models

The 2022 Wei et al. paper arguing that some LLM capabilities appear suddenly at scale, absent in small models and unpredictable from their trends.

paper July 5, 2022

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

Hollmann and colleagues' TabPFN, a pre-trained transformer that classifies small tabular datasets in seconds with no training or tuning.

paper July 11, 2022

Orca: A Distributed Serving System for Transformer-Based Generative Models

OSDI 2022 system that introduced iteration-level scheduling (continuous batching) to serve generative transformers far more efficiently.

paper July 18, 2022

Why Do Tree-Based Models Still Outperform Deep Learning on Tabular Data?

Grinsztajn, Oyallon, and Varoquaux's careful benchmark showing tree ensembles still beat deep learning on medium-sized tabular data, and explaining why.

paper July 26, 2022

Classifier-Free Diffusion Guidance

Classifier-free guidance let diffusion models follow their conditioning signal, such as a class label or text prompt, more closely without needing a separate classifier.
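
The core of the method is a single extrapolation between two noise predictions from the same network, one made with the conditioning and one without. The sketch below uses the guidance-scale convention common in implementations, where a scale of 1 returns the conditional prediction and larger values push further away from the unconditional one; the function name and the plain-list inputs are illustrative assumptions, not any library's API.

```python
def guided_noise(eps_uncond, eps_cond, scale):
    """Combine unconditional and conditional noise predictions.

    Computes eps_uncond + scale * (eps_cond - eps_uncond) element-wise.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy two-element "noise predictions" standing in for model outputs.
print(guided_noise([0.0, 1.0], [1.0, 1.0], 3.0))  # prints [3.0, 1.0]
```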

paper July 27, 2022

ProtGPT2: a deep unsupervised language model for protein design

A 2022 Nature Communications paper showing a GPT-style language model can generate realistic, novel protein sequences from scratch.

paper August 15, 2022

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

The 2022 paper behind bitsandbytes, halving LLM memory with 8-bit math while keeping full accuracy by isolating outlier features.

paper August 25, 2022

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

DreamBooth, the 2022 Google method that teaches a diffusion model a specific subject from just a few photos using a unique token.

paper September 7, 2022

AudioLM: a Language Modeling Approach to Audio Generation

Google's 2022 AudioLM treated audio as a sequence of tokens and generated speech and piano continuations by predicting the next token.

paper September 7, 2022

Rectified Flow: Flow Straight and Fast

Rectified flow learns straight transport paths between noise and data, enabling fast, even single-step, generation.

paper September 12, 2022

FP8 Formats for Deep Learning

NVIDIA, Arm, and Intel jointly proposed two 8-bit floating-point formats for AI, E4M3 and E5M2, the precision supported by NVIDIA's Hopper and Blackwell GPUs.

paper September 14, 2022

Toy Models of Superposition

The 2022 Anthropic paper showing neural networks pack more features than they have neurons by storing them in superposition.

paper September 29, 2022

DreamFusion: Text-to-3D using 2D Diffusion

DreamFusion generated 3D objects from text by optimizing a NeRF against a frozen 2D image diffusion model, with no 3D data.

paper September 29, 2022

Make-A-Video: Text-to-Video Generation without Text-Video Data

Meta's Make-A-Video generated video from text by learning appearance from image-text pairs and motion from unlabeled video.

paper October 6, 2022

Flow Matching for Generative Modeling

Flow matching gave a simple, simulation-free way to train continuous normalizing flows, rivaling diffusion in quality.

paper October 6, 2022

ReAct: Synergizing Reasoning and Acting in Language Models

The 2022 paper that interleaved reasoning traces with tool actions, a recipe that became the backbone of modern LLM agents.

paper October 24, 2022

EnCodec: High Fidelity Neural Audio Compression

Meta's 2022 EnCodec compressed audio into discrete tokens with a neural codec, becoming the token layer for later audio language models.

paper October 26, 2022

Broken Neural Scaling Laws

The 2022 paper proposing a smoothly-broken power law that fits and extrapolates scaling behavior, including double descent and sharp jumps.

paper October 31, 2022

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Method to compress large language model weights to 3-4 bits after training, letting big models run on a single GPU.

paper November 4, 2022

Measuring Progress on Scalable Oversight for Large Language Models

The 2022 Anthropic paper proposing the sandwiching method to study how humans can supervise AI on tasks the AI handles better than they do.

paper November 18, 2022

PAL: Program-Aided Language Models

The 2022 Gao et al. paper that has a language model write code as its reasoning steps and offloads the calculation to a Python interpreter.

paper November 30, 2022

Fast Inference from Transformers via Speculative Decoding

The 2022 Google paper introducing speculative decoding, which uses a small draft model to make a large model generate text 2-3x faster with identical output.
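
The reason the output can match the large model's is the accept/reject rule applied to each drafted token. The sketch below shows that rule for a single token, with plain dictionaries of probabilities standing in for the two models; the function name and inputs are hypothetical, and a real system verifies several drafted tokens per large-model call.

```python
import random


def accept_or_resample(token, p_target, q_draft, rng=random):
    """Apply the speculative-sampling test to one drafted token.

    p_target and q_draft map tokens to probabilities under the large model
    and the draft model. Returns (token, accepted).
    """
    p = p_target.get(token, 0.0)
    q = q_draft[token]  # positive, because the draft model proposed this token
    # Accept with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return token, True
    # On rejection, resample from the residual max(0, p - q); random.choices
    # normalizes the weights, so no explicit division is needed.
    tokens = list(p_target)
    weights = [max(0.0, p_target[t] - q_draft.get(t, 0.0)) for t in tokens]
    return rng.choices(tokens, weights=weights)[0], False


# Toy distributions: the draft model over-weights "a".
p = {"a": 0.2, "b": 0.8}
q = {"a": 0.9, "b": 0.1}
print(accept_or_resample("a", p, q, rng=random.Random(0)))
```

Tokens the large model likes at least as much as the draft model are always accepted; the rest are accepted only in proportion to how much the large model agrees.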

paper December 7, 2022

E5: Text Embeddings by Weakly-Supervised Contrastive Pre-training

Microsoft embedding family trained with contrastive learning on web pairs; first to beat BM25 zero-shot on the BEIR benchmark.

paper December 13, 2022

RT-1: Robotics Transformer for Real-World Control at Scale

Google's RT-1 trained a transformer on 130,000 robot demonstrations covering 700 tasks to control real robots at scale.

paper December 15, 2022

Constitutional AI: Harmlessness from AI Feedback

The 2022 Anthropic paper introducing Constitutional AI, which trains a harmless model using AI feedback guided by a written set of principles.

paper December 19, 2022

Discovering Language Model Behaviors with Model-Written Evaluations

The 2022 Anthropic paper that used language models to write 154 test datasets, revealing sycophancy and goal-seeking that grow with scale and RLHF.

paper December 19, 2022

Scalable Diffusion Models with Transformers (DiT)

Peebles and Xie replaced the U-Net in diffusion models with a transformer, showing that more compute reliably improves sample quality (lower FID).

paper December 20, 2022

Precise Zero-Shot Dense Retrieval without Relevance Labels (HyDE)

The 2022 Gao et al. paper that retrieves documents by first having a model write a fake answer, then embedding and matching it.

paper December 24, 2022

GraphCast: learning skillful medium-range global weather forecasting

DeepMind's 2022 paper on GraphCast, a graph neural network that forecasts global weather faster and more accurately than the leading conventional system.

paper January 5, 2023

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers (VALL-E)

Microsoft's 2023 VALL-E cloned a voice from a 3-second sample by treating text-to-speech as language modeling over codec tokens.

paper January 10, 2023

DreamerV3: Mastering Diverse Domains through World Models

The 2023 DreamerV3 paper used one fixed configuration to beat specialized agents across 150+ tasks and mine Minecraft diamonds.

paper January 10, 2023

Generative Language Models and Automated Influence Operations

A 2023 report by OpenAI, Stanford, and Georgetown on how language models could change online propaganda and how to blunt it.

paper January 24, 2023

A Watermark for Large Language Models

A 2023 paper that embeds a hidden, statistically detectable signal in LLM text so machine-generated output can be identified.