Score-Based Generative Modeling through Stochastic Differential Equations
Song and colleagues unified diffusion and score-based generation under a continuous stochastic differential equation framework.
What the papers actually said - linked to the originals.
A 2020 paper that recovered verbatim training text, including personal data, from GPT-2 just by querying it, showing that language models memorize parts of their training data.
Zhou and colleagues' Informer, a transformer variant with sparse attention that makes forecasting very long sequences computationally practical.
A 2021 method that adapts a frozen language model by training only a small continuous prefix, tuning about 0.1 percent of parameters.
The 2021 paper that simplified mixture-of-experts by routing each token to a single expert, scaling Transformers to a trillion parameters efficiently.
Google's ALIGN trained a contrastive dual-encoder on over a billion noisy image alt-text pairs, skipping the expensive curation step.
EGNN is a graph network equivariant to rotations, translations, and reflections without costly higher-order representations.
The 2021 report of the US National Security Commission on AI warned the country was not ready to defend or compete in the AI era.
The paper that warned about the environmental, social, and bias costs of ever-larger language models, and helped trigger a high-profile firing.
Sam Altman's 2021 essay predicting AI will drive the cost of goods toward zero and proposing a wealth fund paid to every citizen.
The 2021 Nature Machine Intelligence paper introducing DeepONet, a network that learns operators between function spaces from data.
The 2021 Microsoft paper that gave vision transformers a hierarchy and shifted local windows, making them efficient general-purpose vision backbones.
A platform where humans write examples that fool a target model, making benchmarks dynamic and harder to saturate.
A 2021 Google paper showing that learned soft prompts rival full fine-tuning as models grow into the billions of parameters.
The 2021 Su et al. paper introducing rotary position embeddings, the position-encoding scheme now used in Llama, GPT-NeoX, and most modern LLMs.
Bronstein and colleagues frame CNNs, GNNs, and Transformers as instances of one geometric principle: building in symmetry.
The 2021 Decision Transformer paper recast reinforcement learning as sequence modeling, using a Transformer to predict actions.
The 2021 Trajectory Transformer paper modeled whole RL trajectories with a Transformer and used beam search as a planner.
HuBERT learned speech representations by clustering audio into pseudo-labels and predicting masked ones, a BERT-style recipe for sound.
The 2021 OpenAI paper that introduced Codex and the HumanEval benchmark and described the model behind GitHub Copilot.
Google's 2021 SoundStream compressed speech and music at very low bitrates with a learned codec and residual vector quantization.
Neural retriever that outputs sparse, term-based vectors, blending keyword-style exact matching with learned semantic expansion.
A 2021 study showing that web training corpora are full of duplicates, and that removing them cuts memorization tenfold and speeds up training.
Baker lab's 2021 Science paper introduced RoseTTAFold, a three-track network that predicted protein structures and complexes nearly as well as AlphaFold 2.
The 2021 Liu et al. survey that named and organized prompting as a paradigm in its own right, alongside pre-train-and-fine-tune.
NVIDIA's Isaac Gym ran robot physics and policy training together on the GPU, cutting RL training time by two to three orders of magnitude.
Positional method that biases attention by token distance, letting a model handle longer inputs than it was trained on.
The 2021 Google paper that introduced instruction tuning, fine-tuning a model on many tasks phrased as instructions to boost zero-shot performance.
By simulating thousands of robots at once on a single GPU, this work trained a quadruped walking policy in minutes instead of days.
Acemoglu and Restrepo linked 50 to 70 percent of the change in the US wage structure since 1980 to automation displacing routine-task workers.
A 2021 DeepMind preprint extending AlphaFold to predict how multiple protein chains assemble into complexes.
The 2021 EfficientZero paper reported above-human-level Atari performance from only two hours of gameplay, a large gain in sample efficiency.
S4 made state space models practical for very long sequences, outperforming Transformers on extreme long-range tasks.
The 2021 Meta paper that pretrained vision transformers by masking most of an image and reconstructing it, a simple, scalable self-supervised recipe.
Argues popular AI benchmarks lack the construct validity to stand for general progress toward flexible AI.
A 2021 Nature paper showed machine learning could surface patterns that led mathematicians to new theorems in knot theory and representation theory.
Retrieval model that keeps a vector per token for fine-grained matching, then compresses them to make the approach storage-practical.
DeepMind model that matches GPT-3 quality using a fraction of the parameters by retrieving from a 2-trillion-token database.
OpenAI's 2021 paper teaching GPT-3 to search and browse the web to answer questions, an early ancestor of browser agents.
OpenAI's GLIDE showed text-guided diffusion could beat the original DALL-E and edit images from natural-language instructions.
The 2021 paper that moved diffusion into a compressed latent space; its architecture became the basis of Stable Diffusion.
The 2021 Anthropic paper that reverse-engineered small attention-only transformers and named induction heads, launching the circuits agenda.
The 2022 OpenAI paper documenting grokking, where a network first memorizes its training data and then, long after overfitting, suddenly generalizes perfectly.
The 2022 Meta paper that modernized a plain ResNet step by step until it matched vision transformers, showing convolutions were not obsolete.
NVIDIA's 2022 instant-ngp paper cut neural radiance field training from hours to seconds with a multiresolution hash encoding.
The 2022 Wei et al. paper showing that prompting an LLM to show its reasoning steps sharply improves its accuracy on arithmetic and reasoning tasks.
Meta's 2022 data2vec used one self-supervised recipe across speech, images, and text by predicting the model's own latent targets.
The 2022 DeepMind paper that used one language model to automatically generate adversarial prompts and surface harmful behavior in another.