Deep Reinforcement Learning from Human Preferences
The 2017 paper that trained RL agents from human comparisons of trajectory segments instead of a hand-coded reward function - the seed of RLHF.
What the papers actually said - linked to the originals.
The 2017 Madry paper that framed robustness as min-max optimization and made PGD adversarial training the standard defense.
Yandex researchers' 2017 paper introducing CatBoost, a gradient-boosting library that handles categorical features and reduces a subtle target-leakage bias.
OpenAI's 2017 PPO paper introduced a simple, stable policy-gradient method that became the default RL algorithm, including for RLHF.
The 2017 C51 paper proposed learning the full distribution of returns, not just their average, improving Atari agents.
The 2017 paper that fooled a vision system into misreading a real stop sign as a speed-limit sign using only black-and-white stickers.
Neural Collaborative Filtering replaced matrix factorization's fixed dot product with a neural network that learns user-item interactions.
A network that learns explicit feature combinations automatically, built for predicting ad clicks.
The 2017 paper that introduced backdoor attacks, showing a model can behave normally yet misfire on inputs carrying a hidden trigger.
The 2017 paper introducing a lightweight block that lets a network reweight its feature channels by importance, winning that year's ImageNet contest.
The 2017 Rainbow paper combined six separate DQN improvements into one agent that set a new Atari benchmark record.
An NVIDIA and Baidu paper showing deep networks can train in 16-bit floating point at full accuracy, nearly halving memory use.
A 2017 paper showing that changing a single pixel can make an image classifier confidently give the wrong answer.
The 2017 paper introducing mixup, which trains on blended pairs of examples and labels to improve generalization and robustness.
NVIDIA's ProGAN grew the generator and discriminator layer by layer, stabilizing training and producing high-resolution synthetic faces.
GAT brings self-attention to graphs, letting each node weigh its neighbors instead of treating them all equally.
A cryptographic protocol that lets a server sum thousands of devices' model updates without seeing any single device's update.
A 2017 paper showed a model could learn to translate between two languages using only unpaired text, no parallel sentences at all.
Brynjolfsson, Rock and Syverson explain why AI's promise and flat productivity statistics can coexist: implementation lags.
VQ-VAE learned discrete latent codes via vector quantization, a building block for later image and audio generation.
A survey of deep-learning dialogue systems that frames the field's main split between task-oriented and open-domain chatbots.
Andrej Karpathy's 2017 essay arguing neural network weights are a new kind of software that replaces hand-written code.
Stanford's 2017 CheXNet paper trained a 121-layer CNN on 100,000-plus chest X-rays and reported exceeding average radiologist performance at detecting pneumonia.
The 2017 paper showing weight decay and L2 regularization differ for Adam, and introducing AdamW, now standard for training large models.
DeepMind's 2017 method that jointly trains a population of models and evolves their hyperparameters into adaptive schedules during training.
DeepMind's 2017 Parallel WaveNet distilled the slow WaveNet into a parallel network fast enough to ship in Google Assistant.
The Stanford system that builds training labels from noisy heuristics ('labeling functions') instead of hand-labeling, then denoises them automatically.
The 2018 Kim paper that measures how much a human-defined concept like 'stripes' influences a neural network's prediction.
Microsoft's 2017 NeurIPS paper introducing LightGBM, a gradient-boosting library that trains far faster on large datasets using GOSS and EFB.
The 2017 paper that used a convolutional neural network to find Kepler-90i and Kepler-80g in Kepler light curves, published in The Astronomical Journal.
Google's 2017 Tacotron 2 reached a mean opinion score (MOS) of 4.53 for naturalness, near the 4.58 of professionally recorded human speech.
The 2017 paper introducing Ray, a distributed execution engine that gives AI workloads a single unified interface and scaled past 1.8 million tasks per second.
A 2017 paper introducing a printable sticker that, placed in any scene, makes image classifiers report an attacker-chosen object.
Acemoglu and Restrepo's task framework showing automation displaces labor but new labor-intensive tasks can reinstate it.
Taylor and Letham's Prophet, a Facebook forecasting tool that fits trend, seasonality, and holidays so analysts can produce decent forecasts without specialist time-series expertise.
Gary Marcus's 2018 paper lists ten limits of deep learning and argues it must be combined with symbolic methods.
The 2018 SAC paper introduced a stable, sample-efficient off-policy RL method that maximizes both reward and action entropy.
The 2018 fast.ai paper showing a pre-trained language model could be fine-tuned to any NLP task, bringing transfer learning to language.
The 2018 Gender Shades study that measured how much more often commercial face-analysis tools misclassified darker-skinned women than lighter-skinned men.
The 2018 IMPALA paper introduced a distributed RL architecture with V-trace correction for high-throughput, multi-task training.
McInnes, Healy, and Melville's 2018 paper introducing UMAP, a fast dimensionality-reduction method that rivals t-SNE and keeps more global structure.
The 2018 NAACL paper introducing ELMo, word vectors that change with context, ending the one-fixed-vector-per-word era of word2vec and GloVe.
Uber's 2018 paper introducing Horovod, which uses ring-allreduce to scale data-parallel training across many GPUs with only a few lines of code change.
A 2018 NDSS paper showing an attacker can implant a trojan trigger into a trained network without access to its original training data.
Google's 2018 study showed deep learning could read age, sex, smoking status, and blood pressure from retinal photos, hinting at non-invasive risk screening.
DeepMind's 2018 WaveRNN compressed neural vocoding into a small recurrent network fast enough to synthesize speech on a mobile CPU.
The 2018 TD3 paper fixed the overestimation bias that made DDPG unstable, using twin critics and delayed policy updates.
The 2018 Frankle-Carbin paper proposing that dense networks contain small 'winning ticket' subnetworks that train to full accuracy on their own.