Landmark Papers

What the papers actually said - linked to the originals.

644 entries, all primary-sourced

paper July 20, 2017

Proximal Policy Optimization (PPO)

OpenAI's 2017 PPO paper introduced a simple, stable policy-gradient method that became the default RL algorithm, including for RLHF.

paper August 16, 2017

Neural Collaborative Filtering

Replaced matrix factorization's fixed dot product with a neural network that learns user-item interactions.

paper September 5, 2017

Squeeze-and-Excitation Networks (SENet)

The 2017 paper introduced a lightweight block that lets a network reweight its feature channels by importance; SENets won that year's ImageNet classification challenge (ILSVRC 2017).

paper October 10, 2017

Mixed Precision Training

NVIDIA and Baidu paper showing deep networks can train in 16-bit floating point at full accuracy, nearly halving memory use.

paper October 30, 2017

Graph Attention Networks (GAT)

GAT brings self-attention to graphs, letting each node weigh its neighbors instead of treating them all equally.

paper November 11, 2017

Software 2.0 (Andrej Karpathy, 2017)

Andrej Karpathy's 2017 essay arguing neural network weights are a new kind of software that replaces hand-written code.

paper December 27, 2017

Adversarial Patch

A 2017 paper introducing a printable sticker that, placed in any scene, makes image classifiers report an attacker-chosen object.

paper 2018

Prophet: Forecasting at Scale

Taylor and Letham's Prophet, a Facebook forecasting tool that fits trend, seasonality, and holidays so analysts can produce reasonable forecasts without deep time-series expertise.

paper January 4, 2018

Soft Actor-Critic (SAC)

The 2018 SAC paper introduced a stable, sample-efficient off-policy RL method that maximizes expected reward together with the entropy of the policy.

paper February 5, 2018

IMPALA: Scalable Distributed Deep-RL

The 2018 IMPALA paper introduced a distributed actor-learner RL architecture with V-trace off-policy correction for high-throughput, multi-task training.

paper February 18, 2018

Trojaning Attack on Neural Networks

A 2018 NDSS paper showing an attacker can implant a trojan trigger into a trained network without access to its original training data.

paper February 26, 2018

Twin Delayed DDPG (TD3)

The 2018 TD3 paper reduced the overestimation bias that made DDPG unstable, using twin critics, delayed policy updates, and target policy smoothing.

paper March 9, 2018

The Lottery Ticket Hypothesis

The 2018 Frankle-Carbin paper proposed that dense networks contain small 'winning ticket' subnetworks that, trained on their own from their original initialization, reach accuracy comparable to the full network.