A Learning Algorithm for Boltzmann Machines
The 1985 paper by Ackley, Hinton, and Sejnowski that gave a learning rule to a stochastic neural network borrowed from statistical physics.
What the papers actually said - linked to the originals.
The 1986 two-volume PDP books set out the modern connectionist program and popularized learning internal representations in neural networks.
The 1986 Nature paper by Rumelhart, Hinton, and Williams that popularized backpropagation, the algorithm that lets multi-layer neural networks learn.
The 1986 paper by White and colleagues that mapped every neuron and connection of the nematode worm C. elegans, the first complete connectome of any animal's nervous system.
The 1987 NETtalk paper by Sejnowski and Rosenberg, a neural network that learned to turn written English into speech sounds.
The 1987 paper by Laird, Newell, and Rosenbloom set out Soar, a production-rule architecture meant as a unified theory of cognition.
Richard Sutton's 1988 paper introduced temporal-difference learning, the prediction method that became a pillar of reinforcement learning.
Cybenko's 1989 proof that a neural network with a single hidden layer of sigmoidal units can approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden units.
Carver Mead's 1990 paper coined neuromorphic engineering, arguing analog VLSI modeled on neurons could compute far more efficiently than digital logic.
The 1991 MIT paper that recognized faces by projecting them onto eigenfaces, the principal components of a set of face images.
Rodney Brooks's manifesto for behavior-based robotics, arguing intelligent systems need no central world model: the world is its own best model.
Harnad's 1991 paper proposing the Total Turing Test, which requires robotic sensorimotor ability, not just conversation, to show a mind.
Watkins and Dayan's 1992 paper proved that Q-learning converges to optimal action values, provided every action is sampled repeatedly in every state, giving model-free RL a firm guarantee.
Ronald Williams's 1992 REINFORCE paper gave reinforcement learning a way to improve a policy directly by following the gradient of expected reward.
The 1993 IBM paper recast translation as probability and word alignment, founding statistical machine translation and its five IBM Models.
The Robertson-Walker ranking function, better known as Okapi BM25, that became the default keyword-search baseline for thirty years.
Carnegie Mellon's Cognitive Tutors used the ACT-R theory of cognition to build math tutoring software that reached real classrooms.
David Chalmers's 1995 paper that split the 'easy problems' of mind from the 'hard problem' of why experience exists at all.
Tibshirani's 1996 paper introducing the lasso, an L1 penalty that shrinks regression coefficients and sets some to exactly zero.
Leo Breiman's 1996 paper introducing bagging, which builds many models on bootstrap samples and averages them to cut variance.
Schultz, Dayan, and Montague showed dopamine neurons signal reward prediction error, the same quantity that drives temporal-difference learning.
Wolpert and Macready's 1997 paper proving that no optimization algorithm beats all others when averaged over every possible problem.
Freund and Schapire's 1997 paper introducing AdaBoost, the algorithm that combines many weak rules into one strong classifier.
The 1998 paper that applied a naive Bayes classifier to spam, a foundational use of machine learning for cybersecurity.
Nick Bostrom's 1998 paper defined superintelligence and argued the hardware to build it would arrive in the early 21st century.
PageRank ranked web pages by treating links as votes, modeling a random surfer to measure each page's importance.
Sutton and Barto's textbook became the standard reference for reinforcement learning and is given away free by the authors online.
The 1998 paper by LeCun and colleagues that introduced LeNet-5, the convolutional neural network that read handwritten digits and shaped modern computer vision.
Rao and Ballard modeled vision as a hierarchy where higher areas predict lower-level activity and only the prediction errors flow upward.
Hutter's 2000 paper that fuses Solomonoff induction with decision theory to define AIXI, a mathematically optimal but uncomputable agent.
The 2000 paper framing learning as compressing an input while keeping what is relevant to a target, later applied to deep networks.
Bringsjord, Bello, and Ferrucci's 2001 paper proposing a creativity-based alternative to the Turing Test for machine minds.
Jerome Friedman's 2001 paper that formalized gradient boosting as stagewise function fitting via gradient descent, the basis for XGBoost and its successors.
Leo Breiman's 2001 paper introducing random forests, an ensemble of randomized decision trees that became a default workhorse classifier.
Latanya Sweeney's model requiring each released record to be indistinguishable from at least k-1 others, an early formal privacy standard.
The 2002 BLEU paper gave machine translation a cheap automatic score, and it became the field's default metric for two decades.
The foundational paper that learned distributed word representations with a neural net to fight the curse of dimensionality.
The 2003 paper that introduced latent Dirichlet allocation (LDA), a topic model that discovers the hidden themes running through a collection of documents.
The 2003 phrase-based translation paper showed translating chunks of words, not single words, sharply improved statistical machine translation.
David Lowe's 2004 paper defining SIFT, features invariant to scale and rotation that dominated image matching before deep learning.
The 2004 ACL paper introducing NLTK, the open-source Python toolkit that taught a generation how to do natural language processing.
Tononi's 2004 paper proposing that consciousness is the capacity of a system to integrate information, measured by a quantity called phi.
Knill and Pouget's review argued the brain represents uncertainty and combines evidence in a near-optimal Bayesian way during perception and action.
The 2005 paper introducing the HOG descriptor, a pre-deep-learning feature that became the standard for pedestrian and object detection.
METEOR, introduced in 2005, scored machine translation by matching word stems and synonyms, correlating with human judgment better than BLEU.
The Microsoft paper that introduced RankNet and helped make machine-learned ranking the basis of modern search.
The 2006 Rasmussen and Williams book that became the standard reference for Gaussian process models in machine learning.
The foundational differential privacy paper, showing how to add noise scaled to a query's sensitivity to protect any single individual.