A Learning Algorithm for Boltzmann Machines

“A Learning Algorithm for Boltzmann Machines” by David H. Ackley, Geoffrey E. Hinton, and Terrence J. Sejnowski appeared in the journal Cognitive Science in 1985, volume 9, pages 147 to 169. The paper introduced a learning rule for a class of stochastic neural networks the authors named Boltzmann machines, after the physicist Ludwig Boltzmann whose statistical mechanics supplied the underlying mathematics.

A Boltzmann machine is a network of binary units that turn on and off probabilistically rather than deterministically. Every global state of the network has an energy, the same idea John Hopfield had used in 1982 for deterministic units, but here the units are noisy: once the network has settled to equilibrium, it occupies each state with a probability that falls off exponentially with that state's energy. Some units are visible and can be clamped to data; others are hidden and free to represent structure the data does not state directly. The machine learns by adjusting connection weights so that the patterns it produces on its own match the statistics of the patterns it is shown.
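The energy function and the noisy unit update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's own code: the names `energy`, `prob_on`, and `update_unit` are invented here, the per-unit terms are written as biases (a threshold with its sign flipped), and the logistic form of the switching probability is the standard one for this kind of model.

```python
import math
import random

def energy(state, weights, biases):
    """Energy of a binary state. `weights` is assumed symmetric with a zero diagonal."""
    n = len(state)
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    bias_term = sum(biases[i] * state[i] for i in range(n))
    return -pair_term - bias_term

def prob_on(k, state, weights, biases, temperature=1.0):
    """Probability that unit k turns on, given the current states of the other units."""
    # Energy gap: energy with unit k off minus energy with unit k on.
    gap = biases[k] + sum(weights[k][j] * state[j]
                          for j in range(len(state)) if j != k)
    return 1.0 / (1.0 + math.exp(-gap / temperature))

def update_unit(k, state, weights, biases, temperature=1.0, rng=random):
    """Stochastically resample unit k in place."""
    p = prob_on(k, state, weights, biases, temperature)
    state[k] = 1 if rng.random() < p else 0
```

Calling `update_unit` repeatedly on units chosen at random lets the network wander among states, spending more time in low-energy ones; raising `temperature` makes every unit closer to a coin flip.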

The contribution that mattered was the learning rule itself. The authors showed that the change to each weight depends only on the difference between two correlations: how often the two units it connects are on together while the network runs “awake”, with its visible units clamped to data, and how often they are on together while it runs “asleep”, free-running with nothing clamped. Because the rule uses only information available at the connection being changed, it made it possible in principle to train networks with hidden units, a problem that Minsky and Papert's 1969 critique of perceptrons had left open.
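The two-correlation rule can be made concrete with a small Python sketch. It rests on assumptions that are not from the paper: the network is tiny enough that both correlations are computed exactly by enumerating every state, where a real Boltzmann machine would estimate them by sampling, the temperature is fixed at 1, and the names `expected_products`, `learning_step`, and `visible_marginal` are hypothetical.

```python
import itertools
import math

def energy(state, w, b):
    n = len(state)
    pairs = sum(w[i][j] * state[i] * state[j]
                for i in range(n) for j in range(i + 1, n))
    return -pairs - sum(b[i] * state[i] for i in range(n))

def expected_products(states, w, b):
    """Boltzmann-weighted average of s_i * s_j over `states` (temperature 1)."""
    n = len(w)
    factors = [math.exp(-energy(s, w, b)) for s in states]
    z = sum(factors)
    corr = [[0.0] * n for _ in range(n)]
    for s, f in zip(states, factors):
        for i in range(n):
            for j in range(n):
                corr[i][j] += (f / z) * s[i] * s[j]
    return corr  # the diagonal holds each unit's probability of being on

def learning_step(data, n_hidden, w, b, rate=0.1):
    """One update: weight change = rate * (clamped correlation - free correlation)."""
    n = len(w)
    hidden = list(itertools.product([0, 1], repeat=n_hidden))
    # "Awake" phase: visible units clamped to each data vector in turn.
    clamped = [[0.0] * n for _ in range(n)]
    for v in data:
        c = expected_products([tuple(v) + h for h in hidden], w, b)
        for i in range(n):
            for j in range(n):
                clamped[i][j] += c[i][j] / len(data)
    # "Asleep" phase: nothing clamped, every state of every unit is allowed.
    free = expected_products(list(itertools.product([0, 1], repeat=n)), w, b)
    for i in range(n):
        b[i] += rate * (clamped[i][i] - free[i][i])
        for j in range(n):
            if i != j:
                w[i][j] += rate * (clamped[i][j] - free[i][j])

def visible_marginal(v, n_hidden, w, b):
    """Probability the free-running model assigns to the visible pattern v."""
    n = len(w)
    z = sum(math.exp(-energy(s, w, b))
            for s in itertools.product([0, 1], repeat=n))
    hidden = itertools.product([0, 1], repeat=n_hidden)
    return sum(math.exp(-energy(tuple(v) + h, w, b)) for h in hidden) / z

# Two visible units, one hidden unit, all parameters starting at zero.
w = [[0.0] * 3 for _ in range(3)]
b = [0.0] * 3
data = [(0, 0), (1, 1)]
for _ in range(50):
    learning_step(data, 1, w, b, rate=0.5)
```

Each weight update reads only the two correlations for the pair of units that weight connects, which is the sense in which the rule is local. Exact enumeration is used here only to keep the example deterministic; its cost doubles with every added unit, which is why sampling is needed at any realistic size.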

The Boltzmann machine was slow and hard to scale in practice, but its ideas carried forward. Restricted Boltzmann machines later became the building blocks of the deep belief networks that helped restart deep learning in 2006, and the line of work was cited when Hinton shared the 2024 Nobel Prize in Physics with John Hopfield.