Hopfield Networks is All You Need

“Hopfield Networks is All You Need,” submitted to arXiv on July 16, 2020 by a large team led by Hubert Ramsauer and including Sepp Hochreiter, modernized the classical Hopfield network and tied it directly to the attention mechanism that powers Transformers. The title deliberately echoes the 2017 “Attention Is All You Need” paper.

Classical Hopfield networks, introduced in the 1980s, are associative memories: they store patterns and retrieve a complete one from a partial or noisy cue. Their storage capacity, however, grows only about linearly with the number of neurons. The authors introduced a continuous-state version with a new energy function and update rule. This modern Hopfield network can store exponentially many patterns relative to the dimension of its space, retrieves a stored pattern in a single update step, and has exponentially small retrieval errors.
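To make the retrieval step concrete, here is a minimal NumPy sketch of a single update of a continuous Hopfield memory, assuming the rule takes the softmax form xi_new = X softmax(beta * X^T xi). The function names, the pattern sizes, the noise level, and the value of beta are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def hopfield_update(X, xi, beta):
    # One retrieval step: xi_new = X softmax(beta * X^T xi).
    # X has shape (d, N): each column is one stored pattern.
    # xi has shape (d,): the current state (the cue).
    return X @ softmax(beta * (X.T @ xi))

# Illustrative setup (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
d, N = 64, 10
X = rng.standard_normal((d, N))
X /= np.linalg.norm(X, axis=0)          # normalize each stored pattern

target = X[:, 3]                         # the pattern we hope to recover
cue = target + 0.05 * rng.standard_normal(d)   # noisy version of it

retrieved = hopfield_update(X, cue, beta=64.0)
print("distance to target:", np.linalg.norm(retrieved - target))
```

A larger beta makes the softmax sharper, so the update lands closer to a single stored pattern; a small beta instead blends several patterns into an average.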

The striking theoretical result is that this new update rule is mathematically equivalent to the attention mechanism used in Transformers. In other words, the self-attention layer that drives modern language models can be understood as a step of associative memory retrieval. This reinterpretation lets the network be used as a drop-in layer, called a Hopfield layer, in deep learning models.
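The correspondence can be written out in a few lines. The notation below is added here for illustration: xi is the state (query) vector, the columns of X are the stored patterns, beta is an inverse temperature, and Q, K, V are the usual Transformer query, key, and value matrices with key dimension d_k.

```latex
% Single query: state \xi \in \mathbb{R}^{d}, stored patterns as columns of X \in \mathbb{R}^{d \times N}
\xi^{\text{new}} = X \,\operatorname{softmax}\!\left(\beta\, X^{\top} \xi\right)

% Many queries at once: rows of Q are states, rows of K are stored patterns,
% and V is a linear projection of the stored patterns
Z = \operatorname{softmax}\!\left(\beta\, Q K^{\top}\right) V,
\qquad \beta = \frac{1}{\sqrt{d_k}}
```

Read this way, the attention weights are the softmax similarities between a query and each stored pattern, and the attention output is the retrieved pattern.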

The authors demonstrated state-of-the-art results on tasks including multiple instance learning, immune repertoire classification, and drug design. For a general reader, the paper is a satisfying unification: it shows that a decades-old idea about how memory works and the newest engine of AI are, under the surface, the same mathematics.
