The Information Bottleneck Method

“The Information Bottleneck Method,” by Naftali Tishby, Fernando C. Pereira, and William Bialek, was posted to arXiv on April 24, 2000. It proposed a clean information-theoretic way to define what it means to extract the relevant content of a signal, an idea that roughly fifteen years later was taken up as a candidate explanation for how deep networks learn.

The setup involves two related signals, X and Y. The goal is to find a compressed representation of X, a kind of short code, that throws away as much of X as possible while preserving as much information as possible about Y. The tension between squeezing the code small and keeping it predictive is the “bottleneck.” The authors showed that this generalizes rate-distortion theory from classical information theory, with the crucial difference that the distortion measure, which decides what counts as an important loss, is not imposed by hand but emerges from the joint statistics of X and Y. They derived self-consistent equations for the optimal code and a convergent iterative algorithm, extending the classic Blahut-Arimoto method, to compute it.
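To make the trade-off concrete, here is a minimal sketch of that alternating update for small discrete distributions. The notation is introduced here and is not in the text above: T stands for the compressed code, and beta is the trade-off weight in the objective I(X;T) - beta * I(T;Y), which is minimized over the encoder p(t|x). The function names, the parameter names, and the toy joint distribution are illustrative assumptions, and the code is a simplified illustration rather than a reference implementation.

```python
import numpy as np

def mutual_information(p_ab):
    """Mutual information (in nats) of a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal over rows, shape (na, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal over columns, shape (1, nb)
    mask = p_ab > 0                          # skip zero cells: 0 * log 0 = 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def iterative_ib(p_xy, n_codes, beta, n_iter=200, seed=0, eps=1e-12):
    """Alternate the self-consistent updates; returns the encoder p(t|x).

    Assumes every value of X has nonzero probability under p_xy.
    """
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                   # p(x), shape (nx,)
    p_y_given_x = p_xy / p_x[:, None]        # p(y|x), shape (nx, ny)

    # Start from a random soft assignment of each x to the codes.
    p_t_given_x = rng.random((len(p_x), n_codes))
    p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = p_x @ p_t_given_x                                   # p(t)
        p_xt = p_t_given_x * p_x[:, None]                         # p(x, t)
        p_y_given_t = (p_xt.T @ p_y_given_x) / (p_t[:, None] + eps)  # p(y|t)

        # KL divergence D[p(y|x) || p(y|t)] for every (x, t) pair.
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)  # shape (nx, nt)

        # Encoder update: p(t|x) proportional to p(t) * exp(-beta * KL).
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)               # numerical stability
        p_t_given_x = np.exp(logits)
        p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    return p_t_given_x

def information_plane(p_xy, p_t_given_x):
    """Return (I(X;T), I(T;Y)) for a given encoder."""
    p_x = p_xy.sum(axis=1)
    p_xt = p_t_given_x * p_x[:, None]             # joint p(x, t)
    p_ty = p_xt.T @ (p_xy / p_x[:, None])         # joint p(t, y)
    return mutual_information(p_xt), mutual_information(p_ty)

# Toy joint distribution (made up for illustration): x = 0, 1 mostly
# predict y = 0, while x = 2, 3 mostly predict y = 1.
p_xy = np.array([[0.20, 0.05],
                 [0.19, 0.06],
                 [0.05, 0.20],
                 [0.06, 0.19]])

encoder = iterative_ib(p_xy, n_codes=2, beta=10.0)
i_xt, i_ty = information_plane(p_xy, encoder)
print("I(X;T) =", i_xt, " I(T;Y) =", i_ty, " I(X;Y) =", mutual_information(p_xy))
```

Note that the distortion term in the encoder update is a KL divergence computed from p(y|x) and p(y|t), so it comes from the joint statistics rather than being chosen by hand. Sweeping beta from small to large values traces out the trade-off: small values favor heavy compression, large values favor keeping information about Y. Like other alternating schemes of this kind, the result can depend on the random initialization.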

Tishby and collaborators later argued that training a deep network can be read through this lens: during training the network first fits the data and then, in a longer second phase, compresses its internal representation, discarding input detail irrelevant to the label. That interpretation remains debated, but it sparked a productive line of work connecting information theory to deep learning.

For a general reader, the information bottleneck offers an appealing one-line definition of learning a good representation: keep what helps you predict, forget the rest.

Sources

Last verified June 7, 2026