On the Measure of Intelligence

“On the Measure of Intelligence” is a 2019 paper by François Chollet, the creator of the Keras deep-learning library. It argues that the field has been measuring the wrong thing. Most AI benchmarks reward task-specific skill - how well a system plays a game or labels images - but skill can be bought with enormous amounts of training data and compute. A system that performs well after seeing millions of examples has not necessarily demonstrated intelligence; it may simply have memorized a narrow region of the world.

Chollet proposes redefining intelligence as skill-acquisition efficiency: how quickly and effectively an agent learns to handle new tasks it was not prepared for, given limited experience and minimal built-in prior knowledge. Drawing on algorithmic information theory, he frames the measure around four factors - the scope of tasks considered, the difficulty of generalizing to them, the priors a system starts with, and the experience it is given. Under this view, a fair comparison controls for how much the system already knew and how much practice it got, and credits the system that adapts the most while needing the least of both.
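
To give a feel for how the four factors interact, here is a toy calculation in Python. It is not the paper's formal definition, which is stated in terms of algorithmic information theory; the TaskResult fields, the toy_efficiency function, and every number below are assumptions made up for illustration. The sketch simply divides difficulty-weighted skill by the priors and experience a system consumed, then averages over the tasks in scope.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """One system's outcome on one task. All fields are illustrative units."""
    skill: float                      # score reached on the task, between 0 and 1
    generalization_difficulty: float  # how far the test cases sit from the practice cases
    priors: float                     # built-in knowledge relevant to the task
    experience: float                 # amount of practice data consumed


def toy_efficiency(results: list[TaskResult]) -> float:
    """Average difficulty-weighted skill per unit of priors plus experience.

    A deliberate simplification, not the formula from the paper.
    """
    if not results:
        raise ValueError("need at least one task result")
    total = 0.0
    for r in results:
        cost = r.priors + r.experience
        if cost <= 0:
            raise ValueError("priors + experience must be positive")
        total += r.skill * r.generalization_difficulty / cost
    return total / len(results)


# Two hypothetical systems that reach the same skill on the same task,
# but one needed roughly ten times more practice to get there.
data_hungry = [TaskResult(skill=0.9, generalization_difficulty=1.0,
                          priors=1.0, experience=99.0)]
fast_learner = [TaskResult(skill=0.9, generalization_difficulty=1.0,
                           priors=1.0, experience=9.0)]

print("data-hungry system: ", toy_efficiency(data_hungry))
print("fast-learning system:", toy_efficiency(fast_learner))
```

A skill-only benchmark would rate the two systems as equal. Under this toy measure the fast learner scores about ten times higher, because it reached the same result from far less experience.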

To make the idea concrete, he introduced the Abstraction and Reasoning Corpus (ARC), a benchmark of visual grid puzzles built only on the kind of core knowledge humans are assumed to have innately, such as a sense of objects, basic geometry, and counting. Each task gives only a handful of examples, so brute-force training does not help; the system has to infer the rule and apply it to a new case. ARC was deliberately hard for the systems of its day and became a closely watched test of fluid, human-like generalization, later running as a public competition.
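
The shape of an ARC-style task can be sketched in a few lines of Python. The example below is a minimal illustration, not an actual ARC task or a serious solver: the grids are invented, the train/test layout only loosely mirrors the public ARC data format, and the four candidate rules (identity, flip_horizontal, flip_vertical, transpose) are hypothetical stand-ins for the far richer space of rules that real tasks draw on. It shows the core loop of inferring a rule from a couple of examples and applying it to an unseen input.

```python
from typing import Callable, Optional

Grid = list[list[int]]  # each cell holds a colour index


def identity(g: Grid) -> Grid:
    return [row[:] for row in g]


def flip_horizontal(g: Grid) -> Grid:
    # Mirror each row left to right.
    return [row[::-1] for row in g]


def flip_vertical(g: Grid) -> Grid:
    # Reverse the order of the rows.
    return [row[:] for row in g[::-1]]


def transpose(g: Grid) -> Grid:
    # Swap rows and columns.
    return [list(col) for col in zip(*g)]


# Hypothetical, tiny hypothesis space; real ARC rules are far more varied.
CANDIDATES: dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def infer_rule(train: list[dict]) -> Optional[str]:
    """Return the first candidate rule consistent with every training pair."""
    for name, fn in CANDIDATES.items():
        if all(fn(pair["input"]) == pair["output"] for pair in train):
            return name
    return None


# An invented task: two demonstrations and one unseen test input.
task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0], [0, 4, 0]], "output": [[0, 3, 3], [0, 4, 0]]},
    ],
    "test": [{"input": [[5, 0, 0], [0, 6, 0]]}],
}

rule = infer_rule(task["train"])
prediction = CANDIDATES[rule](task["test"][0]["input"]) if rule else None
print(rule, prediction)
```

With only two demonstrations there is nothing to fit statistically; the program succeeds only because the right rule happens to be among its candidates. Real ARC tasks are hard precisely because the rule for each new task is not known in advance.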

Why business readers should care: a model that aces a benchmark may have absorbed the answers rather than learned to reason. Asking how efficiently a system adapts to genuinely new problems is a better guide to whether it will hold up outside the cases it was trained on.

Last verified June 7, 2026