AI Compute

In AI, “compute” is shorthand for the total amount of calculation spent training a model, usually measured in floating-point operations (FLOPs) or in petaflop/s-days. Along with model size and training data, it is one of the three ingredients that scaling laws identify as driving how capable a model becomes. Of the three, compute has become the headline figure, because it is the one companies buy most directly, in the form of chips, electricity, and data center time.
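The two units above measure the same thing, and converting between them is simple arithmetic. The sketch below assumes the usual definition of a petaflop/s-day: 10^15 operations per second sustained for one day. The function names are illustrative only.

```python
# One petaflop/s-day: 10**15 operations per second, sustained for 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60                  # 86,400 seconds
OPS_PER_SECOND_AT_ONE_PETAFLOP = 10**15
FLOP_PER_PFS_DAY = OPS_PER_SECOND_AT_ONE_PETAFLOP * SECONDS_PER_DAY  # 8.64e19


def flop_to_pfs_days(total_flop: float) -> float:
    """Convert a total operation count into petaflop/s-days."""
    return total_flop / FLOP_PER_PFS_DAY


def pfs_days_to_flop(pfs_days: float) -> float:
    """Convert petaflop/s-days back into a total operation count."""
    return pfs_days * FLOP_PER_PFS_DAY
```

Both units describe a total quantity of work, not a speed: a run rated at 1,000 petaflop/s-days could come from a large cluster running for a short time or a small one running for much longer.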

Compute became the field’s defining input because its growth has been extraordinary and, more importantly, measurable. OpenAI’s 2018 “AI and Compute” analysis found that the compute used in the largest training runs had been doubling every 3.4 months since 2012, a more than 300,000-fold increase, far outstripping the roughly two-year doubling of Moore’s Law. Epoch AI’s longer-run study, using a publicly available dataset of notable models, found training compute had grown by a factor of ten billion since 2010, with doubling times of around 6 months in the deep learning era and about 10 months in the large-scale era. How the compute is spent matters too: DeepMind’s 2022 Chinchilla result showed that, for a fixed compute budget, model size and training data should be scaled up together rather than spending the budget on parameters alone.
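The doubling-time figures above can be checked with a few lines of arithmetic. The sketch below assumes clean exponential growth at a constant doubling time, which is a simplification of the fitted trends; the function names are illustrative only.

```python
import math


def doublings_for_growth(growth_factor: float) -> float:
    """Number of doublings needed to reach a given growth factor."""
    return math.log2(growth_factor)


def months_for_growth(growth_factor: float, doubling_months: float) -> float:
    """Months needed to reach a growth factor at a constant doubling time."""
    return doublings_for_growth(growth_factor) * doubling_months


def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor accumulated over a period at a constant doubling time."""
    return 2.0 ** (months / doubling_months)


# A 300,000-fold increase at a 3.4-month doubling time:
span_months = months_for_growth(300_000, 3.4)   # roughly 62 months, about 5 years

# The same period at a Moore's Law pace of one doubling every 24 months:
moore_growth = growth_over(span_months, 24.0)   # roughly 6x
```

Under these assumptions, a 300,000-fold increase takes about 18 doublings. Over the same five or so years, a two-year doubling time yields only about a sixfold increase, which is the gap the comparison with Moore’s Law is pointing at.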

The deeper lesson is captured in Richard Sutton’s 2019 essay “The Bitter Lesson”: across decades of AI research, general methods that simply leverage more computation have tended to beat hand-crafted, knowledge-engineered approaches. Compute is not a substitute for good ideas, but the historical record suggests that betting on cheap, abundant calculation has repeatedly paid off where betting on cleverness alone has not.

Why business readers should care: compute is the most concrete cost driver in AI. It determines how much a frontier model costs to build, who can afford to build one, and how quickly capability advances. Treating compute as a budget line, not an abstraction, is the clearest way to reason about the trajectory and economics of the technology.
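To make the budget-line framing concrete, here is a minimal back-of-the-envelope sketch. It assumes cost scales linearly with rented chip-hours and ignores staff, storage, networking, and failed runs. Every parameter name and every input value below is a hypothetical placeholder, not a figure for any real chip or model.

```python
def training_cost_usd(
    total_flop: float,
    peak_flop_per_s: float,
    utilization: float,
    usd_per_chip_hour: float,
) -> float:
    """Rough rental cost of a training run (illustrative assumptions only).

    total_flop        -- total operations the run requires
    peak_flop_per_s   -- advertised peak throughput of one chip
    utilization       -- fraction of peak actually achieved, in (0, 1]
    usd_per_chip_hour -- rental price of one chip for one hour
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the interval (0, 1]")
    effective_flop_per_s = peak_flop_per_s * utilization
    chip_hours = total_flop / effective_flop_per_s / 3600.0
    return chip_hours * usd_per_chip_hour


# Placeholder inputs chosen only to show the arithmetic:
cost = training_cost_usd(
    total_flop=1e24,
    peak_flop_per_s=1e15,
    utilization=0.5,
    usd_per_chip_hour=2.0,
)  # about 1.1 million USD under these made-up inputs
```

The useful point is the shape of the formula rather than the number: cost rises in direct proportion to the compute a run needs and falls as chips get faster, cheaper, or better utilized.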
