In June 2014, at the International Symposium on Computer Architecture, Microsoft Research presented Project Catapult, a system that wired field-programmable gate arrays into ordinary datacenter servers to accelerate large-scale services. An FPGA is a chip whose logic can be reconfigured after manufacture, letting it be molded to a specific computation rather than running general-purpose instructions like a CPU. Catapult put one FPGA in each of 1,632 servers and linked them into a reconfigurable fabric.
The flagship demonstration was Bing web search. Microsoft reported that, under high load, the FPGA fabric improved the ranking throughput of each server by 95 percent at a fixed latency distribution, while raising power consumption by about 10 percent and keeping the increase in each server's total cost of ownership under 30 percent. In early 2015, Bing began deploying the technology in a production datacenter. Catapult’s lead, Doug Burger, predicted that within a decade it would be common to compile applications into a mix of programmable hardware and programmable software, a way to keep gaining performance as Moore’s Law slowed.
Catapult marked a turning point in how datacenters were designed. For decades, faster general-purpose CPUs had made specialized hardware unnecessary for most workloads. As single-thread CPU gains stalled, the big cloud operators began building or adopting custom silicon: first FPGAs, then dedicated AI accelerators. The same logic that drove Catapult drove Microsoft’s later Project Brainwave for real-time AI inference on FPGAs, and ran parallel to Google’s decision to build the Tensor Processing Unit.
This is an AI-infrastructure milestone more than an AI-model one. It marks the moment a hyperscaler publicly committed to reshaping its datacenters around accelerators, the trend that now defines the economics of running large models.