On August 19, 2019, at the Hot Chips conference, Cerebras Systems unveiled the Wafer-Scale Engine (WSE), which it called the industry’s first trillion-transistor chip and “the largest chip ever built.” Where conventional processors are made by cutting a silicon wafer into many small dies, Cerebras kept most of a wafer intact as a single chip: 1.2 trillion transistors and 400,000 AI-optimized cores across 46,225 square millimeters of silicon (a square about 215 millimeters on a side), with 18 gigabytes of on-chip SRAM and a stated memory bandwidth of 9 petabytes per second.
The bet behind the WSE was that the biggest bottleneck in training large neural networks is moving data between many separate chips. By putting an enormous number of cores and a large pool of fast memory onto one piece of silicon, Cerebras aimed to keep more of the computation, and the communication between cores, on a single device, avoiding the comparatively slow links that connect racks of GPUs.
The WSE was a deliberate counterbet against the multi-chip cluster model, which NVIDIA dominates with its GPUs and which Google’s TPU pods also follow. Cerebras packaged the chip into its CS-1 system, and successor chips into later generations of that system, targeting research labs and enterprises training large models.
Why business readers should care: the Wafer-Scale Engine shows that there is no settled answer to how AI hardware should be built. While most of the industry scales by wiring together thousands of GPUs, a well-funded startup bet the opposite way, on one giant chip, illustrating that the architecture of AI compute is still an open question and a contested market.