On March 13, 2024, Cerebras announced the Wafer-Scale Engine 3, or WSE-3, its third-generation AI processor and the largest chip ever built. Where conventional chipmakers cut a silicon wafer into many small dies, Cerebras keeps an entire wafer as a single processor. The WSE-3 contains 4 trillion transistors and 900,000 AI-optimized cores, delivering 125 petaflops of peak AI performance, and Cerebras said the chip delivers twice the performance of the previous WSE-2 at the same power and price.
The chip is the heart of the Cerebras CS-3 system, which the company positioned for training very large models without the complexity of stitching together thousands of separate GPUs. Cerebras stated a single CS-3 could in principle support models up to 24 trillion parameters, and that clusters of up to 2,048 systems could reach 256 exaflops of aggregate performance. The wafer-scale approach concentrates enormous on-chip memory and bandwidth in one device, directly attacking the data-movement bottleneck that slows distributed GPU systems.
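The cluster figure follows from the per-chip figure by simple multiplication, which the short sketch below checks. It assumes the 256 exaflops number is a peak value obtained by summing the quoted 125 petaflops across all 2,048 systems with no scaling losses; the function and constant names are illustrative and are not taken from any Cerebras software.

```python
# Figures as quoted in the announcement; both are peak, not sustained, numbers.
PEAK_PFLOPS_PER_SYSTEM = 125   # peak AI petaflops per CS-3 (one WSE-3)
MAX_SYSTEMS = 2048             # largest cluster size quoted
PFLOPS_PER_EXAFLOP = 1000      # 1 exaflop = 1,000 petaflops


def aggregate_exaflops(systems, pflops_per_system=PEAK_PFLOPS_PER_SYSTEM):
    """Peak aggregate performance in exaflops, assuming perfectly linear scaling."""
    return systems * pflops_per_system / PFLOPS_PER_EXAFLOP


print(aggregate_exaflops(MAX_SYSTEMS))  # 256.0
```

Real workloads would fall short of this ceiling, since it ignores communication overhead and utilization, but it shows that the two headline numbers are consistent with each other.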
Building a working chip the size of a dinner plate requires engineering around manufacturing defects, which Cerebras handles with redundant cores and routing that bypasses faulty ones, so a defect is tolerated in place rather than ruining the whole wafer. For a general reader, the WSE-3 is the most visible example of a contrarian bet in AI hardware: that the way to beat the bottlenecks of large-scale AI is to make one gigantic chip rather than connect many small ones.