NVIDIA's Blackwell GPU packs 208 billion transistors

When NVIDIA announced its Blackwell platform on March 18, 2024, it disclosed that each Blackwell-architecture GPU is built with 208 billion transistors, manufactured on a custom TSMC 4NP process. That is more than two and a half times the 80 billion transistors in the previous Hopper generation, and large enough that a Blackwell GPU is physically two dies joined to act as a single chip.
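The generational comparison above is simple arithmetic, and a minimal sketch makes it easy to check. The two transistor counts come from the figures quoted here; the per-die number is only an illustration and assumes the two dies are identical, which NVIDIA's headline figure does not by itself state.

```python
# Transistor counts as quoted in the article, in billions.
BLACKWELL_TRANSISTORS_B = 208
HOPPER_TRANSISTORS_B = 80


def generation_ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


ratio = generation_ratio(BLACKWELL_TRANSISTORS_B, HOPPER_TRANSISTORS_B)

# Assumption for illustration only: the two dies split the total evenly.
per_die_b = BLACKWELL_TRANSISTORS_B / 2

print(f"Blackwell vs. Hopper: {ratio:.1f}x the transistors")
print(f"Per die, assuming an even split: {per_die_b:.0f} billion")
```

Under that even-split assumption, each Blackwell die on its own would already hold more transistors than an entire Hopper GPU.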

The B200 introduced a second-generation Transformer Engine with support for FP4, a 4-bit floating-point format that, according to NVIDIA, doubles both inference throughput and the model size a chip can handle compared with Hopper's 8-bit FP8. Its fifth-generation NVLink delivers 1.8 terabytes per second of bidirectional bandwidth per GPU. NVIDIA paired two B200 GPUs with a Grace CPU to form the GB200 Superchip. A rack-scale GB200 NVL72 system, NVIDIA said, could deliver up to a 30-fold performance increase over the same number of H100 GPUs for trillion-parameter inference while cutting cost and energy use by up to 25 times.
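The link between a 4-bit format and "double the model size" comes down to storage per parameter. The following is a rough sketch of that arithmetic, not a description of how the Transformer Engine works: it counts only raw weight storage and ignores scaling factors, activations, and other runtime memory. The function names and the 100 GB memory budget are hypothetical, chosen purely for illustration.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Raw storage for the weights alone, in bytes."""
    return num_params * bits_per_param / 8


def max_params(memory_bytes: float, bits_per_param: int) -> float:
    """How many parameters fit in a given memory budget (weights only)."""
    return memory_bytes * 8 / bits_per_param


ONE_TRILLION = 10**12

# Weight storage for a trillion-parameter model, in terabytes.
fp8_tb = weight_bytes(ONE_TRILLION, 8) / 1e12
fp4_tb = weight_bytes(ONE_TRILLION, 4) / 1e12

# Hypothetical memory budget, not a specification of any real GPU.
HYPOTHETICAL_MEMORY_BYTES = 100e9
growth = (max_params(HYPOTHETICAL_MEMORY_BYTES, 4)
          / max_params(HYPOTHETICAL_MEMORY_BYTES, 8))

print(f"1T parameters at FP8: {fp8_tb:.1f} TB of weights")
print(f"1T parameters at FP4: {fp4_tb:.1f} TB of weights")
print(f"Parameters that fit in the same memory: {growth:.0f}x")
```

Halving the bits per parameter halves the bytes that must be stored and moved, which is why the same memory can hold roughly twice as many parameters.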

The transistor count is a useful marker of how fast AI silicon is growing. For a general reader, it shows that even as traditional chip scaling slows, the chips driving AI keep getting dramatically larger and more capable from one generation to the next, as designers combine new process technology, lower-precision math, and faster interconnects.