Google Cloud TPU v5p

On December 6, 2023, Google introduced Cloud TPU v5p alongside its AI Hypercomputer architecture, calling it the company’s most powerful and scalable AI accelerator to date. Each v5p pod links 8,960 chips, more than double the 4,096 chips of a TPU v4 pod, in a three-dimensional torus topology, with 4,800 gigabits per second of inter-chip interconnect bandwidth per chip. Compared with v4, each v5p chip delivers more than double the floating-point operations per second and about three times the high-bandwidth memory.

Google reported that v5p trains large language models about 2.8 times faster than v4 and, thanks to its second-generation SparseCore units designed for recommendation workloads, trains embedding-heavy models about 1.9 times faster. Earlier in 2023 Google had launched the cost-optimized TPU v5e, positioned for better price-performance than v4, so the two chips represented a split strategy: v5e for efficiency and v5p for raw scale.

The v5p arrived as Google trained its Gemini family of models and positioned its custom silicon as an alternative to buying NVIDIA GPUs for the largest workloads. For a business reader, TPU v5p illustrates how a major cloud provider uses vertically integrated hardware to control cost and capacity for frontier AI, rather than depending entirely on a single chip vendor.
