TPU v4
AI Compute Race Escalates: Google Unveils Next-Generation Ironwood TPU Architecture with a 16-Fold Performance Jump and 4614 TFLOPs per Chip
Hua Er Jie Jian Wen· 2025-08-25 12:42
Core Insights
- The AI infrastructure arms race is accelerating, with Google's latest TPU platform, Ironwood, setting a new benchmark in performance [1][4]
- Ironwood's seventh-generation TPU architecture boasts a peak performance of 4614 TFLOPs, representing over a 16-fold increase compared to the TPU v4 launched in 2022 [5][8]

Performance Leap
- Ironwood's single-chip peak performance reaches 4614 TFLOPs, equipped with 192 GB of high-bandwidth memory (HBM) and a bandwidth of 7.4 TB/s [5]
- In comparison, the TPU v4 released in 2022 had a performance of 275 TFLOPs with 32 GB HBM and 1.2 TB/s bandwidth, while the TPU v5p from 2023 had 459 TFLOPs, 95 GB HBM, and 2.8 TB/s bandwidth [5][8]

System Architecture
- The Ironwood platform is designed as a modular and scalable system, integrating the Ironwood SoC chip into a comprehensive architecture that includes racks and clusters [11]
- An Ironwood TPU rack consists of 64 chips, with 16 PCBA motherboards stacked together, utilizing a 3D Torus network topology for efficient interconnectivity [14]

Scalability and Cluster Design
- The system can connect up to 43 computing units, each containing 64 chips, forming a massive cluster with a network bandwidth of 1.8 Petabytes [14]
- The Ironwood Superpod will include 9216 chips, further expanding the scale compared to previous generations [8]

Energy Consumption and Cooling Solutions
- A fully loaded Ironwood rack can exceed 100 kW in power consumption, necessitating advanced power and cooling solutions [17]
- Google has implemented an efficient liquid cooling system for the Ironwood racks, including a CBU rack for coolant distribution and a leak detection system [17]
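
The per-chip and per-rack figures quoted above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the dictionary layout and all names are hypothetical, the numbers are copied from the summary as reported, and the derived values (chips per board, racks per Superpod, aggregate peak) assume chips are spread evenly and that peak figures simply add up, which the source does not state.

```python
# Per-chip figures as quoted in the summary (peak TFLOPs, HBM in GB, HBM bandwidth in TB/s).
# The structure and names here are illustrative, not from any Google tooling.
SPECS = {
    "TPU v4": {"tflops": 275, "hbm_gb": 32, "bw_tbs": 1.2},
    "TPU v5p": {"tflops": 459, "hbm_gb": 95, "bw_tbs": 2.8},
    "Ironwood": {"tflops": 4614, "hbm_gb": 192, "bw_tbs": 7.4},
}


def ratio(metric, newer="Ironwood", older="TPU v4"):
    """Return how many times larger `metric` is on `newer` than on `older`."""
    return SPECS[newer][metric] / SPECS[older][metric]


# System-level figures as quoted in the summary.
CHIPS_PER_RACK = 64
BOARDS_PER_RACK = 16
SUPERPOD_CHIPS = 9216

# Derived values; these assume an even distribution of chips (an assumption).
chips_per_board = CHIPS_PER_RACK // BOARDS_PER_RACK
racks_per_superpod = SUPERPOD_CHIPS // CHIPS_PER_RACK

# Naive aggregate peak: per-chip peak times chip count, ignoring any real-world losses.
superpod_peak_tflops = SUPERPOD_CHIPS * SPECS["Ironwood"]["tflops"]

if __name__ == "__main__":
    for metric in ("tflops", "hbm_gb", "bw_tbs"):
        print(f"Ironwood vs TPU v4, {metric}: {ratio(metric):.2f}x")
    print(f"Ironwood vs TPU v5p, tflops: {ratio('tflops', older='TPU v5p'):.2f}x")
    print(f"chips per board: {chips_per_board}")
    print(f"racks per Superpod: {racks_per_superpod}")
    print(f"naive Superpod peak: {superpod_peak_tflops / 1e6:.1f} exaFLOPs")
```

The compute ratio against the TPU v4 comes out just under 17, consistent with the "over 16-fold" claim, while memory capacity and bandwidth grow by roughly a factor of six over the same span.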