Demand Analysis of GPUs and Optical Modules
傅里叶的猫·2025-08-29 15:33

Core Viewpoint
- The article discusses the rising demand for optical modules in AI clusters and how that demand depends on the architecture and scale of the cluster network [2][5][10].

Group 1: Optical Module Requirements
- In Huawei's CM384 super node, the ratio of NPUs to optical modules is 1:18, so 384 NPUs require 6,912 optical modules (384 × 18) [4].
- A comparison of optical module usage in Huawei and NVIDIA servers shows that CM384 needs significantly more optical modules, indicating a trend towards "full optical interconnection" [5].
- Optical module demand grows non-linearly with cluster scale, because larger clusters need more complex network architectures [6][10].

Group 2: Network Architecture Impact
- In a small cluster of 1,024 GPUs, the ratio of optical modules to GPUs is approximately 2.5. It jumps to 3.5 at 4,096 GPUs, because a third layer of core switches has to be introduced [6][8].
- For ultra-large clusters (e.g., 100,000 GPUs), the ratio of optical modules to GPUs can reach up to 4, reflecting a significant increase in network complexity [6][10].

Group 3: Cost Differences Among Solutions
- Interconnect solutions differ notably in cost. NVIDIA's InfiniBand solution is the most expensive at approximately $3.9 billion, with a ratio of 3.6 optical modules per GPU [11].
- Broadcom's Ethernet solution is the most cost-effective at around $3.5 billion, with a lower ratio of 2.6 optical modules per GPU, saving approximately $400 million compared to InfiniBand [11].

Group 4: Future Trends
- As GPU clusters continue to grow, the network architecture may evolve to four or even five layers, potentially increasing the ratio of optical modules to GPUs from 3.5 to 4.5 [10].
- Broadcom's Ethernet solution is expected to gain traction due to its cost advantages, particularly in large-scale deployments where budget constraints are a concern [10].
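
The arithmetic behind the figures above can be reproduced with a short sketch. It only restates the numbers quoted in the summary. The function and variable names are illustrative, and treating the approximate per-scale ratios as exact multipliers is an assumption made for the example (the value for 100,000 GPUs is an upper bound, "up to 4").

```python
# Minimal sketch: re-derive the module counts and cost gap quoted in the summary.
# All names here are hypothetical; the ratios are the article's approximations.

def optical_modules(accelerators: int, modules_per_accelerator: float) -> int:
    """Estimated optical module count for a given accelerator count and ratio."""
    return round(accelerators * modules_per_accelerator)

# Huawei CM384: 384 NPUs at a 1:18 NPU-to-module ratio.
cm384_modules = optical_modules(384, 18)

# Approximate modules-per-GPU ratio at each cluster size mentioned in the text.
ratio_by_cluster_size = {
    1_024: 2.5,    # small cluster
    4_096: 3.5,    # third layer of core switches introduced
    100_000: 4.0,  # ultra-large cluster, upper bound ("up to 4")
}
modules_by_cluster_size = {
    gpus: optical_modules(gpus, ratio)
    for gpus, ratio in ratio_by_cluster_size.items()
}

# Quoted solution costs, in millions of USD to keep the arithmetic in integers.
infiniband_cost_musd = 3_900
ethernet_cost_musd = 3_500
savings_musd = infiniband_cost_musd - ethernet_cost_musd
savings_pct = 100 * savings_musd / infiniband_cost_musd

if __name__ == "__main__":
    print(f"CM384 optical modules: {cm384_modules}")
    for gpus, modules in modules_by_cluster_size.items():
        print(f"{gpus:>7} GPUs -> about {modules} optical modules")
    print(f"Ethernet saves about ${savings_musd}M ({savings_pct:.1f}%) vs InfiniBand")
```

The ratio, not the GPU count alone, drives the non-linear growth: going from 1,024 to 4,096 GPUs multiplies the GPU count by 4 but the estimated module count by 5.6, because the ratio rises from 2.5 to 3.5.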