Lingqu Protocol
Supernode and Scale-up Network Industry Report: Google, AMD, and Domestic Supernodes Keep Gaining Ground, Breaking Nvidia's Dominance
Sohu Finance · 2026-03-06 01:55
Core Insights
- The report discusses the rapid development of the supernode and Scale-up network industry, highlighting the competitive landscape involving Nvidia, Google, AMD, and Huawei, which is challenging Nvidia's dominance in the market [2][22].

Group 1: Nvidia
- Nvidia maintains a leading advantage in supernode technology through NVLink and NVLink Switch, with plans to release advanced solutions like GH200 NVL72 and GB200/GB300 NVL72 between 2024 and 2025, expecting a shipment of approximately 2,800 units by 2025 [4][40].
- The NVLink architecture is designed for high bandwidth and low latency data transmission, with NVLink 5 Switch achieving a single GPU-to-GPU bandwidth of 1,800 GB/s and a total bandwidth of 130 TB/s for 72 GPUs by 2025 [5][40].
- Future developments include the introduction of NVSwitch Gen6 and Gen7, which will further enhance GPU-to-GPU communication bandwidth to 3.6 TB/s [5].

Group 2: Huawei
- Huawei is working on the Lingqu protocol, which is transitioning to an open standard, although it has not yet gained widespread acceptance in the domestic industry [6].
- The Atlas 950 supernode, expected to launch in Q4 2026, will feature a total computing power of 8 EFLOPS (FP8) and a memory capacity of 1,152 TB, significantly surpassing Nvidia's offerings [7].
- Huawei's approach involves a hybrid design of copper and optical interconnects to balance complexity, reliability, and power consumption while maintaining system scalability [7].

Group 3: Google
- Google has established a mature optical interconnect supernode with its TPU series, including TPU v4, TPU v5p, and TPU v7, which are set to be released between 2023 and 2025 [8].
- The TPU v7 will be utilized by Anthropic, which plans to procure nearly 1 million TPU v7 Ironwood AI chips for deployment in its data centers [8].
- Google's competitive edge lies in its unique application of optical circuit switches (OCS) in Scale-up networks, creating a significant technological barrier against competitors [9].

Group 4: AMD
- AMD's UALink has emerged as an important open standard, with the first version released in January 2025 and a second version expected in 2026, gaining widespread industry support [10].
- The Helios supernode is positioned as a strong competitor to Nvidia's NVL72 series, featuring a dual-width rack design that allows for future scalability without redesigning infrastructure [10].
- The Helios rack is anticipated to become a mainstream choice in the industry, with significant advantages in power consumption compared to Nvidia's offerings [10].
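
The NVLink 5 figures quoted above are consistent with each other if the 130 TB/s total is read as the per-GPU bandwidth multiplied by the GPU count. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1,000 GB) and a simple sum over all 72 GPUs; the function name is hypothetical and is not taken from the report:

```python
def aggregate_bandwidth_tb_s(per_gpu_gb_s: float, gpu_count: int) -> float:
    """Sum per-GPU bandwidth over all GPUs and convert GB/s to TB/s.

    Assumes decimal units (1 TB = 1,000 GB) and that the total is a
    plain sum of the per-GPU figures, which is an interpretation of
    the report's numbers rather than something it states.
    """
    return per_gpu_gb_s * gpu_count / 1000.0


# Figures quoted in the report for NVLink 5 in a 72-GPU domain.
total = aggregate_bandwidth_tb_s(per_gpu_gb_s=1800, gpu_count=72)
print(f"{total:.1f} TB/s")  # 129.6 TB/s, which rounds to the quoted 130 TB/s
```

Under these assumptions the product comes to 129.6 TB/s, so the quoted 130 TB/s appears to be a rounded value.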
Communications: Supernode and Scale-up Network Industry: Google, AMD, and Domestic Supernodes Keep Gaining Ground, Breaking Nvidia's Dominance
Dongxing Securities · 2026-03-03 00:24
Investment Rating
- The report maintains a "Positive" outlook on the supernode and Scale-up network industry, highlighting its rapid development and potential as a key infrastructure for AI applications [2].

Core Insights
- The supernode and Scale-up network are critical infrastructures that break through computing and communication bottlenecks, supporting trillion-parameter large models and highly real-time applications. The report analyzes the progress and advantages of leading AI computing chip manufacturers, including NVIDIA, Google, AMD, and Huawei, in this field [4][24].

Summary by Sections

1. NVIDIA
- NVIDIA's leading advantage in supernode technology is based on NVLink and NVLink Switch. The company plans to launch several mature supernode solutions, including GH200 NVL72 and GB200/GB300 NVL72, with an expected shipment of approximately 2,800 units by 2025 [5][6].
- The NVLink technology enables high bandwidth and low latency data transmission, with NVLink 5 Switch supporting a single GPU-to-GPU bandwidth of 1,800 GB/s and a total bandwidth of 130 TB/s for 72 GPUs [6][40].
- Future developments include the introduction of the Vera Rubin NVL144 and Rubin Ultra NVL576, which will increase the number of interconnected GPUs from 72 to 576 [5][6].

2. Huawei
- Huawei has introduced the Lingqu protocol, transitioning to an open standard from version 2.0, although it has not yet gained widespread acceptance in the domestic industry. The company aims to catch up with NVIDIA in supernode performance through a clustered approach [7][8].
- The Atlas 950 supernode, expected to be released in Q4 2026, will have a total computing power of 8 EFLOPS (FP8) and a memory capacity of 1,152 TB, significantly surpassing NVIDIA's offerings [7][8].

3. Google
- Google has established a mature optical interconnect supernode with its TPU series, including TPU v4, TPU v5p, and TPU v7, which have been recognized by external enterprises [9][10].
- The competitive advantage of Google's TPU supernode lies in its unique application of optical circuit switches (OCS), which creates a high barrier to entry in the optical interconnect field [9][10].

4. AMD
- AMD's UALink has become an important open standard, with the 1.0 version released in January 2025 and the 2.0 version expected in 2026. The UALink ecosystem is anticipated to see significant development by 2027, with more than 100 member organizations supporting it [11].
- The Helios rack from AMD is positioned as a strong competitor to NVIDIA's NVL72 series, featuring a dual-width design that balances complexity, reliability, and performance [11].

5. Investment Strategy
- The report suggests a positive outlook for Google, AMD, and domestic supernode manufacturers, as well as for NVIDIA's supply chain, including PCB backplanes, high-speed copper cables, optical modules, and cooling systems [13][14].
- The market is expected to continue reassessing the value of Google, AMD, and domestic supernode sectors as competition intensifies [13].