NVSwitch

Compute Chip Spotlight Series: How to Understand Scale-up Networks and High-Speed SerDes Chips?
Soochow Securities· 2025-08-21 09:35
Investment Rating
- The report maintains an "Overweight" rating for the electronics industry [1]

Core Insights
- In the AI chip scale-up sector, NVIDIA is currently the dominant player, using its proprietary NVLink technology to interconnect up to 576 GPUs at 1.8 TB/s per GPU, far outperforming competitors that rely on PCIe-based interconnects [11][12]
- The UALink alliance, founded by major companies including AMD, AWS, Google, and Cisco, aims to build an open ecosystem, although unseating NVIDIA's NVLink remains difficult [11][12]
- The report stresses the importance of high-speed SerDes technology, which underpins AI chip interconnectivity, and highlights the need for domestic development in this area to achieve self-sufficiency [45][46]

Summary by Sections
1. Scale-up Overview
- The report describes the two main camps in AI chip interconnect technology, proprietary protocols and open ecosystems, with NVIDIA's NVLink being the most mature and effective solution [11][12]
2. NVLink and NVSwitch
- NVLink uses a layered protocol design that improves data transmission reliability, while NVSwitch acts as a high-capacity switch enabling efficient GPU-to-GPU communication [14][15]
3. NVIDIA's Interconnect Strategy
- NVIDIA uses NVLink for GPU-to-GPU connections and PCIe for GPU-to-CPU connections, with future developments potentially allowing direct NVLink connections to CPUs [21][30]
4. Domestic Alternatives for AI Chip Scale-up
- The report notes that building a domestic alternative to NVLink is challenging, but the UALink initiative may open new opportunities for local AI chip development [45][46]
5. Investment Recommendations
- The report recommends focusing on companies such as 盛科通信 (Centec Networks) and 海光信息 (Hygon Information Technology), while also monitoring 万通发展 (Wantong Development) and 澜起科技 (Montage Technology) for potential investment opportunities [6]
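
To make the gap concrete, here is a minimal back-of-the-envelope sketch comparing the report's 1.8 TB/s NVLink figure with a PCIe 5.0 x16 link. The PCIe parameters (32 GT/s per lane, 16 lanes, 128b/130b encoding) are standard published values assumed for illustration; only the 1.8 TB/s figure comes from the report.

```python
# Rough comparison of per-GPU scale-up bandwidth: NVLink (report figure)
# versus a PCIe 5.0 x16 link (standard parameters, assumed for illustration).

nvlink_bidir_bytes_per_s = 1.8e12          # 1.8 TB/s per GPU, as cited above

pcie5_gts_per_lane = 32e9                  # 32 GT/s per lane
pcie5_lanes = 16
pcie5_encoding = 128 / 130                 # 128b/130b line encoding
pcie5_per_dir = pcie5_gts_per_lane * pcie5_lanes * pcie5_encoding / 8  # bytes/s
pcie5_bidir = 2 * pcie5_per_dir

print(f"PCIe 5.0 x16, bidirectional: {pcie5_bidir / 1e9:.0f} GB/s")
print(f"NVLink, bidirectional:       {nvlink_bidir_bytes_per_s / 1e12:.1f} TB/s")
print(f"Ratio: ~{nvlink_bidir_bytes_per_s / pcie5_bidir:.0f}x")
```

The ratio comes out at roughly 14x, which is consistent with the scale-up advantage the report attributes to NVLink over PCIe-based interconnects.
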
Broadcom Takes On Nvidia's InfiniBand and NVSwitch with a Single Chip
半导体行业观察· 2025-07-18 00:57
Core Viewpoint
- InfiniBand has long been the dominant interconnect fabric for high-performance computing (HPC) and AI, but its market position is now challenged by Broadcom's new low-latency Ethernet switch, Tomahawk Ultra, which aims to replace both InfiniBand and NVSwitch in AI and HPC clusters [3][5][26]

Group 1: InfiniBand and Its Evolution
- InfiniBand gained traction thanks to Remote Direct Memory Access (RDMA), which lets CPUs, GPUs, and other processing units read and write each other's memory directly, a capability crucial for AI model training [3]
- Nvidia's $6.9 billion acquisition of Mellanox Technologies was driven by the anticipated growth of generative AI, which would require InfiniBand for GPU server connectivity [3][4]
- The rise of large language models and generative AI has propelled InfiniBand to new heights, with NVLink and NVSwitch providing significant advantages within AI server nodes [4]

Group 2: Broadcom's Tomahawk Ultra
- Broadcom's Tomahawk Ultra aims to replace InfiniBand as the backend network for HPC and AI clusters, offering low-latency, lossless Ethernet capabilities [5][6]
- Development of Tomahawk Ultra predates the rise of generative AI and originally targeted latency-sensitive applications [5]
- Tomahawk Ultra's architecture supports shared-memory clusters, speeding up communication among processing units compared with traditional InfiniBand or Ethernet [5][6]

Group 3: Performance Metrics
- InfiniBand packets typically range from 256 B to 2 KB, while Ethernet switches are often tuned for larger packets, which affects performance on HPC workloads [7]
- InfiniBand has historically delivered lower latency than Ethernet, with steady improvements over the years, such as 130 nanoseconds for 200 Gb/s HDR InfiniBand [10][11]
- Broadcom's Tomahawk Ultra claims a port-to-port hop latency of 250 nanoseconds and a throughput of 77 billion packets per second, outperforming traditional Ethernet switches [12][28]

Group 4: Competitive Landscape
- InfiniBand's advantages in latency and packet throughput have made it the preferred choice for HPC workloads, but Ethernet technologies are rapidly evolving to close the gap [6][10]
- Nvidia's NVSwitch is also under threat from Tomahawk Ultra, which is part of a broader strategy to extend Ethernet into AI and HPC applications [26][29]
- Optimized Ethernet headers and lossless features in Tomahawk Ultra aim to improve performance while remaining compatible with existing standards [15][16]
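
As a rough sanity check on the throughput figure above, the sketch below relates packets per second to line rate and minimum frame size. The 51.2 Tb/s aggregate switching capacity and the 20 bytes of per-frame preamble plus interpacket gap are assumptions used for illustration, not figures taken from the article.

```python
# Relate a switch's aggregate line rate to its small-packet throughput.
# Assumptions (not from the article): 51.2 Tb/s capacity, 64 B minimum
# Ethernet frames, 20 B of preamble + interpacket gap per frame on the wire.

line_rate_bps = 51.2e12
frame_bytes = 64
overhead_bytes = 20            # preamble (8 B) + interpacket gap (12 B)

wire_bits_per_frame = (frame_bytes + overhead_bytes) * 8
packets_per_second = line_rate_bps / wire_bits_per_frame
print(f"~{packets_per_second / 1e9:.0f} billion packets/s at minimum frame size")
# -> ~76 billion packets/s, broadly in line with the ~77 Bpps figure cited above.
```
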
Nvidia Was Built on Piles of Money
半导体行业观察· 2025-03-31 01:43
Core Insights
- Nvidia is positioned as the leader in the GPU market, with significant contributions from CEO Jensen Huang and Chief Scientist Bill Dally, who focus on advancing technology and research [1][2][3]
- The company's substantial R&D investments have been pivotal in establishing its dominance in high-performance computing (HPC), analytics, and AI workloads [3][4][5]

R&D Investment and Financial Performance
- Nvidia's R&D spending has historically been high, peaking at 34.2% of revenue in Q1 FY2015, reflecting its commitment to leveraging AI advancements [7][11]
- Despite a recent decline in R&D spending as a percentage of revenue, Nvidia's total R&D budget has grown significantly, reaching $12.91 billion in FY2025, a 48.9% increase from FY2024 [12][14]
- The company has maintained R&D expenditure of 20% to 25% of revenue over the past 15 years, comparable to other tech giants such as Meta and Google [11][12]

Technological Advancements and Market Position
- Nvidia's CUDA platform has been instrumental in creating a vast ecosystem of over 900 libraries and frameworks, solidifying its position in the AI and HPC markets [9][10]
- The scarcity and high cost of High Bandwidth Memory (HBM) have allowed Nvidia to maintain a competitive edge over AMD, as it can afford to pay premium prices for necessary components [10][11]
- Nvidia's research efforts are divided into supply-side and demand-side initiatives, focused on advancing GPU technology and expanding the application areas for accelerated computing [16][18]

Future Outlook and Strategic Direction
- Nvidia is preparing for future waves of AI innovation, including what it terms "physical AI," indicating a proactive approach to emerging technologies [7][12]
- The company is also exploring quantum computing and has established dedicated research teams to assess its potential impact [16][18]
- Nvidia's strategy includes acquiring technologies from third parties and integrating them into its offerings, exemplified by its acquisition of Mellanox Technologies [28]
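
A quick arithmetic check of the two quoted R&D figures: if FY2025 R&D was $12.91 billion and that represents a 48.9% year-over-year increase, the implied FY2024 budget follows directly. This sketch uses only the numbers cited above.

```python
# Back out the implied FY2024 R&D budget from the two figures quoted above.

rd_fy2025_billion = 12.91      # FY2025 R&D, $B (cited above)
yoy_growth = 0.489             # 48.9% increase over FY2024 (cited above)

rd_fy2024_billion = rd_fy2025_billion / (1 + yoy_growth)
print(f"Implied FY2024 R&D: ${rd_fy2024_billion:.2f}B")   # ~ $8.67B
```
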
PCIe: Broadcom's New Chip Roadmap
半导体行业观察· 2025-02-28 03:08
Core Viewpoint
- The article discusses the evolution and significance of PCI-Express technology, particularly the upcoming PCI-Express 6.0, which aims to increase bandwidth and reduce latency for AI and HPC systems. The transition to the new standard is crucial for meeting the demands of modern computing environments, especially AI server architectures [1][2][3]

Group 1: PCI-Express Evolution
- PCI-Express bandwidth increases roughly every three years, with the latest version, PCI-Express 6.0, expected to double the data rate while keeping latency flat [1][2]
- The move to PAM-4 encoding in PCI-Express 6.0 allows higher data rates but introduces challenges such as increased error rates, necessitating advanced error-correction techniques [3][4]
- Broadcom has been a key player in the development of PCI-Express switches and retimers, with its latest products supporting both the PCI-Express 5.0 and 6.0 standards [6][7]

Group 2: Market Demand and Applications
- Demand for PCI-Express switches and retimers has surged, driven by the need for higher bandwidth in AI servers, which often carry multiple GPUs and accelerators [7][8]
- A typical AI server with eight GPUs uses four PCI-Express switches, underscoring the importance of high channel counts for performance [7]
- The complexity of AI server architectures requires robust PCI-Express solutions so that components can communicate without routing through a central CPU [8][9]

Group 3: Future Prospects
- The introduction of PCI-Express 6.0 is seen as a pivotal step for the industry, with expectations of widespread adoption in AI server manufacturing by late 2024 [6][11]
- There is speculation about new architectures that could push bandwidth beyond current PCI-Express capabilities, possibly resembling Nvidia's NVLink technology [9][10][11]
- The article emphasizes the need for a coherent telemetry system to support the growing complexity of AI ecosystems, which Broadcom aims to address through its interoperability development platform [8]
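
The generation-over-generation doubling described above can be illustrated with a short sketch of approximate x16 per-direction bandwidth by PCIe generation. The per-lane transfer rates are the published PCIe base rates; encoding and protocol overheads are ignored for simplicity, so the numbers are slightly optimistic.

```python
# Approximate x16 per-direction bandwidth across recent PCIe generations,
# ignoring encoding and protocol overhead. PCIe 6.0 reaches 64 GT/s per lane
# by moving from NRZ to PAM-4 signaling, doubling the 5.0 rate.

transfer_rates_gts = {          # GT/s per lane (published base rates)
    "PCIe 3.0": 8,
    "PCIe 4.0": 16,
    "PCIe 5.0": 32,
    "PCIe 6.0": 64,             # PAM-4, FLIT-based
}

lanes = 16
for gen, gts in transfer_rates_gts.items():
    gbytes_per_dir = gts * lanes / 8        # GB/s per direction, raw
    print(f"{gen}: x16 ≈ {gbytes_per_dir:.0f} GB/s per direction (raw)")
```
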