NVLink

NVIDIA's Data Center Grows Fast: Can It Sustain the Momentum?
ZACKS· 2025-08-22 14:56
Key Takeaways
- NVIDIA's Data Center revenues surged 73.3% year over year to $39.1 billion in Q1 fiscal 2026.
- Strong adoption of Blackwell and demand from hyperscalers like Microsoft, Google and Amazon drove growth.
- U.S. export approval for H20 chips to China and new AI chip plans further support Data Center momentum.

NVIDIA Corporation's (NVDA) Data Center business has become its primary growth engine, driven by increasing demand for artificial intelligence (AI) infrastructure. In the first quarter of fiscal 20 ...
Compute Chip Spotlight Series: How to Understand Scale-up Networks and High-Speed SerDes Chips?
Soochow Securities· 2025-08-21 09:35
Investment Rating
- The report maintains an "Overweight" rating for the electronic industry [1]

Core Insights
- In the AI chip Scale-up sector, NVIDIA is currently the dominant player, using its proprietary NVLink technology to interconnect up to 576 GPUs at a communication speed of 1.8TB/s, significantly outperforming competitors that rely on PCIe protocols [11][12]
- The UALink alliance, established by major companies such as AMD, AWS, Google, and Cisco, aims to create an open ecosystem, although displacing NVIDIA's NVLink remains difficult [11][12]
- The report emphasizes the importance of high-speed SerDes technology, which is crucial for AI chip interconnectivity, and highlights the need for domestic development in this area to achieve self-sufficiency [45][46]

Summary by Sections
1. Scale-up Overview - The report describes the two main camps in AI chip interconnect technology, proprietary protocols and open ecosystems, with NVIDIA's NVLink being the most mature and effective solution [11][12]
2. NVLink and NVSwitch - NVLink is described as a layered protocol design that enhances data transmission reliability, while NVSwitch acts as a high-capacity switch facilitating efficient GPU communication [14][15]
3. NVIDIA's Interconnect Strategy - NVIDIA employs NVLink for GPU-to-GPU connections and PCIe for GPU-to-CPU connections, with future developments potentially allowing direct NVLink connections to CPUs [21][30]
4. Domestic Alternatives for AI Chip Scale-up - The report suggests that a domestic alternative to NVLink will be hard to achieve, but the UALink initiative may open new opportunities for local AI chip development [45][46]
5. Investment Recommendations - The report recommends focusing on companies such as Centec Networks (盛科通信) and Hygon Information (海光信息), while also monitoring Wantong Development (万通发展) and Montage Technology (澜起科技) for potential investment opportunities [6]
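The 1.8TB/s figure quoted for NVLink can be put in context with a back-of-envelope comparison against a PCIe 5.0 x16 link. A minimal sketch using public spec figures (18 NVLink 5 links per GPU at 100 GB/s bidirectional each; 32 GT/s per PCIe 5.0 lane with 128b/130b encoding), treated as approximate:

```python
# Per-GPU Scale-Up bandwidth: NVLink 5 vs. a PCIe 5.0 x16 link.
# Figures are public spec numbers, treated here as approximate.

NVLINK5_LINKS_PER_GPU = 18
NVLINK5_GBPS_PER_LINK = 100  # GB/s bidirectional per link

PCIE5_GT_PER_LANE = 32       # GT/s per lane (1 bit per transfer per lane)
PCIE5_LANES = 16
PCIE5_ENCODING = 128 / 130   # 128b/130b line-code efficiency

nvlink5 = NVLINK5_LINKS_PER_GPU * NVLINK5_GBPS_PER_LINK               # GB/s
pcie5_x16 = PCIE5_GT_PER_LANE * PCIE5_LANES * PCIE5_ENCODING / 8 * 2  # GB/s, both directions

print(f"NVLink 5 per GPU : {nvlink5} GB/s")   # 1800 GB/s = 1.8 TB/s
print(f"PCIe 5.0 x16     : {pcie5_x16:.0f} GB/s")
print(f"advantage        : {nvlink5 / pcie5_x16:.1f}x")
```

The roughly 14x gap per GPU is the practical reason the report treats PCIe-based interconnects as uncompetitive with NVLink for GPU-to-GPU traffic.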
AI算力跟踪深度(三):从英伟达的视角看算力互连板块成长性:ScaleUp网络的“ScalingLaw”存在吗?
Soochow Securities· 2025-08-20 05:35
Industry research report · August 20, 2025. Please be sure to read the disclaimer at the end of the report.

Core view: We believe a Scaling Law exists for the Scale-Up network. A second, rack-to-rack tier of the Scale-Up network will gradually emerge, adding optical + AEC connection demand at a 1:9 chip-to-connection ratio and switch demand at a 4:1 chip-to-switch ratio, both a multiple of the corresponding Scale-Out figures:
1. NVIDIA keeps enlarging its Scale-Up network along two paths. 1) Raising per-GPU bandwidth: NVLink continues to iterate, with NVLink 5.0 reaching 7200 Gb/s per GPU. 2) Expanding super-node scale: Scale-Up super-nodes keep growing, from the H100 NVL8 to GH200 to GB200. Rack-scale solutions such as NVL72 raise training and inference efficiency but are not the ceiling of Scale-Up; NVL72-class racks will later serve as the smallest node, assembled like building blocks across racks into even larger Scale-Up super-nodes, at which point optical links will be needed for communication.

Securities analyst: Zhang Liangwei (张良卫), practicing certificate no. S0600516070001, contact email: zhanglw@d ...
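The report's attach-rate claims (optical + AEC links at 1:9 per chip, switches at 4:1) translate directly into a sizing sketch. `scale_up_attach` below is a hypothetical helper, not from the report; it simply applies those two stated ratios:

```python
def scale_up_attach(num_chips: int) -> dict:
    """Incremental second-tier Scale-Up network sizing using the
    report's stated ratios: 9 optical/AEC links per chip and
    1 switch per 4 chips. Illustrative only."""
    return {
        "optical_aec_links": num_chips * 9,
        "switches": num_chips // 4,
    }

# Example: treating one NVL72-class rack (72 GPUs) as the minimal node.
print(scale_up_attach(72))  # {'optical_aec_links': 648, 'switches': 18}
```

Both figures scale linearly with chip count, which is why the report frames rack-to-rack Scale-Up as a volume driver for optics, AEC, and switch vendors.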
Comparing Domestic and International AI Server Scale-up Solutions
傅里叶的猫· 2025-08-18 15:04
Core Viewpoint
- The article compares the Scale Up solutions of major domestic and international companies in AI data centers, highlighting the importance of high-performance interconnect technologies and architectures for enhancing computational capabilities.

Group 1: Scale Up Architecture
- Scale Up increases computational power by raising the density of individual servers, integrating more high-performance GPUs, larger memory, and faster storage to create "super nodes" [1]
- It is characterized by high bandwidth and low latency, making it well suited to AI inference and training tasks [1]
- Scale Up is often combined with Scale Out to balance single-machine performance and overall scalability [1]

Group 2: NVIDIA's NVLink Technology
- NVIDIA employs its self-developed NVLink high-speed interconnect in its Scale Up architecture, achieving high bandwidth and low latency for GPU interconnects [3]
- The GB200 NVL72 cabinet architecture integrates 18 compute trays and 9 NVLink switch trays, using copper cables for efficient interconnect [3]
- Each compute tray contains 2 Grace CPUs and 4 Blackwell GPUs, with NVSwitch trays equipped with NVSwitch5 ASICs [3]

Group 3: Future Developments
- NVIDIA's future Rubin architecture will upgrade to NVLink 6.0 and 7.0, significantly increasing bandwidth density and reducing latency [5]
- These improvements aim to support the training of ultra-large AI models with billions or trillions of parameters, addressing growing computational demands [5]

Group 4: Other Companies' Solutions
- AMD's UALink aims to provide an open interconnect standard for scalable accelerator connections, supporting up to 1024 accelerators with low latency [16]
- AWS uses its NeuronLink protocol for scale-up, extending interconnect capacity through additional switch trays [21]
- Meta employs Broadcom's SUE solution for scale-up, with plans to consider NVIDIA's NVLink Fusion in future architectures [24]

Group 5: Huawei's Approach
- Huawei adopts a multi-cabinet all-optical interconnect solution with its Cloud Matrix system, deploying Ascend 910C chips across multiple racks [29]
- The Cloud Matrix 384 configuration includes 6912 optical modules, supporting both Scale Up and Scale Out networks [29]
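The tray counts quoted for GB200 NVL72 can be sanity-checked with simple arithmetic; a short sketch using only the counts from the article:

```python
# GB200 NVL72 rack composition, as quoted in the article.
COMPUTE_TRAYS = 18
SWITCH_TRAYS = 9
GPUS_PER_COMPUTE_TRAY = 4   # Blackwell GPUs
CPUS_PER_COMPUTE_TRAY = 2   # Grace CPUs

gpus = COMPUTE_TRAYS * GPUS_PER_COMPUTE_TRAY  # the "72" in NVL72
cpus = COMPUTE_TRAYS * CPUS_PER_COMPUTE_TRAY
print(f"{gpus} GPUs, {cpus} CPUs, {SWITCH_TRAYS} NVSwitch trays per rack")
```

The product of trays and GPUs per tray recovers exactly the 72 GPUs in the rack's name, confirming the article's figures are internally consistent.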
Growing as Fast as a Rocket! The Networking Business Is the Hidden Pillar of NVIDIA's (NVDA.US) AI Chip Dominance
智通财经网· 2025-08-11 02:41
Core Viewpoint
- Investors' focus in NVIDIA's Q2 earnings report will be on its data center business, which is crucial for revenue generation through high-performance AI processors [1]

Group 1: Data Center Business
- NVIDIA's data center segment generated $115.1 billion in revenue last fiscal year, with the network business contributing $12.9 billion, surpassing the gaming segment's $11.3 billion [1]
- In Q1, the network business contributed $4.9 billion of the $39.1 billion in data center revenue, indicating strong growth potential as AI computing power expands [2]

Group 2: Network Technology
- NVIDIA's network products, including NVLink, InfiniBand, and Ethernet solutions, are essential for connecting chips and servers within data centers, enabling efficient AI application performance [1][2]
- The three network types, NVLink for intra-server communication, InfiniBand for inter-server connections, and Ethernet for storage and system management, are critical for building large-scale AI systems [3]

Group 3: Importance of the Network Business
- The network business is considered one of the most undervalued parts of NVIDIA's operations, with its growth rate described as "rocket-like" despite accounting for only 11% of total revenue [2]
- Without the network business, NVIDIA's ability to meet customer expectations for computing power would be significantly compromised [3]

Group 4: AI Model Development
- As enterprises develop larger AI models, the need for synchronized GPU performance is increasing, particularly during the inference phase, which demands higher data center system performance [4]
- The misconception that inference is simple has been challenged; it is becoming increasingly complex and similar to training, underscoring the importance of network technologies [5]

Group 5: Competitive Landscape
- Competitors such as AMD, Amazon, Google, and Microsoft are developing their own AI chips and network technologies, posing a challenge to NVIDIA's market position [5]
- Despite the competition, NVIDIA is expected to maintain its lead as demand for its chips continues to grow among tech giants, research institutions, and enterprises [5]
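The "11%" figure in the article can be reconstructed from the revenue numbers it quotes; a quick check (dollar figures in billions, from the article) suggests the share is of data-center revenue specifically:

```python
# Reconstructing the network business's revenue share from the
# article's figures (last fiscal year, $ billions).
data_center_rev = 115.1
network_rev = 12.9
gaming_rev = 11.3

share = network_rev / data_center_rev
print(f"network share of data center revenue: {share:.1%}")
print(f"network exceeds gaming by ${network_rev - gaming_rev:.1f}B")
```

The division yields roughly 11%, matching the quoted share and confirming that the networking segment already out-earns gaming.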
This Week, Reversal After Reversal
Shang Hai Zheng Quan Bao· 2025-08-10 15:46
Group 1: Nvidia's Network Technology Business
- Nvidia's network technology business is growing rapidly and has become a dark horse, with segment revenue reaching $12.9 billion, surpassing the gaming segment's $11.4 billion [3]
- The network business, which includes NVLink, InfiniBand, and Ethernet solutions, is crucial for enabling AI applications by facilitating communication between chips and servers [3]
- Despite accounting for only 11% of data center revenue, the network segment is growing significantly and drawing more attention from investors [3]

Group 2: Starlink's Declining Appeal in Kenya
- Starlink's appeal in Kenya is diminishing due to high costs and low speeds, leading users to switch to local providers that offer faster and cheaper services [5]
- In the first quarter of 2023, Starlink lost over 2,000 users in Kenya, more than 10% of its local user base [5]
- Originally aimed at bypassing government internet access regulations, Starlink's service now faces challenges in urban areas, raising questions about its future in remote towns [5]

Group 3: Brazil's Oil Exploration Amid Climate Leadership Claims
- Brazilian President Lula has shifted his stance on the development of key minerals and rare earths, now viewing it as a matter of national sovereignty while still negotiating with other countries [7]
- Despite hosting the upcoming COP30 climate conference, Brazil is increasing oil exploration, with the national oil company seeking more oil off the Amazon coast [7]
- Oil is projected to surpass soybeans as Brazil's top export product in 2024, highlighting the country's reliance on oil production to support its economy [7]

Group 4: Oil Market Tensions
- The oil market remains tense as OPEC's attempts to increase production during the summer high-demand season have proven difficult [9]
- Some member countries are struggling to boost output, while others face restrictions due to previous overproduction [9]
- Brent crude futures have recently risen to around $68 per barrel, well above the April 2025 low of roughly $58, indicating ongoing market volatility [9]
PCIe: 20 Years at Full Speed
半导体行业观察· 2025-08-10 01:52
Core Viewpoint
- The release of the PCIe 8.0 standard marks a significant milestone in the evolution of PCIe technology, doubling the data transfer rate to 256GT/s and reinforcing its critical role in high-speed data transfer across computing environments [1][38]

Group 1: Evolution of PCIe Technology
- PCIe, introduced by Intel in 2001, evolved from the original PCI standard, whose maximum bandwidth was 133 MB/s, through a series of iterations that have consistently doubled data transfer rates [3][14]
- The transition from PCI to PCIe represents a shift from parallel bus technology to a serial communication mechanism, significantly enhancing data transfer efficiency and reducing signal interference [9][11]
- The PCIe 1.0 standard initiated the serial interconnect revolution at a transfer rate of 2.5GT/s, and subsequent versions have seen substantial increases, culminating in the upcoming PCIe 8.0 [14][38]

Group 2: Key Features of PCIe
- PCIe's architecture rests on three core features: serial communication, point-to-point connections, and scalable bandwidth, which together enhance performance and reduce latency [9][11]
- Advanced signal-processing techniques, such as CTLE in PCIe 3.0 and PAM4 modulation in PCIe 6.0, have been pivotal in maintaining signal integrity at higher data rates [18][24]
- PCIe 8.0 is set to introduce new connector technologies and optimize latency and error-correction mechanisms, ensuring reliability and efficiency in high-bandwidth applications [42][38]

Group 3: Market Applications and Trends
- PCIe technology is predominantly used in cloud computing, which accounts for over 50% of its market, with growing adoption in automotive and consumer electronics [46][49]
- Demand for high-speed interconnects is driven by the growth of AI applications, high-performance computing, and data-intensive workloads, positioning PCIe as a foundational technology in these areas [45][51]
- Predictions indicate that the PCIe market in AI applications could reach $2.784 billion by 2030, a compound annual growth rate of 22% [51]

Group 4: Competitive Landscape and Challenges
- PCIe faces competition from proprietary interconnects such as NVLink and from CXL, which offer higher bandwidth and lower latency for GPU communications [55][63]
- The UALink alliance aims to create open standards for GPU networking, challenging the dominance of proprietary solutions and enhancing interoperability [56]
- Despite its established position, PCIe must navigate bandwidth limitations and evolving market demands, necessitating continuous innovation and adaptation [64][71]
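The generation-over-generation doubling described above can be tabulated. A sketch using published per-lane rates, with line-code efficiency approximated (8b/10b for Gen1-2, 128b/130b for Gen3-5, and PAM4/FLIT overhead from Gen6 on treated as negligible):

```python
# Approximate usable one-direction bandwidth of a PCIe x16 link by
# generation. Rates in GT/s per lane; efficiency covers line-code
# overhead only (FLIT/FEC overhead from Gen6 onward is ignored).
GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
    6: (64.0, 1.0),   # PAM4 signaling begins here
    7: (128.0, 1.0),
    8: (256.0, 1.0),  # the newly announced PCIe 8.0 target
}

for gen, (gt_s, eff) in GENERATIONS.items():
    gb_s_x16 = gt_s * eff * 16 / 8  # GB/s, one direction
    print(f"PCIe {gen}.0: {gt_s:6.1f} GT/s/lane -> ~{gb_s_x16:6.1f} GB/s x16")
```

One wrinkle the table makes visible: Gen3 moved to 8 GT/s rather than a clean doubling of 5 GT/s, because switching from 8b/10b to 128b/130b encoding recovered most of the difference in usable bandwidth.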
Analysis of the AI Networking Scale-Up Special-Topic Call
傅里叶的猫· 2025-08-07 14:53
Core Insights
- The article analyzes the rise of AI networking, focusing on the "Scale Up" segment: its technology trends, vendor dynamics, and outlook [1]

Group 1: Market Dynamics
- The accelerator market is divided into a "commercial market" led by NVIDIA and a "custom market" represented by Google TPU and Amazon Trainium, with the custom accelerator market expected to gradually match the GPU market in size [3]
- Scale Up networking is transitioning from a niche market to the mainstream, with revenue projected to exceed $1 billion by Q2 2025 [3]
- The total addressable market (TAM) for AI Network Scale Up is estimated at $60-70 billion, with potential upward revisions to $100 billion [12]

Group 2: Technological Evolution
- AI networking has evolved from a "single network" to a "dual network" and currently sits in a phase of "multiple network topologies," with Ethernet expected to dominate in the long term [4]
- Competition between Ethernet and NVLink is intensifying; NVLink currently leads on maturity, but Ethernet is expected to gain share over the decade [5]
- Scale Up is defined as a "cache-coherent GPU-to-GPU network," providing significantly higher bandwidth than Scale Out, with its market size expected to surpass Scale Out's by 2035 [8]

Group 3: Performance and Cost Analysis
- Scale Up shows a significant performance advantage: latency for Scale Up products such as Broadcom's Tomahawk Ultra is approximately 250ns, versus 600-700ns for Scale Out [9]
- On cost, Scale Up Ethernet products are projected to be 2-2.5 times more expensive than Scale Out products, implying a higher investment requirement for Scale Up solutions [9]

Group 4: Vendor Strategies
- Vendors are taking varied approaches to Scale Up: NVIDIA is focused on NVLink, AMD is betting on UALink, and major cloud providers like Google and Amazon are moving toward Ethernet solutions [13]
- The hardware landscape is shifting toward designs embedded in racks, and as Scale Up matures, software for network management and congestion control is likely to grow in importance [13]
NVLink, UALink, NeuronLink, SUE, PCIe – Astera Labs Switch
2025-08-05 08:17
Summary of Astera Labs (ALAB US) Conference Call

Company Overview
- **Astera Labs** is a U.S.-listed company specializing in PCIe retimer and switch chips, with a focus on the upcoming custom Scorpio-X switch chip [1]

Key Industry Insights
- **Growth Drivers**: Astera Labs' growth is driven by two main products: a custom **NeuronLink** switch chip for AWS's Trainium series, launching in the second half of this year, and a custom **UALink** switch chip for AMD's MI400 series, expected in the second half of next year [2]

Technical Comparisons
- **UALink vs. NVLink**: UALink uses SerDes with differential signaling, allowing longer-distance data transmission than NVLink's single-ended signaling, which saves chip area but limits reach [3][4]. UALink can connect up to 1,024 nodes, while NVLink is limited to 576 nodes [5]
- **UALink Protocol Versions**: UALink has two versions, 128 Gbps and 200 Gbps, with the latter suitable for GPU-to-GPU connections only [6][9]. UALink 128G supports mixed connections and is compatible with PCIe Gen7, making it suitable for model inference [9]
- **Broadcom's SUE**: SUE is a point-to-point protocol that borrows from NVLink's logic but is more limited than UALink in heterogeneous expansion [10]

Product Development
- **AMD's Helios AI Rack**: The upcoming Helios AI rack will adopt the UALink 200G protocol, with Astera Labs developing a switch chip expected to tape out in Q1 2026 [11][31]
- **AWS Trainium Series**: Astera Labs is developing the Scorpio-X switch chip for AWS's Trainium rack, which will be software-programmable and meet high-performance transmission requirements [13]

Financial Projections
- For every one million Trainium 2.5 chips deployed, Astera Labs could generate content dollar value of approximately **$1.75 billion** from large and small switch chips combined [22]
- For Trainium 3 chips, the estimated content dollar value could reach **$3.3 billion** per million chips [26]
- Additional revenue of **$150 million** is projected for every million Trainium 4 chips due to the collaboration with Alchip [28]
- Astera Labs' content dollar value for every one million MI400 GPUs used in the Helios rack is estimated at **$576 million** [32]

Conclusion
- Astera Labs is positioned to capitalize on growing demand for advanced interconnect solutions in high-performance computing, particularly through its partnerships with AWS and AMD, with significant revenue potential from its switch chip technologies [1][2][22][26][32]
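The call's "content dollar value per million chips" figures reduce to simple per-accelerator content; a quick conversion of the quoted numbers:

```python
# Per-chip dollar content implied by the call's "per million chips" figures.
content_per_million_usd = {
    "Trainium 2.5": 1.75e9,
    "Trainium 3": 3.3e9,
    "MI400 (Helios)": 576e6,
}

per_chip = {name: total / 1e6 for name, total in content_per_million_usd.items()}
for name, usd in per_chip.items():
    print(f"{name:14s}: ${usd:,.0f} of Astera content per accelerator")
```

Framed per accelerator, the generational jump in Astera content from Trainium 2.5 to Trainium 3 is nearly 2x, which is the core of the call's bull case.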
Hardware & Networking: Scale-Up AI Networking in Numbers: Key Forecasts from 650 Group for Scale-Up AI and Technology of Choice
2025-08-05 03:20
Summary of Key Points from the Conference Call on Scale-Up AI Networking

Industry Overview
- The conference call focused on the **AI Networking** industry, specifically **Scale-Up AI Networking** and its growth forecasts as provided by **650 Group** in collaboration with **J.P. Morgan** [1][3]

Core Insights and Arguments
- **AI Networking Growth**: The total addressable market (TAM) for AI networking is projected to grow from **$15 billion in 2024** to **$65 billion in 2029**, a **34% compound annual growth rate (CAGR)** over the next five years, supported by strong increases in both front-end and back-end revenues [1][3]
- **Scale-Up vs. Scale-Out Revenues**: Scale-Up AI Networking is expected to grow at a **123% CAGR**, reaching **$21 billion by 2029**, while Scale-Out revenues are projected to grow from **$11.7 billion in 2024** to **$28.8 billion in 2029**, implying a **20% CAGR** [3][6]. By 2029, Scale-Up revenues are forecast to comprise **43% of all back-end AI revenues**, up from just **3% in 2024** [3][6]
- **Long-Term Outlook**: Although Scale-Up revenues will not exceed 50% of total AI back-end revenues by 2029, analysts expect them to eventually eclipse Scale-Out revenues in the following decade, driven by demand for multi-rack scale-up technologies and higher-bandwidth solutions such as silicon photonics [6]
- **Shift to Ethernet Connectivity**: The industry is anticipated to converge on Ethernet connectivity, even for merchant ASICs, which are forecast to grow at a **22% CAGR** from **4.4 million units in 2024** to **11.9 million units in 2029** [9]. Custom ASICs are also expected to transition to Ethernet, growing at a **17% CAGR** from **5.0 million units in 2024** to **10.7 million units in 2029** [9]
- **Market Share Dynamics**: NVLink currently holds roughly a **96% share** of the Scale-Up Networking market, but that share is projected to fall to **63%** by 2029 as Ethernet-based solutions grow to **$7 billion**, capturing **31% of the market** [11]. The Scale-Out TAM is expected to be dominated by Ethernet, with limited growth for InfiniBand, positioning Ethernet networking suppliers favorably [15]

Additional Important Insights
- The forecasts suggest potential upsides rather than downsides, driven by current momentum in Cloud capital expenditures [1]
- The transition to Ethernet is seen as beneficial for operational simplicity and multi-vendor interoperability, both critical for the evolving networking landscape [11]

This summary encapsulates the key points discussed during the conference call, highlighting the growth potential and market dynamics within the AI Networking sector.
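The CAGRs quoted above can be checked against the 2024 and 2029 endpoints they accompany; a quick verification of the arithmetic:

```python
# Verify the quoted CAGRs against the 2024 -> 2029 endpoints (5 years).
def cagr(start: float, end: float, years: int = 5) -> float:
    return (end / start) ** (1 / years) - 1

print(f"AI networking TAM : {cagr(15, 65):.0%}")      # quoted as 34%
print(f"Scale-Out revenue : {cagr(11.7, 28.8):.0%}")  # quoted as 20%
print(f"Merchant ASICs    : {cagr(4.4, 11.9):.0%}")   # quoted as 22%
print(f"Custom ASICs      : {cagr(5.0, 10.7):.0%}")   # quoted as 17% (computes to ~16%)
```

The first three endpoints reproduce the quoted rates almost exactly; the custom-ASIC pair computes to about 16.4%, so the "17%" in the call is a slight round-up.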