CES 2026: Nvidia's Six Chips Upgrade in Concert as Compute and Storage Enter a New Era
Xinda Securities · 2026-01-11 15:04
Investment Rating
- The industry investment rating is "Positive" [2]

Core Viewpoints
- The release of the Nvidia Rubin platform marks a new era in AI computing power, with global computing facilities transforming towards the "AI factory" paradigm [3][39]
- The Rubin platform features six new chips designed for AI supercomputers, significantly enhancing inference performance and reducing training costs [3][7]
- The introduction of open-source models expands Nvidia's ecosystem, covering fields including biomedical AI, physical AI, and autonomous driving [3][29]

Summary by Sections

Chip Performance
- The Rubin GPU introduces a Transformer engine, achieving inference performance of 50 PFLOPS, five times that of the Blackwell GPU, while training performance reaches 35 PFLOPS, 3.5 times that of Blackwell [3][13]
- The Vera CPU is designed for data movement and intelligent processing, featuring 88 custom Nvidia cores and 1.5 TB of system memory, three times that of the Grace CPU [3][12]

Storage Solutions
- The Rubin platform addresses KV-cache bottlenecks with a new inference context memory storage platform, significantly improving memory performance and efficiency [3][18]
- Each Rubin GPU can be equipped with up to 288 GB of HBM4, with total memory bandwidth increased to 22 TB/s, 2.8 times that of Blackwell [3][14]

PCB and Rack Innovations
- The transition to a cableless interconnect architecture in the Rubin NVL72 PCB cuts assembly time 18-fold and lowers operating costs [3][22]
- The system's collaborative design improves efficiency, reducing the number of GPUs needed to train large models by 75% compared with the previous generation [3][25]

Open Source Models
- Nvidia's open-source model ecosystem expands with updates across six major areas, focused on the Nemotron series for various applications [3][32]
- The Nemotron series includes models for inference, retrieval-augmented generation, safety, and speech processing [3][32]

Physical AI Developments
- The Cosmos model is designed for understanding and generating physical-world videos, while Alpamayo serves as an open-source toolchain for autonomous driving, introducing reasoning capabilities [3][33][34]
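The KV-cache pressure that the new inference context memory tier targets is easy to quantify: the cache grows linearly with context length and can rival a GPU's HBM capacity on its own. A minimal sizing sketch, assuming hypothetical model dimensions (80 layers, 8 grouped-query KV heads, head dimension 128, FP16 values) rather than any published Rubin figure:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Size of a transformer KV cache: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical 70B-class model serving a single 128k-token context in FP16.
size = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                      seq_len=128_000, batch=1)
print(f"{size / 2**30:.1f} GiB per request")  # prints "39.1 GiB per request"
```

Under these assumptions a single long-context request already consumes roughly an eighth of a 288 GB HBM4 GPU, which is why offloading the KV cache to a dedicated context-memory tier matters.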
First HBM4 GPU Enters Full Production
半导体行业观察 (Semiconductor Industry Observation) · 2026-01-06 01:42
Core Viewpoint
- Nvidia's next-generation Rubin AI chip has entered full production and is set to launch in the second half of 2026, amid concerns about a potential "AI bubble" and the sustainability of large-scale AI infrastructure [1][3]

Group 1: Rubin AI Chip Details
- The Rubin GPU's inference performance is five times that of Blackwell, its training performance 3.5 times better, and inference token costs may fall by up to 10 times [2][11]
- The Rubin architecture packs 336 billion transistors and delivers 50 petaflops when processing NVFP4 data, versus Blackwell's maximum of 10 petaflops [2][11]
- Rubin's training speed has increased by 250%, reaching 35 petaflops, with part of the gain coming from an updated Transformer Engine module [2][3]

Group 2: Market and Strategic Positioning
- Nvidia CEO Jensen Huang emphasized that Rubin arrives just as AI training and inference demand is exploding, marking a significant step towards the next frontier in AI [3]
- The company anticipates that its Blackwell and Rubin chips will generate $500 billion in revenue by 2026, even without the Chinese or other Asian markets [5]
- Nvidia has formed partnerships with manufacturers and robotics companies, including BYD and Boston Dynamics, to expand AI applications in the physical world [5][6]

Group 3: Technical Specifications and Innovations
- Rubin will be the first GPU to integrate HBM4 memory, achieving a data transfer speed of 22 TB/s, significantly higher than Blackwell [3][10]
- Each Rubin GPU carries eight HBM4 memory stacks, providing 288 GB of capacity and 22 TB/s of bandwidth, essential for meeting the high computational demands of AI [7][12]
- NVLink 6 raises inter-GPU bandwidth to 3.6 TB/s, crucial for the efficiency of large language models [7][12]

Group 4: Future Developments and Ecosystem Readiness
- Nvidia plans to release the Vera Rubin NVL72 AI supercomputer, built from six types of chips including the Vera CPU and Rubin GPU, designed for optimal performance in AI data centers [6][9]
- The company is preparing its ecosystem for the Vera Rubin architecture, with cloud providers such as Microsoft Azure and CoreWeave among the first to offer Rubin-powered cloud computing services [3][4]
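The headline multiples quoted in both digests are internally consistent, as a quick arithmetic check shows. The PFLOPS figures are the articles' own; the per-stack HBM4 numbers below are simple divisions for illustration, not published specifications:

```python
# Performance ratios quoted in the digests (Blackwell's 10 PFLOPS NVFP4 baseline).
rubin_inference_pflops = 50
rubin_training_pflops = 35
blackwell_pflops = 10

print(rubin_inference_pflops / blackwell_pflops)  # 5.0 -> "five times Blackwell"
print(rubin_training_pflops / blackwell_pflops)   # 3.5 -> "3.5 times", i.e. +250%

# Per-stack figures implied by 8 HBM4 stacks, 288 GB, 22 TB/s per Rubin GPU.
stacks, total_gb, total_tbps = 8, 288, 22
print(total_gb / stacks)    # 36.0 GB of capacity per stack
print(total_tbps / stacks)  # 2.75 TB/s of bandwidth per stack
```

Note that the "250% increase" and "3.5 times" claims describe the same ratio, and that the quoted 3.6 TB/s NVLink 6 link rate is of the same order as a single HBM4 stack's bandwidth in this breakdown.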