Vera Rubin AI Acceleration Platform
Nvidia Unveils Rubin Chip: Compute Up Fivefold, a Trillion-Dollar Market
Xin Lang Ke Ji · 2026-03-16 22:23
Core Insights
- Nvidia officially launched the Vera Rubin AI acceleration platform at the GTC 2026 conference. The chip is built on TSMC's 3nm process with 336 billion transistors, a 60% increase over the previous Blackwell generation [2]
- Combined procurement orders for the Blackwell and Rubin architectures are expected to reach $1 trillion by 2027, double Nvidia's previous forecast [2]

Group 1: Vera Rubin Platform Details
- The Vera Rubin platform is a six-chip collaborative system: one Vera CPU and two Rubin GPUs, plus four additional chips, together form a complete AI-factory infrastructure [3]
- The Rubin GPU packs 336 billion transistors, 288GB of HBM4 memory, and 22TB/s of memory bandwidth, delivering 50 PFLOPS of inference performance and 35 PFLOPS of training performance, significantly surpassing Blackwell [3]

Group 2: Efficiency and Design Innovations
- The Vera Rubin platform cuts inference token costs by 90% compared to Blackwell and reduces the number of GPUs needed to train mixture-of-experts (MoE) models by 75% [5]
- The NVL72 rack is 100% liquid-cooled, and its modular design cuts installation time from two hours to five minutes [5]

Group 3: Future Developments
- The Rubin Ultra system, due in 2027, introduces the new Kyber rack architecture with 576 GPUs, 15 ExaFLOPS of inference performance, and 365TB of total memory capacity [6]
- Nvidia maintains a strict annual iteration cadence, with planned releases for Blackwell (2024), Blackwell Ultra (2025), Rubin (2026), Rubin Ultra (2027), and Feynman (2028) [6]

Group 4: Cloud Partnerships and Deployment
- The Vera Rubin platform has entered mass production, with initial deployments scheduled for late 2026 at major cloud providers including AWS, Google Cloud, Microsoft Azure, and Oracle Cloud [7]
- Microsoft plans to deploy the Vera Rubin NVL72 rack system in new AI data center projects, while CoreWeave will
integrate Rubin systems into its AI cloud platform starting in late 2026 [7]

Group 5: Strategic Vision and Expansion
- Nvidia's narrative at GTC emphasizes AI's transition from a tool to an "intelligent agent" paradigm, introducing the OpenClaw AI agent framework and the NemoClaw open-source project [8]
- The company is also advancing the Vera Rubin Space-1 initiative to build a data center in orbit, targeting computational power equivalent to 25 times that of the H100 [8]
- Nvidia announced the Nvidia Groq 3 language processing unit (LPU), following its $20 billion acquisition of AI chip startup Groq, positioning itself against AMD in the inference market [8]
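The generational claims above can be sanity-checked with simple arithmetic. A minimal sketch, using only figures stated in the article; the Blackwell transistor baseline is inferred from the "60% increase" claim, and the cluster size and dollar cost are hypothetical illustration values:

```python
# Sanity-check the article's generational claims with simple arithmetic.
# Figures are taken from the article; the baseline values below are inferred
# or hypothetical, not reported numbers.

rubin_transistors = 336e9           # Rubin GPU transistor count (article)
increase = 0.60                     # "60% increase over Blackwell" (article)
blackwell_transistors = rubin_transistors / (1 + increase)
print(f"Implied Blackwell transistor count: {blackwell_transistors / 1e9:.0f}B")

# "Reduces the number of GPUs needed for MoE training by 75%",
# illustrated on a hypothetical 1,000-GPU Blackwell cluster.
blackwell_gpus = 1000
rubin_gpus = blackwell_gpus * (1 - 0.75)
print(f"Rubin GPUs for the same MoE workload: {rubin_gpus:.0f}")

# "Reduces inference token costs by 90%", as a cost multiplier
# on a hypothetical $1.00 per million tokens.
blackwell_cost = 1.00
rubin_cost = blackwell_cost * (1 - 0.90)
print(f"Rubin token-cost multiplier: {rubin_cost:.2f}x")
```

The implied Blackwell count comes out to roughly 210 billion transistors, which is at least internally consistent with the article's "60% increase" framing.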