First HBM4 GPU Enters Full Production

Core Viewpoint
- Nvidia's next-generation Rubin AI chip has entered full production and is set to launch in the second half of 2026, amid concerns about a potential "AI bubble" and the sustainability of large-scale AI infrastructure [1][3]

Group 1: Rubin AI Chip Details
- The Rubin GPU's inference performance is five times that of Blackwell and its training performance is 3.5 times that of Blackwell, with inference token costs potentially reduced by up to 10 times [2][11]
- The Rubin GPU has 336 billion transistors and can deliver 50 petaflops when processing NVFP4 data, compared to Blackwell's maximum of 10 petaflops [2][11]
- Rubin's training throughput is 250% higher than Blackwell's, reaching 35 petaflops, with part of that computational power coming from an updated Transformer Engine module [2][3]

Group 2: Market and Strategic Positioning
- Nvidia CEO Jensen Huang said Rubin's launch is well timed given the explosive growth in AI training and inference demand, and called it a significant step towards the next frontier in AI [3]
- The company anticipates that its advanced Blackwell and Rubin chips will generate $500 billion in revenue by 2026, even without the Chinese or other Asian markets [5]
- Nvidia has formed partnerships with several manufacturers and robotics companies, including BYD and Boston Dynamics, to expand AI applications in the physical world [5][6]

Group 3: Technical Specifications and Innovations
- Rubin will be the first GPU to integrate HBM4 memory chips, achieving a memory bandwidth of 22 TB/s, significantly higher than Blackwell's [3][10]
- Each Rubin GPU is equipped with eight HBM4 memory stacks, providing 288 GB of capacity and 22 TB/s of bandwidth to keep up with the high computational demands of AI workloads [7][12]
- NVLink 6 raises inter-GPU communication bandwidth to 3.6 TB/s, which is crucial for running large language models efficiently across many GPUs [7][12]

Group 4: Future Developments and Ecosystem Readiness
- Nvidia plans to release the Vera Rubin NVL72 AI supercomputer, which will consist of six types of chips, including the Vera CPU and Rubin GPU, designed for optimal performance in AI data centers [6][9]
- The company is preparing its ecosystem for the adoption of the Vera Rubin architecture, with cloud service providers such as Microsoft Azure and CoreWeave set to be among the first to offer cloud computing services powered by Rubin [3][4]
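
The performance and memory figures quoted in Groups 1 and 3 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, the inputs are the numbers reported in this summary, and the per-stack values assume capacity and bandwidth are split evenly across the eight HBM4 stacks, which the source does not state.

```python
# Figures as quoted in the summary (petaflops when processing NVFP4 data).
BLACKWELL_PFLOPS = 10.0
RUBIN_INFERENCE_PFLOPS = 50.0
RUBIN_TRAINING_PFLOPS = 35.0

# HBM4 memory figures as quoted in the summary.
HBM4_STACKS = 8
HBM4_CAPACITY_GB = 288.0
HBM4_BANDWIDTH_TBPS = 22.0


def speedup(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


def percent_increase(new: float, old: float) -> float:
    """Return the increase from `old` to `new` as a percentage of `old`."""
    return (new / old - 1.0) * 100.0


inference_speedup = speedup(RUBIN_INFERENCE_PFLOPS, BLACKWELL_PFLOPS)
training_speedup = speedup(RUBIN_TRAINING_PFLOPS, BLACKWELL_PFLOPS)
training_increase_pct = percent_increase(RUBIN_TRAINING_PFLOPS, BLACKWELL_PFLOPS)

# Assumption: capacity and bandwidth are divided evenly across the stacks.
capacity_per_stack_gb = HBM4_CAPACITY_GB / HBM4_STACKS
bandwidth_per_stack_tbps = HBM4_BANDWIDTH_TBPS / HBM4_STACKS

print(f"Inference speedup vs Blackwell: {inference_speedup:.1f}x")
print(f"Training speedup vs Blackwell:  {training_speedup:.1f}x "
      f"(+{training_increase_pct:.0f}%)")
print(f"Per-stack capacity (even split):  {capacity_per_stack_gb:.0f} GB")
print(f"Per-stack bandwidth (even split): {bandwidth_per_stack_tbps:.2f} TB/s")
```

This shows that the "3.5 times" and "increased by 250%" statements describe the same training figure, since a 3.5x multiple is a 250% increase over the baseline.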