NVIDIA's First Inference Chip, Suddenly Released

Core Viewpoint - NVIDIA has introduced the Rubin CPX GPU, designed for massive-context inference. It is intended to let AI systems process million-token workloads in software coding and video generation [8][9][10].

Group 1: Product Features
- The Rubin CPX GPU is integrated into the NVIDIA Vera Rubin NVL144 CPX platform, which combines NVIDIA Vera CPUs and Rubin GPUs and provides 8 exaFLOPS of AI compute [8].
- Rubin CPX has 128 GB of GDDR7 memory and delivers 30 PFLOPS of NVFP4 compute, with three times the attention-related exponent compute of the GB300 [4][11].
- Within a single rack, the platform provides 1.7 PB/s of memory bandwidth and 100 TB of fast memory [8][11].

Group 2: Market Positioning
- NVIDIA aims to use its technology to build large monolithic GDDR-based GPUs, a space where competitors such as AMD and Intel have made less progress [7].
- Rubin CPX is expected to turn AI coding assistants into more capable systems that can understand and optimize large software projects, addressing the limitations of traditional GPU computing on long-context workloads [10][11].

Group 3: Future Outlook
- The Rubin NVL144 CPX rack design incorporates a number of forward-looking technologies, with launch targeted for the end of 2026 [7].
- The platform is designed to support growing demand for AI applications, particularly video processing and long-context inference, which are becoming increasingly demanding workloads [4][10].
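
For a rough sense of scale, the rack-level figures quoted above (8 exaFLOPS, 1.7 PB/s, 100 TB) can be combined with simple arithmetic. The sketch below is only a back-of-envelope reading, not a performance model. It assumes decimal SI prefixes and that the peak compute and bandwidth figures can be compared directly, neither of which the summary states. The function names are illustrative.

```python
# Back-of-envelope ratios from the rack-level figures quoted in the summary.
# Assumptions (not stated in the source): decimal SI prefixes, and peak
# figures that can be meaningfully divided by one another.

RACK_COMPUTE_FLOPS = 8e18         # 8 exaFLOPS of AI compute
RACK_BANDWIDTH_BYTES_S = 1.7e15   # 1.7 PB/s of memory bandwidth
RACK_FAST_MEMORY_BYTES = 100e12   # 100 TB of fast memory


def flops_per_byte(compute_flops: float, bandwidth_bytes_s: float) -> float:
    """Peak operations available per byte moved from memory."""
    return compute_flops / bandwidth_bytes_s


def memory_sweep_seconds(memory_bytes: float, bandwidth_bytes_s: float) -> float:
    """Time to read the whole fast-memory pool once at peak bandwidth."""
    return memory_bytes / bandwidth_bytes_s


if __name__ == "__main__":
    ratio = flops_per_byte(RACK_COMPUTE_FLOPS, RACK_BANDWIDTH_BYTES_S)
    sweep = memory_sweep_seconds(RACK_FAST_MEMORY_BYTES, RACK_BANDWIDTH_BYTES_S)
    print(f"Peak FLOPs per byte of bandwidth: {ratio:.0f}")
    print(f"Time to sweep fast memory once:   {sweep * 1000:.1f} ms")
```

Under these assumptions the rack offers several thousand peak operations per byte of memory traffic, and reading all 100 TB once takes on the order of tens of milliseconds. Real workloads will not reach either peak.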