Core Viewpoint
- NVIDIA has launched Rubin CPX, a new GPU designed for large-scale context inference, aimed at improving AI performance and efficiency on workloads that span millions of tokens [1][10].

Performance and Capabilities
- The Vera Rubin NVL144 CPX platform, which integrates Rubin CPX, offers more than 2 times the performance of the Vera Rubin NVL144 platform and 7.5 times that of the Blackwell Ultra-based GB300 NVL72 system [3][4].
- A single rack provides 8 EFLOPS of NVFP4 compute, 100 TB of high-speed memory, and 1.7 PB/s of memory bandwidth; each Rubin CPX GPU carries 128 GB of cost-effective GDDR7 memory [3][31].
- The GPU delivers 3 times the attention-processing capability of the NVIDIA GB300 NVL72 system [4][34].

Economic Impact
- For every $100 million invested, Rubin CPX can generate up to $5 billion in token revenue, a potential return of 30 to 50 times the investment [6][25].
- Rubin CPX is expected to reshape inference economics by optimizing resource utilization and reducing latency through decoupled inference, in which the compute-heavy context phase and the bandwidth-heavy generation phase run on separate hardware [14][24].

Architectural Innovations
- Rubin CPX is built on the Rubin architecture and is the first CUDA GPU designed specifically for massive-context AI, able to reason across millions of tokens of knowledge at once [8][12].
- The platform supports multi-step reasoning, persistent memory, and long-term context, which suits it to complex tasks in software development, video generation, and deep research [10][23].

Infrastructure and Ecosystem
- The NVIDIA Vera Rubin NVL144 CPX platform integrates Rubin CPX with NVIDIA Vera CPUs and Rubin GPUs, forming a complete high-performance decoupled serving solution for long-context scenarios [23][30].
- The platform is designed for scalable deployment and can be configured with various networking technologies, including NVIDIA Quantum-X800 InfiniBand and Spectrum-X Ethernet [35].
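The headline figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical. It assumes that the quoted "return" is a gross revenue multiple (revenue divided by investment) and that the 7.5 times comparison with GB300 NVL72 is measured on the same basis as the 8 EFLOPS figure. Neither assumption is confirmed by the source.

```python
def revenue_multiple(investment_usd: float, token_revenue_usd: float) -> float:
    """Gross revenue per dollar invested (assumed meaning of the quoted return)."""
    return token_revenue_usd / investment_usd


# Figures quoted in the summary.
investment_usd = 100_000_000          # $100 million invested
max_token_revenue_usd = 5_000_000_000  # up to $5 billion in token revenue

# Upper end of the quoted range: $5 billion / $100 million.
upper_multiple = revenue_multiple(investment_usd, max_token_revenue_usd)

# Token revenue implied by the lower end of the quoted 30 to 50 times range.
lower_multiple = 30
implied_min_revenue_usd = lower_multiple * investment_usd

# Baseline implied by "8 EFLOPS is 7.5 times GB300 NVL72"
# (assumption: both numbers use the same precision and workload).
platform_eflops = 8.0
speedup_vs_gb300 = 7.5
implied_gb300_eflops = platform_eflops / speedup_vs_gb300

print(f"Upper revenue multiple: {upper_multiple:.0f}x")
print(f"Revenue implied at 30x: ${implied_min_revenue_usd:,}")
print(f"Implied GB300 NVL72 baseline: {implied_gb300_eflops:.2f} EFLOPS")
```

Under these assumptions, the $5 billion figure corresponds to the top of the 30 to 50 times range, and the lower end implies roughly $3 billion in token revenue per $100 million invested. A strict net ROI, which subtracts the investment first, would be slightly lower than the gross multiple.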
Future Prospects
- Rubin CPX is expected to be available by the end of 2026 and will unlock powerful capabilities for developers and creators, redefining the possibilities for building next-generation generative AI applications [37][38].
Just in: NVIDIA unveils its next-generation GPU, a million-token behemoth that turns a $100 million investment into $5 billion