NVIDIA Unveils Rubin CPX: A New Class of GPU Designed for Massive-Context Inference

Core Insights
- NVIDIA announced the launch of the NVIDIA Rubin CPX, a new GPU designed for massive-context processing, enabling AI systems to handle million-token software coding and generative video with exceptional speed and efficiency [1][3][17]

Product Features
- The Rubin CPX GPU is integrated with NVIDIA Vera CPUs and Rubin GPUs in the Vera Rubin NVL144 CPX platform, delivering 8 exaflops of AI compute, 100TB of fast memory, and 1.7 petabytes per second of memory bandwidth in a single rack, providing 7.5 times more AI performance than NVIDIA GB300 NVL72 systems [2][17]
- It features a monolithic die design with NVFP4 computing resources, offering up to 30 petaflops of compute at NVFP4 precision and 128GB of GDDR7 memory, improving performance and energy efficiency for AI inference tasks [5][6]
- The Rubin CPX GPU supports long-context processing, allowing AI models to handle up to 1 million tokens for video content, and integrates video decoders and encoders for long-format applications [4][6]

Market Impact
- Companies can achieve $5 billion in token revenue for every $100 million invested using the Vera Rubin NVL144 CPX platform, indicating significant monetization potential [7][17] (see the back-of-the-envelope sketch at the end of this brief)
- AI innovators such as Cursor and Runway are exploring the capabilities of Rubin CPX to enhance developer productivity and accelerate content creation, respectively [8][9][10]

Software and Ecosystem Support
- The NVIDIA Rubin CPX will be supported by the complete NVIDIA AI stack, including the NVIDIA Dynamo platform for efficient AI inference scaling and the latest in the NVIDIA Nemotron family of multimodal models for enterprise-ready AI agents [11][12]
- The Rubin platform extends NVIDIA's developer ecosystem, which includes over 6 million developers and nearly 6,000 CUDA applications [13]

Availability
- The NVIDIA Rubin CPX is expected to be available by the end of 2026 [18]
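As a rough sanity check on the monetization figure cited under Market Impact, the following is a minimal back-of-the-envelope sketch. The only inputs taken from the announcement are the $100 million investment and $5 billion token-revenue figures; the variable names and the per-token price are illustrative assumptions, not NVIDIA numbers.

```python
# Back-of-the-envelope check on the claimed monetization of the
# Vera Rubin NVL144 CPX platform. Only the two dollar figures come
# from the announcement; everything else is an illustrative assumption.

investment_usd = 100e6        # $100 million invested, per the claim
token_revenue_usd = 5e9       # $5 billion in token revenue, per the claim

# Implied revenue multiple on the stated investment.
revenue_multiple = token_revenue_usd / investment_usd
print(f"Implied revenue multiple: {revenue_multiple:.0f}x")  # -> 50x

# With a hypothetical price per million served tokens, the same revenue
# figure translates into a total token volume (assumption only).
price_per_million_tokens_usd = 10.0
tokens_served = token_revenue_usd / price_per_million_tokens_usd * 1e6
print(f"Tokens implied at that price: {tokens_served:.1e}")  # ~5.0e14 tokens
```

The point of the sketch is simply that the headline claim amounts to a 50x revenue-to-investment ratio; the token-volume line shows how that revenue maps to serving scale under one assumed price point.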