NVIDIA Vera Rubin NVL144 CPX Platform
Just now, NVIDIA unveiled its next-generation GPU! A million-token beast: invest $100 million, earn $5 billion
猿大侠· 2025-09-11 04:11
Core Viewpoint
- NVIDIA has launched Rubin CPX, a new GPU designed for large-scale context inference that improves AI performance and efficiency when handling millions of tokens [1][10].

Performance and Capabilities
- The Vera Rubin NVL144 CPX platform offers over 2 times the performance of the Vera Rubin NVL144 platform and 7.5 times the AI performance of the Blackwell Ultra-based GB300 NVL72 system [3][4].
- In a single rack, the platform provides 8 EFLOPS of NVFP4 compute, 100TB of high-speed memory, and 1.7 PB/s of memory bandwidth. Each Rubin CPX GPU carries 128GB of cost-effective GDDR7 memory [3][31].
- Rubin CPX provides 3 times the attention processing capability of the NVIDIA GB300 NVL72 system [4][34].

Economic Impact
- For every $100 million invested, Rubin CPX can generate up to $5 billion in token revenue, a potential return of 30 to 50 times the investment [6][25].
- Rubin CPX is expected to redefine inference economics: decoupled (disaggregated) inference improves resource utilization and reduces latency [14][24].

Architectural Innovations
- Rubin CPX is built on the Rubin architecture and is the first CUDA GPU designed specifically for massive-context AI, capable of reasoning across millions of knowledge tokens at once [8][12].
- The platform supports multi-step reasoning, persistent memory, and long-term context, making it suitable for complex tasks in software development, video generation, and deep research [10][23].

Infrastructure and Ecosystem
- The NVIDIA Vera Rubin NVL144 CPX platform integrates Rubin CPX with the NVIDIA Vera CPU and Rubin GPU, forming a complete high-performance decoupled serving solution for long-context workloads [23][30].
- The platform is designed for scalable deployment and can be configured with various networking technologies, including NVIDIA Quantum-X800 InfiniBand and Spectrum-X Ethernet [35].

Future Prospects
- Rubin CPX is expected to be available by the end of 2026. It is positioned to give developers and creators new capabilities for building next-generation generative AI applications [37][38].
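The economic claim in the summary is simple arithmetic, and a short sketch makes the relationship between the figures explicit. The snippet below uses only the two dollar amounts and the 30-to-50-times range quoted above; the function names are hypothetical, and it treats "return" as gross token revenue divided by investment, which is an assumption, since the summary does not say how the return is defined.

```python
# Figures quoted in the summary; function names are illustrative only.
INVESTMENT_USD = 100_000_000        # "$100 million invested"
MAX_TOKEN_REVENUE_USD = 5_000_000_000  # "up to $5 billion in token revenue"


def revenue_multiple(revenue_usd: float, investment_usd: float) -> float:
    """Gross revenue per dollar invested (assumed definition of the 'return')."""
    return revenue_usd / investment_usd


def revenue_for_multiple(multiple: float, investment_usd: float) -> float:
    """Revenue implied by a given multiple on the same investment."""
    return multiple * investment_usd


upper_multiple = revenue_multiple(MAX_TOKEN_REVENUE_USD, INVESTMENT_USD)
lower_revenue = revenue_for_multiple(30, INVESTMENT_USD)

print(f"Upper bound: {upper_multiple:.0f}x on ${INVESTMENT_USD:,}")
print(f"A 30x return would correspond to ${lower_revenue:,.0f} in revenue")
```

Under this reading, the $5 billion figure corresponds to the top of the quoted range (50 times), while the bottom of the range (30 times) would correspond to $3 billion in token revenue on the same $100 million.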
NVIDIA's next-generation GPU arrives: Rubin CPX handles millions of tokens in a single inference pass. Netizens: "This is a beast"
机器之心· 2025-09-10 08:14
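Report by 机器之心, 机器之心 Editorial Department

At the AI Infrastructure Summit on Tuesday, NVIDIA announced a new GPU called Rubin CPX (Rubin Context GPUs), designed for long-context inference exceeding 1 million tokens.

For users, this means better performance on long-context tasks such as software development and video generation.

In software development, for example, an AI system must be able to reason over an entire codebase and understand repository-level code structure in order to help developers effectively. Likewise, long-form video and research applications require sustained coherence and memory across millions of tokens.

With the release of Rubin CPX, these problems can now be addressed.

The new GPU (Rubin CPX) will be paired with the NVIDIA Vera CPU and Rubin GPU to form the new NVIDIA Vera Rubin NVL144 CPX platform. This integrated NVIDIA MGX system delivers 8 exaflops of AI compute in a single rack, 7.5 times the AI performance of the NVIDIA GB300 NVL72 system, along with 100TB of high-speed memory and 1.7 PB/s (petabytes per second) of memory bandwidth.

In addition, NVIDIA will also provide, for existing V ...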