Core Insights

- The article discusses the evolution of AI data centers, highlighting a shift from merely increasing computational power to improving overall system efficiency, particularly with the introduction of NVIDIA's Rubin platform and BlueField-4 [1][18].

Group 1: Rubin Platform Overview

- The Rubin platform departs from traditional single-component upgrades, focusing instead on a system-level design that integrates multiple chips for greater efficiency [2].
- The Rubin GPU uses a dual-chip design with approximately 336 billion transistors and delivers up to 50 PFLOPS of NVFP4 compute, tailored for AI inference workloads [3].
- The Vera CPU, designed for system efficiency, incorporates 88 custom Olympus cores and supports NVIDIA Spatial Multithreading, allowing up to 176 concurrent threads [3].

Group 2: Connectivity and Performance

- The sixth-generation NVLink switch raises interconnect bandwidth to 3.6 TB/s per GPU, enabling 72 GPUs to work as one and significantly reducing the overhead of model partitioning and communication [4].
- The Rubin platform cuts AI inference token costs to about one-tenth of those on the previous Blackwell platform, while the GPU count required for MoE model training falls to approximately one-quarter [4]; see the back-of-envelope sketch after this summary.

Group 3: BlueField-4 and Infrastructure Upgrades

- BlueField-4 improves the utilization of compute by offloading storage and network management tasks from CPUs and GPUs, freeing those resources for actual computation [6][8].
- The integration of BlueField-4 with Spectrum-X and Spectrum-6 networks enables low-latency data transfer, which is crucial for real-time AI applications [10][11].
- The new architecture allows seamless data flow between computation, storage, and networking, marking a significant shift from traditional data center designs [11].

Group 4: Value Creation and Future Directions

- The collaboration between Rubin and BlueField-4 creates a complete value loop for AI-native data centers, optimizing the interaction between computation and memory management [14].
- The design is scalable, allowing thousands of GPUs to be integrated into a cohesive AI computing platform to meet future demand for larger AI applications [16].
- The article emphasizes that the true innovation lies not just in performance metrics but in a fundamental shift in how AI infrastructure is conceptualized and built [18][19].
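The core/thread counts, per-GPU bandwidth, and reduction factors quoted above lend themselves to quick arithmetic. The minimal Python sketch below recomputes the derived figures from the summary's own numbers; the dollar and fleet-size baselines are hypothetical placeholders used only to illustrate the quoted 10x and 4x ratios, not published prices or configurations.

```python
# Back-of-envelope arithmetic using only the figures quoted in the summary.

VERA_CORES = 88            # custom Olympus cores per Vera CPU
TOTAL_THREADS = 176        # concurrent threads with Spatial Multithreading
threads_per_core = TOTAL_THREADS // VERA_CORES
print(f"Spatial Multithreading implies {threads_per_core} threads per core")  # -> 2

NVLINK_TBPS_PER_GPU = 3.6  # sixth-generation NVLink, TB/s per GPU
GPUS_PER_DOMAIN = 72       # GPUs cooperating in one NVLink domain
# Assumption: the aggregate counts the per-GPU figure once per GPU.
aggregate_tbps = NVLINK_TBPS_PER_GPU * GPUS_PER_DOMAIN
print(f"Aggregate NVLink bandwidth ~ {aggregate_tbps:.1f} TB/s")  # -> 259.2

# Relative economics vs. Blackwell, applying the quoted reduction factors.
# Both baselines are hypothetical placeholders, not published figures.
blackwell_cost_per_m_tokens = 1.00                          # $ per 1M tokens (assumed)
rubin_cost_per_m_tokens = blackwell_cost_per_m_tokens / 10  # "about one-tenth"
blackwell_moe_training_gpus = 1000                          # fleet size (assumed)
rubin_moe_training_gpus = blackwell_moe_training_gpus / 4   # "approximately one-quarter"
print(f"Inference cost: ${rubin_cost_per_m_tokens:.2f} vs "
      f"${blackwell_cost_per_m_tokens:.2f} per 1M tokens")
print(f"MoE training fleet: {rubin_moe_training_gpus:.0f} vs "
      f"{blackwell_moe_training_gpus} GPUs")
```

Only the two-threads-per-core and aggregate-bandwidth values follow directly from the summary's numbers; the cost and fleet lines simply scale the assumed baselines by the stated factors.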
Source article: "2026, No Longer a Race for Raw Compute: NVIDIA Begins Rebuilding the Data Center"