GTC 2026 Outlook: How NVIDIA Is Redefining AI Infrastructure with LPX, CPO, and Rubin
Nvidia (US:NVDA) · 2026-03-01 17:23

Summary of NVIDIA GTC 2026 Outlook

Industry Overview
- The document focuses on NVIDIA and its advancements in AI infrastructure, particularly in the context of generative AI and large language models, which are driving a redesign of data-center computing architectures [4][6].

Key Points and Arguments

Innovations in AI Infrastructure
- NVIDIA introduced the Blackwell GB200 NVL72, which allows a single rack to house 72 GPUs and 36 Grace CPUs, achieving 400 Gb/s networking through NVLink 6 and Quantum X800 InfiniBand [4].
- The company is expected to unveil new technologies at GTC 2026, including LPX inference racks, CPX and NVL144, and Rubin Ultra NVL576, featuring significant advancements in PCB materials, cooling, and assembly processes [7].

LPX Inference Racks
- LPX is designed for inference workloads, leveraging Groq's LPU technology to eliminate bandwidth bottlenecks by integrating large amounts of memory on-chip [11].
- The architecture allows for deterministic scheduling and near-linear scaling across multiple LPUs, making it suitable for large language models and Mixture-of-Experts models [15][16].

Performance Enhancements
- NVIDIA plans to scale LPX from 64 to 256 LPUs per rack, aiming for a fourfold increase in performance, with the ability to generate 10,000 "thought tokens" in approximately two seconds [17][16].
- The introduction of M9 Q-glass PCBs will support the dense integration of LPUs, enhancing performance and efficiency [18].

Rubin Platform Advancements
- The Rubin GPU, fabricated on a 4 nm process, integrates 336 billion transistors and supports 288 GB of HBM4 memory, achieving up to 50 PFLOPS of inference performance [30][32].
- The Rubin NVL72 rack integrates 72 Rubin GPUs and 36 Vera CPUs, delivering significant improvements in inference and training performance compared to previous models [37][38].
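The LPX scaling claims summarized above (64 → 256 LPUs per rack, a roughly fourfold increase, and 10,000 "thought tokens" in about two seconds) can be sanity-checked with simple arithmetic. A minimal sketch; all figures come from the summary, and perfectly linear scaling is an assumption:

```python
# Sanity-check the LPX scaling figures quoted in the summary.
# Figures are from the document; linear scaling is an assumption.

BASE_LPUS = 64
SCALED_LPUS = 256

# Claimed: 10,000 "thought tokens" in roughly 2 seconds.
tokens = 10_000
seconds = 2.0
rack_tput = tokens / seconds            # tokens/s for the 256-LPU rack

scale_factor = SCALED_LPUS / BASE_LPUS  # matches the claimed fourfold gain
per_lpu_tput = rack_tput / SCALED_LPUS  # implied average per-LPU throughput

print(f"rack throughput: {rack_tput:.0f} tokens/s")     # 5000 tokens/s
print(f"scale factor:    {scale_factor:.0f}x")          # 4x
print(f"per-LPU:         {per_lpu_tput:.1f} tokens/s")  # 19.5 tokens/s
```

Under these assumptions the claimed fourfold gain is consistent with pure LPU-count scaling, which is what the near-linear-scaling claim in the summary would predict.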
CPX and NVL144 for Long-Context Inference
- The CPX GPU, a variant of the Rubin architecture, will utilize GDDR7 memory to support long-context inference workloads, achieving 3× higher performance than the previous generation [50][51].
- The NVL144 CPX rack will integrate 144 Rubin and CPX GPUs, delivering 8 EFLOPS of compute and 1.7 PB/s of bandwidth, with a modular design that simplifies assembly [52][54].

Future Networking Solutions
- NVIDIA is set to introduce Spectrum X Photonics and Quantum X CPO switches, which will enhance data-center networking capabilities with significant bandwidth improvements [66][87].
- The CPO architecture aims to reduce power consumption and improve signal integrity, fundamentally reshaping data-center networking [65][71].

Additional Important Insights
- The document highlights the importance of energy efficiency and sustainability, noting that even with these advancements, the NVL576 rack will require significant power-management solutions [96].
- The evolution of the software ecosystem is crucial, as new compilers and memory-management strategies will be necessary to maximize the efficiency of the new hardware [97].
- Supply-chain security is a concern, particularly regarding advanced materials and photonic components, which are sensitive to geopolitical factors [99].
- NVIDIA faces competition from other companies developing similar technologies, which may pressure the company to continue innovating rapidly [100][101].

Conclusion
- NVIDIA's GTC 2026 is positioned to redefine AI infrastructure through innovations in inference and training technologies, emphasizing the integration of optics, materials, and system design [105]. The advancements presented will have significant implications for the industry, necessitating collaboration and rapid iteration among technology providers [102][103][104].
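The NVL144 CPX rack aggregates quoted in the summary (144 GPUs, 8 EFLOPS of compute, 1.7 PB/s of bandwidth) imply per-GPU averages that can be derived by simple division. A hedged sketch; treating the mixed Rubin/CPX population as uniform is an assumption, since the document does not break the figures down per GPU type:

```python
# Derive per-GPU averages from the NVL144 CPX rack aggregates in the
# summary. Uniform division across a mixed Rubin/CPX population is an
# assumption; the per-type split is not given in the document.

GPUS = 144
rack_eflops = 8.0   # rack compute, EFLOPS (10^18 FLOP/s)
rack_pb_s = 1.7     # rack bandwidth, PB/s (10^15 B/s)

pflops_per_gpu = rack_eflops * 1000 / GPUS  # EFLOPS -> PFLOPS
tb_s_per_gpu = rack_pb_s * 1000 / GPUS      # PB/s -> TB/s

print(f"avg compute:   {pflops_per_gpu:.1f} PFLOPS per GPU")  # 55.6
print(f"avg bandwidth: {tb_s_per_gpu:.1f} TB/s per GPU")      # 11.8
```

The implied ~55.6 PFLOPS average sits plausibly between the 50 PFLOPS quoted earlier for a single Rubin GPU and whatever the CPX variant contributes, which is a useful consistency check on the rack-level figures.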