NVIDIA Rubin GPU
NVIDIA Kicks Off the Next Generation of AI With Rubin — Six New Chips, One Incredible AI Supercomputer
Globenewswire · 2026-01-05 22:20
Core Insights
- NVIDIA launched the Rubin platform, featuring six new chips that together form a powerful AI supercomputer, setting a new standard for AI system deployment and security at lower cost and helping bring AI into the mainstream [2][4]
- The platform applies extreme co-design across its components, including the NVIDIA Vera CPU and NVIDIA Rubin GPU, to significantly reduce training time and inference costs [3][5]

Innovations and Features
- Rubin introduces five key innovations: advanced NVLink interconnect technology, the Transformer Engine, Confidential Computing, the RAS Engine, and the NVIDIA Vera CPU, enabling up to 10x lower inference costs and 4x fewer GPUs for training than the previous Blackwell platform [5][10]
- The platform supports massive-scale mixture-of-experts (MoE) model inference, expanding AI capabilities while reducing operational costs [5][10]

Ecosystem and Partnerships
- Major AI labs and cloud service providers, including AWS, Google, Microsoft, and OpenAI, are expected to adopt the Rubin platform, indicating broad ecosystem support [6][26]
- Collaborations with companies such as CoreWeave, Dell, and HPE aim to integrate Rubin into their AI infrastructures, improving performance and scalability [9][25]

Performance and Efficiency
- The NVIDIA Vera Rubin NVL72 rack delivers 260 TB/s of scale-up bandwidth, which NVIDIA says exceeds the traffic of the entire internet [10][20]
- Advanced Ethernet networking through Spectrum-6 technology improves data center efficiency, achieving 5x better power efficiency and reliability than traditional approaches [19][18]

Market Readiness
- The Rubin platform is in full production, with products expected from partners in the second half of 2026, marking a significant step in the evolution of AI infrastructure [22][24]
- Companies including Microsoft and AWS are set to deploy Rubin-based instances, further solidifying its role in next-generation AI capabilities [23][22]
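The 260 TB/s rack-level figure quoted above can be reconciled with the per-GPU NVLink bandwidth by simple arithmetic. A minimal sketch, assuming the rack number is just the per-GPU scale-up bandwidth (3.6 TB/s, per the accompanying earnings presentation) summed over the 72 GPUs in an NVL72 rack:

```python
# Reconcile the quoted 260 TB/s rack-level scale-up bandwidth with the
# per-GPU NVLink figure. Assumption (not stated explicitly in the
# article): the rack figure is the per-GPU bandwidth times the GPU count.
GPUS_PER_RACK = 72          # NVL72 rack
NVLINK_TB_S_PER_GPU = 3.6   # per-GPU NVLink bandwidth, TB/s

aggregate_tb_s = GPUS_PER_RACK * NVLINK_TB_S_PER_GPU
print(f"Aggregate scale-up bandwidth: {aggregate_tb_s:.1f} TB/s")
```

This yields 259.2 TB/s, which rounds to the 260 TB/s headline number, so the two figures are consistent under that assumption.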
NVIDIA (NasdaqGS:NVDA) 2026 Earnings Call Presentation
2026-01-05 21:00
Open Model Ecosystem
- NVIDIA leads the open model ecosystem [14, 100]
- 80% of startups are building on open models [10]
- 1 in 4 OpenRouter tokens is generated by open models [10]

AI Performance and Benchmarks
- NVIDIA's Llama Nemotron Nano VL 8B achieves 70.2% in Text Recognition, 69.1% in Text Referring, 61.8% in Text Spotting, 81.4% in Relation Extraction, 39.2% in Element Parsing, 31.9% in Mathematical Calculation, and 73.1% in Visual Understanding [20]
- nvidia/canary-qwen-2.5b achieves an average WER of 5.63 [26]

New NVIDIA Technologies
- NVIDIA announces Alpamayo, an open reasoning VLA for autonomous vehicles [61, 65]
- NVIDIA ships a full-stack AV system on the 2025 Mercedes-Benz CLA [68]
- NVIDIA Vera CPU: 88 custom Olympus cores, 176 threads, 1.8 TB/s NVLink-C2C, 1.5 TB system memory, 1.2 TB/s LPDDR5X, 227 billion transistors [120]
- NVIDIA Rubin GPU: 50 PFLOPS NVFP4 inference (5x Blackwell), 35 PFLOPS NVFP4 training (3.5x), 22 TB/s HBM4 bandwidth (2.8x), 3.6 TB/s NVLink bandwidth per GPU (2x), 336 billion transistors (1.6x) [122]
- NVIDIA ConnectX-9 Spectrum-X SuperNIC: 800 Gb/s Ethernet, programmable RDMA, line-speed encryption, 23 billion transistors [125]
- NVIDIA BlueField-4: 800 Gb/s DPU, 64-core Grace CPU, 6x compute, 2x networking, 3x memory bandwidth, 126 billion transistors [127]
- NVIDIA NVLink 6 Switch scales up the fabric with 3.6 TB/s per-GPU bandwidth and 108 billion transistors [131]
- NVIDIA Vera Rubin NVL72: 3.6 EFLOPS NVFP4 inference (5x Blackwell), 2.5 EFLOPS NVFP4 training (3.5x), 54 TB LPDDR5X capacity (3x), 20.7 TB HBM capacity (1.5x), 1.6 PB/s HBM4 bandwidth (2.8x), 260 TB/s scale-up bandwidth (2x), 220 trillion transistors (1.7x) [134]
- NVIDIA Spectrum-X Ethernet Co-Packaged Optics scales to 102.4 Tb/s with 200G silicon photonics and 352 billion transistors [136]
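The per-GPU Rubin figures each carry a quoted "x Blackwell" multiplier, so dividing a Rubin number by its multiplier gives the implied prior-generation baseline. A minimal Python sketch using only the figures stated in the presentation (the derived baselines are arithmetic implications of the slide numbers, not independently sourced Blackwell specs):

```python
# Sanity-check the per-GPU Rubin figures against their quoted
# "x Blackwell" multipliers. Values are (Rubin figure, claimed
# multiplier); dividing gives the implied Blackwell-generation
# baseline. These are arithmetic checks only, not official specs.
rubin_specs = {
    "NVFP4 inference (PFLOPS)":        (50.0, 5.0),
    "NVFP4 training (PFLOPS)":         (35.0, 3.5),
    "HBM4 bandwidth (TB/s)":           (22.0, 2.8),
    "NVLink bandwidth per GPU (TB/s)": (3.6, 2.0),
}

def implied_baseline(rubin_value: float, multiplier: float) -> float:
    """Baseline figure implied by a Rubin value and its multiplier."""
    return rubin_value / multiplier

for name, (value, mult) in rubin_specs.items():
    base = implied_baseline(value, mult)
    print(f"{name}: Rubin {value} -> implied baseline {base:.2f}")
```

Both the inference and training lines imply a 10 PFLOPS-class baseline, which is at least internally consistent across the two multipliers.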
Counterpoint: Strong demand makes TSMC's (TSM.US) 3nm process its fastest-ever node to reach full utilization
Zhitong Finance · 2025-05-15 12:39
Group 1
- TSMC has solidified its leading position in the global foundry market after the inventory adjustments of late 2022, with high utilization rates across its advanced process nodes [1]
- The 3nm process reached full capacity utilization in its fifth quarter of mass production, driven by strong demand for Apple A17 Pro/A18 Pro chips and other application processors, setting a record for initial market demand [1]
- Future growth is expected to continue with the introduction of NVIDIA Rubin GPUs and specialized AI chips from Google and AWS, driven by rising demand in AI and high-performance computing (HPC) applications [1]

Group 2
- By contrast, existing nodes such as 7/6nm and 5/4nm saw slower initial capacity growth in the smartphone market, with 5/4nm rebounding in 2023 on surging demand for AI acceleration chips [2]
- Demand for AI computing chips is accelerating the construction of AI data centers and significantly boosting overall 5/4nm capacity [2]

Group 3
- The 2nm process is projected to reach full capacity utilization in its fourth quarter of mass production, driven by dual demand from smartphones and AI applications, in line with TSMC's strategic outlook [5]
- Potential 2nm customers include Qualcomm, MediaTek, Intel, and AMD, which is expected to keep 2nm utilization rates high [5]

Group 4
- TSMC is investing $165 billion in its Arizona facility to meet growing U.S. demand and mitigate geopolitical risk, with the site covering 4nm, 3nm, and 2nm processes [11]
- The dual-site strategy strengthens TSMC's geopolitical resilience and ensures capacity keeps pace with customer demand, particularly in AI and HPC, while sustaining high utilization of advanced processes beyond 2030 [11]
TSMC advanced process capacity utilization remains strong
Counterpoint Research · 2025-05-15 09:50
Core Viewpoint
- TSMC has solidified its leading position in the global foundry market following the inventory adjustments of late 2022, with high utilization rates in advanced process nodes showcasing its technological superiority [1][4].

Group 1: Advanced Process Utilization
- The 3nm node achieved full utilization within five quarters of mass production, driven by strong demand for Apple A17 Pro/A18 Pro chips and other application processors, setting a record for initial market demand at an advanced node [1].
- TSMC's 5/4nm process is seeing resurgent demand, particularly from the surge in AI accelerator chips such as NVIDIA's H100 and B100, which has significantly boosted overall capacity [2][4].
- Advanced process utilization rates are projected to remain high, with the 2nm process expected to reach full capacity within four quarters of mass production on dual demand from smartphones and AI applications [7].

Group 2: Future Developments and Investments
- TSMC plans to allocate 30% of its 2nm process capacity to its Arizona facility, enhancing geopolitical resilience while ensuring capacity meets customer demand, especially in AI and high-performance computing [9].
- A diverse 2nm customer base, including major players such as Qualcomm, MediaTek, Intel, and AMD, is expected to keep utilization rates high [7].
- TSMC's $165 billion investment in its Arizona plant will support advanced process technologies, including 4nm, 3nm, and 2nm, keeping the company at the forefront of the semiconductor industry [9].