Research Report | NVIDIA's Diverse Product Lines Target AI Training and Inference Demand, Countering the Scale-Up of CSPs' In-House ASICs

Core Insights

- NVIDIA is shifting its focus from cloud AI training to AI inference applications, emphasizing a diverse product line spanning GPU, CPU, and LPU to meet both AI training and inference demands [2]
- The market share of ASIC AI servers is projected to rise from 27.8% in 2026 to nearly 40% by 2030, driven by major cloud service providers such as Google and Amazon expanding their in-house chip development [2]

Group 1: NVIDIA's Strategies

- To solidify its leadership in the AI market, NVIDIA is promoting integrated solutions such as GB300 and VR200, which combine CPU and GPU for scalable AI inference applications [5]
- The Vera Rubin system, introduced at GTC, is a highly integrated system featuring seven chips and five cabinets, expected to strengthen NVIDIA's product offerings [5]
- The GB300 has already replaced the GB200 as the main product in Q4 2025, with an expected shipment share of nearly 80% by 2026 [6]

Group 2: Technological Developments

- NVIDIA is addressing latency and memory-bandwidth bottlenecks in AI inference by introducing the Groq3 LPU, designed for low-latency inference with 500MB of SRAM per chip and up to 128GB per cabinet [6][7]
- The "Disaggregated Inference" architecture proposed by NVIDIA aims to split AI workloads across hardware: Vera Rubin handles heavy computation, while a larger-memory LPU cabinet handles latency-sensitive tasks [7]
- The third-generation Groq LPU, manufactured by Samsung, has entered mass production and is expected to ship in the second half of 2026, with a more efficient LP40 chip planned for the next generation [7]