Disaggregated Inference
AI Storage Is Booming Again
半导体行业观察· 2025-10-02 01:18
Core Viewpoint
- The rapid development of AI has made storage a critical component of AI infrastructure, alongside computing power. Storage demand is surging as large models and generative AI drive up data volumes and inference workloads. Three storage technologies, HBM, HBF, and GDDR7, are redefining the future landscape of AI infrastructure [1].

Group 1: HBM (High Bandwidth Memory)
- HBM has evolved from a high-performance AI chip component into a strategic focal point of the storage industry, directly shaping the performance ceiling of AI chips. In less than three years, HBM has more than doubled in capacity and increased bandwidth roughly 2.5-fold [3].
- SK Hynix leads the HBM market: its sixth-generation HBM4 is in final testing, and the company has announced readiness for mass production. Samsung, by contrast, faces challenges supplying HBM4 to Nvidia, with testing delayed by two months [3][5].
- A notable trend is the customization of HBM, driven by cloud giants developing their own AI chips. SK Hynix is shifting toward a fully customized HBM approach, collaborating closely with major clients [4].

Group 2: HBF (High Bandwidth Flash)
- HBF aims to overcome the limitations of traditional storage by combining the capacity of NAND flash with HBM-like bandwidth. Sandisk is leading HBF development, which is expected to meet the growing storage demands of AI applications [8][9].
- HBF is positioned as complementary to HBM, suited to applications that read data in large blocks. It is particularly advantageous in scenarios demanding high capacity with relatively relaxed bandwidth requirements [10][11].

Group 3: GDDR7
- Nvidia's Rubin CPX GPU, which uses GDDR7 instead of HBM4, reflects a new approach to AI inference architecture. The design splits inference into two stages and allocates resources accordingly, using GDDR7 for context building (a back-of-the-envelope sketch of this bandwidth trade-off follows this summary) [13].
- Demand for GDDR7 is rising, and Samsung has successfully filled Nvidia's orders. This flexibility positions Samsung favorably in the graphics DRAM market [14].
- GDDR7's cost-effectiveness may drive broad adoption of AI inference infrastructure; as applications proliferate, overall market demand for high-end HBM could rise as well [15].

Group 4: Industry Trends and Future Outlook
- The collaborative evolution of storage technologies is crucial to the AI industry's growth. HBM remains essential for high-end training and inference, while HBF and GDDR7 serve diverse market needs [23].
- Innovation in storage technology will accelerate as AI applications expand across sectors, providing tailored solutions for both performance-driven and cost-sensitive users [23].
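The bandwidth trade-off behind these positioning claims can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the model size and bandwidth figures are hypothetical round numbers, not vendor specifications. It estimates the decode-phase token-rate ceiling as memory bandwidth divided by bytes read per token, which is why generation favors HBM while the compute-bound context-building stage can tolerate GDDR7's lower bandwidth.

```python
# Back-of-the-envelope: why decode (generation) wants HBM bandwidth
# while prefill (context building) is compute-bound.
# All numbers below are illustrative, not vendor specs.

def decode_tokens_per_sec(model_params_b: float,
                          bytes_per_param: float,
                          mem_bw_tb_s: float) -> float:
    """Decode reads roughly all weights once per generated token,
    so the token-rate ceiling is ~ bandwidth / bytes-per-token."""
    bytes_per_token = model_params_b * 1e9 * bytes_per_param
    return mem_bw_tb_s * 1e12 / bytes_per_token

# Hypothetical 70B-parameter model at 1 byte/param (e.g. FP8 weights).
for name, bw in [("GDDR7-class (~1.7 TB/s)", 1.7),
                 ("HBM-class   (~8 TB/s)  ", 8.0)]:
    print(f"{name}: ~{decode_tokens_per_sec(70, 1.0, bw):.0f} tok/s ceiling")

# Prefill, by contrast, processes the whole prompt in large batched
# matmuls: arithmetic intensity is high, so FLOPs rather than memory
# bandwidth set the ceiling -- the niche the article assigns to GDDR7.
```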
HBM Hits a Wall
半导体行业观察· 2025-09-13 02:48
Core Viewpoint
- NVIDIA's Rubin CPX GPU, which opts for GDDR7 memory instead of traditional HBM, raises questions about HBM's future in AI applications and the threat posed by more cost-effective memory solutions [1][7].

Group 1: Rubin CPX GPU Overview
- The Rubin CPX GPU was launched on September 10, 2025, specifically designed for long-context AI workloads and built around a new inference-acceleration concept called "disaggregated inference" [2].
- It is not a cut-down version of the standard Rubin GPU but is deeply optimized for inference performance, signaling a shift in focus from training to inference in AI applications [2][4].
- Rubin CPX is expected to deliver up to 30 PFLOPs of raw compute with 128 GB of GDDR7 memory, versus the standard Rubin GPU's 50 PFLOPs and 288 GB of HBM4 [3].

Group 2: Architectural Differences
- The architectural differences between Rubin CPX and the standard Rubin GPU reflect task specialization: Rubin CPX handles context construction while the standard Rubin GPU handles generation (a minimal sketch of this pattern appears below) [5][9].
- With Rubin CPX included, overall system performance is projected to reach 8 ExaFLOPs of NVFP4 compute, far surpassing previous models [4].

Group 3: Memory Transition and Implications
- The shift from HBM4 to GDDR7 is driven by the need to cut costs while maintaining performance, since GDDR7 provides sufficient bandwidth for the context-building tasks Rubin CPX targets [9].
- The transition is expected to lower total system cost, making AI infrastructure accessible to a broader range of enterprises [9].
- Demand for GDDR7 is surging, with NVIDIA increasing orders from suppliers such as Samsung, which is expanding production capacity to meet it [10][12].

Group 4: Market Dynamics and Future Outlook
- GDDR7 is seen as a potential threat to HBM, but it also opens new opportunities for memory suppliers, particularly Samsung, which stands to benefit from increased orders [10][12].
- SK Hynix has announced the completion of HBM4 development, indicating that while GDDR7 gains traction, HBM technology continues to evolve and remains relevant in the market [13].
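The disaggregated-inference pattern the article describes can be sketched in a few lines: prefill (context building) runs on compute-dense, GDDR7-class workers, while decode (generation) runs on HBM-class workers that receive the KV cache. The sketch below is a minimal illustration under those assumptions; all class and method names are hypothetical and this is not NVIDIA's API.

```python
# Minimal sketch of disaggregated inference: route the two stages of
# LLM serving to differently-provisioned workers. Hypothetical names.
from dataclasses import dataclass

@dataclass
class KVCache:
    """Opaque handle to the attention key/value state built in prefill."""
    request_id: str
    num_tokens: int

class ContextWorker:  # e.g. a Rubin-CPX-style, GDDR7-backed GPU
    def prefill(self, request_id: str, prompt_tokens: list[int]) -> KVCache:
        # Compute-bound pass over the whole prompt; produces the KV cache.
        return KVCache(request_id, len(prompt_tokens))

class GenerationWorker:  # e.g. an HBM-backed standard GPU
    def decode(self, kv: KVCache, max_new_tokens: int) -> list[int]:
        # Bandwidth-bound loop: one sweep over the weights per new token.
        return [0] * max_new_tokens  # placeholder output tokens

def serve(prompt_tokens: list[int], request_id: str = "req-0") -> list[int]:
    kv = ContextWorker().prefill(request_id, prompt_tokens)
    # In a real system the KV cache is transferred over the interconnect
    # (e.g. NVLink) rather than passed in-process as done here.
    return GenerationWorker().decode(kv, max_new_tokens=32)

print(len(serve(list(range(1000)))))  # -> 32
```

The point of the split is that each stage hits a different hardware ceiling, so the system can pair cheaper, compute-dense memory (GDDR7) with the prefill stage and reserve expensive high-bandwidth memory (HBM) for decode, rather than provisioning HBM for both.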