NVIDIA Plans to Unveil a "Mystery Chip", Possibly a New Architecture Built for Inference
Core Insights
- NVIDIA is set to unveil a groundbreaking chip at the GTC conference in mid-March, which is expected to integrate Groq's LPU technology into a new inference product [1][4]
- Global computing demand is shifting from training to inference; by 2026, inference is predicted to account for two-thirds of all AI computing power [3]
- The new chip is anticipated to improve decoding efficiency, addressing the limitations of current GPU architectures in handling large model parameters [5][6]

Group 1: Chip Development and Technology
- The upcoming chip is likely a new inference chip system incorporating Groq's LPU technology, marking a significant integration of an external architecture into NVIDIA's core AI computing product line [4]
- The Groq LPU is designed specifically for inference acceleration, using SRAM for model parameter storage, which offers significantly higher memory bandwidth than traditional GPU architectures [6]
- NVIDIA may adopt a 3D stacking approach similar to AMD's V-Cache technology, integrating LPU units directly on top of GPU cores to boost performance [7][8]

Group 2: Market Trends and Predictions
- The market is expected to see specialized inference chips worth billions of dollars deployed in data centers and enterprise servers, with some chips potentially drawing power comparable to general-purpose AI chips [3]
- Advanced manufacturing processes are becoming increasingly critical, with chip designs focused on achieving high interconnect density and energy efficiency [10]
- Domestic packaging and testing companies risk being pushed out of the high-end market as the value of advanced chips concentrates in front-end manufacturing and advanced packaging [10]
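The claim that SRAM's higher memory bandwidth improves decoding efficiency follows from a simple observation: in autoregressive decoding, every generated token requires streaming the model's weights through the memory system, so batch-1 throughput is roughly bounded by bandwidth divided by weight size. The sketch below illustrates that arithmetic; the model size and bandwidth figures are illustrative assumptions for comparison, not specifications of any announced product.

```python
# Back-of-envelope model of why inference decoding is memory-bandwidth-bound:
# each generated token must read (roughly) all model weights once, so
# tokens/sec is capped at bandwidth / weight_bytes. All figures below are
# illustrative assumptions, not published specs of NVIDIA or Groq hardware.

def decode_tokens_per_sec(params_billion: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Upper-bound batch-1 decode throughput: bandwidth / total weight bytes."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes

# Hypothetical 70B-parameter model stored in fp16 (2 bytes per parameter):
hbm_bound = decode_tokens_per_sec(70, 2, 3350)    # HBM-class GPU, ~3.35 TB/s
sram_bound = decode_tokens_per_sec(70, 2, 80000)  # on-chip SRAM, ~80 TB/s

print(f"HBM-bound:  ~{hbm_bound:.0f} tokens/s")   # ~24 tokens/s
print(f"SRAM-bound: ~{sram_bound:.0f} tokens/s")  # ~571 tokens/s
```

Under these assumed numbers the SRAM-fed design is bandwidth-limited at roughly 20x the HBM figure, which is the efficiency argument the article attributes to the LPU approach; real systems fall below these ceilings due to compute limits, interconnect overhead, and the fact that large models must be sharded across many SRAM-limited chips.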