Core Viewpoint
- NVIDIA aims to dominate the inference stack with its next-generation Feynman chip by integrating LPU units into its architecture, leveraging a licensing agreement with Groq for LPU technology [1][18].

Group 1: NVIDIA's Strategy and Technology Integration
- NVIDIA plans to integrate Groq's LPU units into its Feynman GPU architecture, potentially using TSMC's hybrid bonding technology for stacking [1][3].
- The LPU modules are expected to significantly enhance inference performance, with Groq's LPU set to debut in 2028 [5].
- The Feynman core will utilize a combination of logic and compute chips, achieving high density and bandwidth while maintaining cost efficiency [6].

Group 2: Inference Market Dynamics
- The AI industry's computational demand has shifted toward inference, with major companies such as OpenAI and Google focusing on building robust inference stacks [9].
- Google's Ironwood TPU is positioned as a competitor to NVIDIA, underscoring the need for low-latency execution engines in large-scale data centers [9][10].
- Groq's LPU architecture is designed specifically for inference workloads, offering deterministic execution and on-chip SRAM for reduced latency [10][14].

Group 3: Licensing Agreement and Market Position
- NVIDIA's agreement with Groq is framed as a non-exclusive licensing deal, allowing NVIDIA to integrate Groq's low-latency processors into its AI Factory architecture [18][21].
- This strategy is seen as a way to sidestep antitrust scrutiny while acquiring valuable talent and intellectual property from Groq [19][21].
- The transaction is viewed as a significant win for NVIDIA, positioning the LPU as a core component of its AI workload strategy [16][21].
Has NVIDIA sealed off the path for ASICs?
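The "deterministic execution" claim in Group 2 can be made concrete with a toy latency model: when all weights and activations live in on-chip SRAM, per-operation latency is fixed and the total runtime is knowable at compile time, whereas a cache-plus-DRAM design only admits best/worst-case bounds that depend on runtime hit behavior. The sketch below is purely illustrative; the cycle counts and function names are assumptions, not Groq hardware specifications.

```python
# Toy model contrasting a deterministic, SRAM-resident inference pipeline
# (the design style attributed to Groq's LPU) with a cache/DRAM-based one.
# All cycle counts below are illustrative assumptions, not real specs.

SRAM_ACCESS_CYCLES = 1    # assumed on-chip SRAM latency per operation
DRAM_ACCESS_CYCLES = 100  # assumed off-chip DRAM latency per operation


def deterministic_latency(num_ops: int) -> int:
    """Every operand sits in on-chip SRAM, so total latency is a fixed,
    compile-time-knowable function of the operation count."""
    return num_ops * SRAM_ACCESS_CYCLES


def cache_based_latency_bounds(num_ops: int) -> tuple[int, int]:
    """With a cache hierarchy, latency depends on runtime hit/miss
    behavior; only best-case and worst-case bounds are known ahead
    of time, so tail latency is harder to guarantee."""
    best = num_ops * SRAM_ACCESS_CYCLES   # every access hits on-chip
    worst = num_ops * DRAM_ACCESS_CYCLES  # every access misses to DRAM
    return best, worst


ops = 1_000
print(deterministic_latency(ops))       # fixed: 1000 cycles
print(cache_based_latency_bounds(ops))  # bounds: (1000, 100000)
```

The point of the model is the shape of the answer, not the numbers: the deterministic design returns a single value, while the cache-based design can only bound a range, which is why low-latency serving stacks value compile-time-scheduled execution.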