Constraint-Induced Innovation
DeepSeek Through Morgan Stanley's Eyes: Replacing Compute with Memory, Doing More with Less
36Kr · 2026-01-22 09:09
Core Insights
- DeepSeek is revolutionizing AI scalability with a hybrid architecture that replaces scarce high-bandwidth memory (HBM) with more cost-effective DRAM through an innovative module called "Engram" [1][3][5]

Group 1: Engram Module and Conditional Memory
- The Engram module introduces "Conditional Memory," separating static knowledge storage from dynamic reasoning and significantly reducing reliance on expensive HBM [3][5]
- This architecture allows basic information to be retrieved efficiently without overloading HBM, freeing capacity for more complex reasoning tasks [3][5]

Group 2: Economic Impact on Infrastructure
- By minimizing HBM dependency, the Engram architecture reshapes hardware cost structures, potentially shifting infrastructure spending from GPUs to more affordable DRAM [5][6]
- A 100-billion-parameter Engram model requires approximately 200GB of system DRAM, implying roughly a 13% increase in commodity DRAM per system [5][6]

Group 3: Innovation Driven by Constraints
- Despite limited access to advanced computing hardware, Chinese AI models have rapidly closed the performance gap with global leaders, demonstrating "constraint-induced innovation" [6][7]
- DeepSeek's advances suggest that future AI capability gains may come more from algorithmic and system-level innovation than from simply adding hardware [6][7]

Group 4: Future Outlook
- The upcoming DeepSeek V4 model is expected to deliver significant advances in coding and reasoning, potentially running on consumer-grade hardware such as the RTX 5090 [7]
- This could lower the marginal cost of high-end AI inference, enabling broader deployment of AI applications without expensive data-center-grade GPU clusters [7]
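The "Conditional Memory" idea described above can be sketched as a two-tier lookup: a large, cheap static store (standing in for system DRAM) holds all knowledge, while a small, fast working set (standing in for HBM) caches only what active reasoning touches. This is a minimal toy illustration, not DeepSeek's actual design; the class and method names (`EngramStore`, `lookup`) are our own.

```python
# Toy two-tier "conditional memory" sketch: static knowledge lives in a
# large, cheap store (DRAM tier); a small LRU working set (HBM tier)
# caches only entries that active reasoning actually requests.
from collections import OrderedDict

class EngramStore:
    def __init__(self, hbm_capacity: int):
        self.dram = {}              # large, cheap static store
        self.hbm = OrderedDict()    # small, fast working set (LRU)
        self.hbm_capacity = hbm_capacity
        self.dram_fetches = 0       # traffic from the slow tier

    def write_static(self, key, embedding):
        """Static knowledge is written once to the DRAM tier only."""
        self.dram[key] = embedding

    def lookup(self, key):
        """Serve from the HBM tier if cached; fetch from DRAM on demand."""
        if key in self.hbm:
            self.hbm.move_to_end(key)       # refresh LRU position
            return self.hbm[key]
        self.dram_fetches += 1
        value = self.dram[key]
        if len(self.hbm) >= self.hbm_capacity:
            self.hbm.popitem(last=False)    # evict least-recently-used
        self.hbm[key] = value
        return value

store = EngramStore(hbm_capacity=2)
for i in range(5):
    store.write_static(f"fact{i}", [float(i)] * 4)
store.lookup("fact0")
store.lookup("fact1")
store.lookup("fact0")  # served from the fast tier, no DRAM traffic
print(len(store.dram), len(store.hbm), store.dram_fetches)  # → 5 2 2
```

The point of the sketch is the asymmetry the report describes: all five facts occupy only the cheap tier, while the expensive tier stays small and is reserved for the active working set.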
DeepSeek Through Morgan Stanley's Eyes: Replacing Compute with Memory, Doing More with Less!
硬AI · 2026-01-22 07:34
Core Viewpoint
- DeepSeek is redefining the AI scaling paradigm around a "doing more with less" philosophy: the next generation of AI success relies on efficient hybrid architectures rather than simply stacking more GPUs [2][3][4].

Group 1: Engram Module and Conditional Memory
- DeepSeek's Engram module separates storage from computation, significantly reducing the need for expensive high-bandwidth memory (HBM) by holding static knowledge in cost-effective DRAM [3][9].
- The resulting "Conditional Memory" allows static knowledge stored in DRAM to be retrieved efficiently, improving large language model (LLM) performance without overloading HBM [9][12].

Group 2: Economic Impact on Infrastructure
- The Engram architecture reshapes the hardware cost structure by minimizing reliance on HBM, suggesting a shift in infrastructure spending from GPUs to more affordable memory [12][13].
- The analysis indicates that a 100-billion-parameter Engram model would require approximately 200GB of system DRAM, roughly a 13% increase in commodity DRAM per system [12][13].

Group 3: Innovation Driven by Constraints
- Despite limited access to advanced computing hardware, Chinese AI models have rapidly closed the performance gap with global leaders, reflecting a shift toward algorithmic efficiency and pragmatic system design [17][18].
- The report calls this "constraint-induced innovation": future AI advances may stem from inventive thinking under resource constraints rather than from ever more hardware [17][18].

Group 4: Future Outlook
- DeepSeek's next-generation V4 model is expected to bring significant advances in coding and reasoning and may run on consumer-grade hardware, lowering the marginal cost of high-end AI inference [20][21].
- The report is optimistic about the localization of memory and semiconductor equipment in China, as decoupling memory from computation is expected to lead to smarter and more efficient LLMs [21].
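The 200GB figure quoted above is consistent with simple back-of-envelope arithmetic, assuming the Engram parameters are stored in 16-bit precision (the summaries do not state the data type, so 2 bytes per parameter is our assumption):

```python
# Back-of-envelope check of the report's DRAM sizing for a
# 100-billion-parameter Engram model, assuming 16-bit weights.
params = 100e9          # 100 billion parameters
bytes_per_param = 2     # fp16/bf16 assumption (not stated in the report)
dram_gb = params * bytes_per_param / 1e9
print(f"{dram_gb:.0f} GB")  # → 200 GB
```

Under that assumption the parameter storage alone accounts for the entire quoted 200GB, which matches the report's framing of Engram as shifting the model's knowledge weight from HBM into system DRAM.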
DeepSeek Through Morgan Stanley's Eyes: Replacing Compute with Memory, Doing More with Less!
Wallstreetcn (华尔街见闻) · 2026-01-22 02:48
Core Insights
- DeepSeek is revolutionizing AI scalability with a hybrid architecture that replaces scarce HBM with more cost-effective DRAM, relying on smarter design rather than ever-larger GPU clusters [1][5]

Group 1: Technological Innovation
- DeepSeek's innovative "Engram" module separates storage from computation, significantly reducing the need for expensive HBM through a "Conditional Memory" mechanism [1][3]
- The Engram architecture retrieves static knowledge stored in DRAM efficiently, freeing HBM for more complex reasoning tasks and improving overall efficiency [3][5]

Group 2: Cost Structure and Economic Impact
- The shift from HBM to DRAM is expected to reshape the hardware cost structure, making AI infrastructure more affordable [5][7]
- A 100-billion-parameter Engram model requires approximately 200GB of system DRAM, roughly a 13% increase in commodity DRAM per system over existing setups [5][7]

Group 3: Competitive Landscape
- Despite hardware limitations, Chinese AI models have rapidly closed the performance gap with leading global models, demonstrating strong competitiveness [6][8]
- DeepSeek V3.2 achieved an MMLU score of approximately 88.5% and a coding score of around 72%, showcasing efficient reasoning and performance [6][8]

Group 4: Future Outlook
- The upcoming DeepSeek V4 model is expected to leverage the Engram architecture for significant advances in coding and reasoning, potentially running on consumer-grade hardware [8]
- This could lower the marginal cost of high-end AI inference, enabling broader deployment of AI applications without reliance on expensive data-center GPUs [8]
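The consumer-hardware claim above can be made concrete with rough, hedged arithmetic: if a 100-billion-parameter model is held in 16-bit precision, how much of it must spill from a consumer GPU's VRAM into host DRAM? Both the 2-bytes-per-parameter figure and the 32GB VRAM value (the announced RTX 5090 spec) are our assumptions; the report gives no such breakdown.

```python
# Rough split of a 100B-parameter fp16 model between consumer-GPU VRAM
# and host DRAM, under our own assumptions (not from the report).
total_gb = 100e9 * 2 / 1e9   # 200 GB of weights at 2 bytes/parameter
vram_gb = 32                 # RTX 5090 VRAM (assumed spec)
offload_gb = total_gb - vram_gb
print(f"{offload_gb:.0f} GB held in host DRAM")  # → 168 GB held in host DRAM
```

The arithmetic illustrates why the report frames DRAM, rather than GPU memory, as the binding resource: even on high-end consumer hardware, the bulk of such a model's weights would live in ordinary system memory.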