Breaking: DeepSeek releases a new paper signed by Liang Wenfeng: an early look at the V4 architecture?
AI前线 · 2026-01-12 22:41
Core Insights
DeepSeek has open-sourced a new paper and an accompanying module called Engram, which introduces a "lookup-computation separation" mechanism to improve the performance of large language models across a range of tasks [2][5].

Summary by Sections

Introduction of Engram
Engram is a scalable, lookup-based memory module that improves language-model efficiency by separating memory retrieval from computation [10][18] (see the fusion sketch after this summary).

Need for Engram
Traditional large language models built on Transformer and Mixture-of-Experts (MoE) architectures entangle memory and computation, which can lead to inefficiency. Engram addresses this by letting a model handle factual memory and logical reasoning separately [8][9].

Core Technology of Engram
Engram uses modernized hashed N-gram embeddings, which make memory retrieval an O(1) operation, sharply reducing computational cost while keeping retrieval fast [11][13] (a toy lookup sketch appears after this summary).

Relationship with MoE
Engram supplies a new axis of sparsity that complements MoE: its memory retrieval is static rather than dynamically routed, which improves parameter efficiency. In a 27-billion-parameter model, Engram can devote a large share of parameters to memory while consuming minimal computational resources at inference [15][16] (a side-by-side comparison of the two sparsity patterns appears after this summary).

Performance Metrics
Engram improves results across benchmarks, reaching a loss of 1.950 on the Pile dataset and 60.4% accuracy on MMLU (5-shot), outperforming both Dense and MoE baselines [17].

Community Reception
The community response has been positive, with users highlighting Engram's potential to separate memory-pattern retrieval from neural computation as a new direction in model architecture design [18][19][21].

Future Implications
Observers speculate that Engram will be a core component of DeepSeek's upcoming V4 model, signaling a significant architectural advance in how memory and reasoning cooperate [22][23].
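The hashed N-gram lookup is the part of the summary most amenable to a concrete illustration. The sketch below is a minimal, hypothetical reconstruction in Python: the table size, the n-gram orders, the FNV-1a hash, and the mean-pooling of retrieved rows are all assumptions for illustration, not details from the paper. What it does show is why retrieval is O(1): the cost is a handful of hashes and row reads, independent of model depth or width.

```python
import numpy as np

# Hypothetical sizes; the real table dimensions are not given in the summary.
VOCAB_HASH_BUCKETS = 1 << 20   # number of hash buckets in the embedding table
EMBED_DIM = 256                # width of each stored memory vector
MAX_N = 3                      # use 2-gram and 3-gram suffixes of the context

rng = np.random.default_rng(0)
# One flat table; a real system would shard this and train it end to end.
memory_table = rng.standard_normal((VOCAB_HASH_BUCKETS, EMBED_DIM)).astype(np.float32)

def ngram_bucket(token_ids: tuple[int, ...]) -> int:
    """Deterministically hash an n-gram of token ids into a table bucket (O(1))."""
    h = 1469598103934665603          # FNV-1a offset basis (an assumed hash choice)
    for t in token_ids:
        h = ((h ^ t) * 1099511628211) % (1 << 64)
    return h % VOCAB_HASH_BUCKETS

def engram_lookup(context: list[int]) -> np.ndarray:
    """Retrieve memory vectors for the trailing n-grams of the context.

    No matrix multiplies are involved: the cost is a few hashes and row reads,
    regardless of model size -- the O(1) retrieval property the summary cites.
    """
    vectors = []
    for n in range(2, MAX_N + 1):
        if len(context) >= n:
            vectors.append(memory_table[ngram_bucket(tuple(context[-n:]))])
    return np.mean(vectors, axis=0) if vectors else np.zeros(EMBED_DIM, np.float32)

tokens = [101, 7592, 2088, 999]   # toy token ids
mem = engram_lookup(tokens)
print(mem.shape)                  # (256,)
```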
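To make "lookup-computation separation" concrete, here is one hypothetical way a retrieved memory vector could be merged back into a transformer's hidden state. The gated residual fusion below is an assumption for illustration; the summary only says that retrieval and computation live in separate pathways, not how they are combined.

```python
import numpy as np

HIDDEN = 256
rng = np.random.default_rng(1)

# Hypothetical fusion weights; in a real model these would be trained.
W_gate = (0.02 * rng.standard_normal((HIDDEN, HIDDEN))).astype(np.float32)

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def fuse(hidden: np.ndarray, memory: np.ndarray) -> np.ndarray:
    """Gated residual merge of a retrieved memory vector into the hidden state.

    The attention/FFN stack stays responsible for computation; the memory
    branch only supplies a looked-up vector, keeping the two concerns apart.
    """
    gate = sigmoid(hidden @ W_gate)   # per-dimension gate, derived from context
    return hidden + gate * memory     # residual add leaves the compute path intact

hidden = rng.standard_normal(HIDDEN).astype(np.float32)   # from the transformer
memory = rng.standard_normal(HIDDEN).astype(np.float32)   # from the Engram lookup
out = fuse(hidden, memory)
print(out.shape)   # (256,)
```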
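The claimed relationship with MoE, a complementary static axis of sparsity, can also be shown side by side. In the hypothetical comparison below, the MoE path picks experts with a learned router per token, while the Engram-style path picks table rows purely by hashing the context; both touch only a small slice of the total parameters, but only the router depends on trained gating.

```python
import numpy as np

HIDDEN, NUM_EXPERTS, BUCKETS = 256, 8, 1 << 20
rng = np.random.default_rng(2)
W_router = rng.standard_normal((HIDDEN, NUM_EXPERTS)).astype(np.float32)
memory_table = rng.standard_normal((BUCKETS, HIDDEN)).astype(np.float32)
hidden = rng.standard_normal(HIDDEN).astype(np.float32)
context = [101, 7592, 2088, 999]   # toy token ids

# Dynamic sparsity (MoE): which experts fire is decided by a learned router,
# so the choice changes as the hidden state changes.
scores = hidden @ W_router
active_experts = np.argsort(scores)[-2:]   # activate the top-2 of 8 experts

# Static sparsity (Engram-style): which rows fire is fixed by the context's
# n-gram hash alone -- no learned routing on the retrieval path.
bucket = hash(tuple(context[-2:])) % BUCKETS
memory_vector = memory_table[bucket]

print(active_experts, memory_vector.shape)
```

One consequence of the static pattern is that which memory parameters a given context touches is known before any computation runs, which is what lets a large parameter budget sit in memory while inference-time compute stays small, as the 27B example in the summary describes.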