The Engram Model
AI Compute-Storage Separation Breakthrough Catalyzes DRAM Demand; Digital Economy ETF (560800) Up 1.27% Intraday
Xin Lang Cai Jing· 2026-01-28 02:44
Group 1
- The semiconductor and chip sectors are experiencing a strong rally, with the China Securities Digital Economy Theme Index rising 1.23% as of January 28, 2026, and key stocks such as Shengbang Co., Ltd. and Zhaoyi Innovation gaining 6.71% and 6.11% respectively [1]
- The Digital Economy ETF (560800) rose 1.27%, with trading volume of 539.51 million yuan and a turnover rate of 0.95% [1]
- The DeepSeek and Peking University collaboration introduced the Engram model, which separates static knowledge retrieval from complex computation, enhancing AI infrastructure and providing a scalable technical path for large models in China [1]

Group 2
- The AI wave is reshaping the semiconductor supply chain, with significant supply constraints in upstream core components, as noted by Datong Securities [2]
- AI chips are taking priority for advanced-process capacity at companies like TSMC and Samsung, squeezing traditional server CPU production [2]
- The Digital Economy ETF closely tracks the China Securities Digital Economy Theme Index, which selects companies with high levels of digital infrastructure and digital application [2]

Group 3
- As of December 31, 2025, the top ten weighted stocks in the China Securities Digital Economy Theme Index accounted for 52.63% of the index, including companies such as Dongfang Wealth and Cambricon [3]
- The Digital Economy ETF offers several feeder-fund options, including the Pengyang China Securities Digital Economy Theme ETF [3]
DeepSeek Releases DeepSeek-OCR 2 Model; AI Artificial Intelligence ETF (512930) Opens Higher
Xin Lang Cai Jing· 2026-01-28 02:02
Group 1
- The core of the news is the performance of the AI sector, with the CSI Artificial Intelligence Theme Index (930713) rising 0.64% and notable gains in constituent stocks such as Tongfang Co., Ltd. (up 9.56%) and Ruixin Microelectronics (up 3.69%) [1]
- The AI Artificial Intelligence ETF (512930) rose 0.54%, with the latest price reported at 2.4 yuan [1]
- The DeepSeek team released a paper titled "DeepSeek-OCR 2: Visual Causal Flow" and open-sourced the DeepSeek-OCR 2 model, whose DeepEncoder V2 method dynamically rearranges image components based on their meanings, aligning more closely with human visual encoding logic [1]
- The new-generation open-source model Kimi K2.5 was officially released, achieving the best results globally in several agent evaluations, including HLE, BrowseComp, and DeepSearchQA [1]

Group 2
- Haitong International noted that the Engram model, developed in collaboration with Peking University, uses a hash-table mechanism to decouple static knowledge retrieval from the backbone network, allowing memory calls in O(1) time and freeing the backbone to focus on deep reasoning [2]
- Empirical evidence shows that Engram-27B systematically outperforms MoE baselines on benchmarks such as MMLU and BBH, particularly excelling in long-context tasks [2]
- The CSI Artificial Intelligence Theme Index comprises 50 listed companies that provide foundational resources, technology, and application support for artificial intelligence, reflecting the overall performance of AI-themed listed securities [2]
- As of December 31, 2025, the top ten weighted stocks in the CSI Artificial Intelligence Theme Index include companies such as Zhongji Xuchuang and New Yisheng, collectively accounting for 58.08% of the index [2]
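The hash-table mechanism described above can be sketched in a few lines. This is an illustrative toy, not DeepSeek's implementation: the table size, embedding width, hash function, and all names here (`ngram_slot`, `recall`, `memory_table`) are assumptions.

```python
import hashlib
import random

# Hypothetical sizes; real Engram tables are vastly larger.
TABLE_SIZE = 4096
EMBED_DIM = 8

random.seed(0)
# Static "memory" table: one embedding row per slot, frozen after training.
memory_table = [[random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
                for _ in range(TABLE_SIZE)]

def ngram_slot(tokens: tuple) -> int:
    """Deterministically hash an n-gram of token ids to a table slot."""
    digest = hashlib.blake2b(repr(tokens).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little") % TABLE_SIZE

def recall(tokens: tuple) -> list:
    """O(1) retrieval of a static embedding: no matmul, no attention."""
    return memory_table[ngram_slot(tokens)]

vec = recall((17, 42, 7))   # constant-time lookup, independent of model depth
assert len(vec) == EMBED_DIM
```

The point the summary makes is that this lookup costs the same regardless of model size, so the backbone's FLOPs are reserved for reasoning rather than rote recall.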
DeepSeek Engram: Hand "Recall" to Table Lookup, Keep Compute for Reasoning
Haitong Securities International· 2026-01-27 08:50
Investment Rating
- The report does not explicitly state an investment rating for the industry or the specific companies involved in the research

Core Insights
- The Engram model proposed by DeepSeek and Peking University introduces a "conditional memory" mechanism that separates static knowledge recall from complex computation, significantly improving computational efficiency and task performance [1][2]
- Engram-27B demonstrates systematic improvements over MoE-27B across multiple benchmarks, particularly excelling in long-context tasks [1][3]
- The architecture allows large parameter tables to be offloaded to host memory while keeping the impact on inference throughput controllable, validating the feasibility of "separation of storage and computation" [1][6]

Summary by Sections

Event
- In January 2026, DeepSeek and Peking University released a paper on the Engram model, achieving significant performance improvements on various benchmarks while maintaining computational efficiency [1][17]

Commentary
- Engram innovatively decouples the recall of fixed knowledge from complex model computation, allowing models to focus on deeper reasoning tasks and enhancing overall efficiency [2][18]

Performance Optimization
- The study reveals an optimization path for resource allocation: transferring some model capacity to a conditional memory module produces a "U-shaped" performance trend with a clear optimal range [3][19]
- Replacing approximately 20% of traditional parameter capacity with conditional memory yields significant improvements in knowledge-intensive tasks [3][19]

Long Context Processing
- Engram offloads local repetitive details to memory lookup, allowing the backbone network to focus on global information integration, which is crucial for long-text processing [4][20]
- In experiments, Engram-27B consumed only about 82% of the baseline pre-training computation while achieving higher accuracy on long-text retrieval tasks [4][20]

System-Level Design
- Engram's deterministic addressing mechanism allows data to be pre-fetched from host memory, alleviating pressure on high-bandwidth memory (HBM) and keeping inference overhead within 3% even with large memory tables [6][22]
- The innovation shifts the binding constraint from GPU memory to CPU memory capacity and interconnect technology, potentially redefining the critical constraints of AI systems [6][23]

Impact on Chinese Large Models
- Engram's ability to move memory-type parameters to scalable system memory enhances model capability while reducing reliance on high-end HBM, providing a clearer path for efficiency-driven technological advancement in China's large model industry [7][24]
- Open-sourcing the paper and code lowers barriers to industry validation and development, facilitating faster deployment and commercialization of large models in cost-sensitive environments [7][26]
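The system-level point, that deterministic addressing enables prefetch, can be illustrated with a toy pipeline. Everything here (`slot`, `host_table`, the 2-gram window) is a hypothetical sketch, not the report's or DeepSeek's code.

```python
import hashlib

TABLE_SIZE = 4096
# Stand-in for a large table held in host memory rather than HBM.
host_table = {i: ("row", i) for i in range(TABLE_SIZE)}

def slot(ngram: tuple) -> int:
    """Address depends only on token ids, so it is known before compute runs."""
    d = hashlib.blake2b(repr(ngram).encode(), digest_size=8).digest()
    return int.from_bytes(d, "little") % TABLE_SIZE

def forward(tokens: list, n: int = 2) -> list:
    ngrams = [tuple(tokens[i - n + 1:i + 1]) for i in range(n - 1, len(tokens))]
    # Step 1: all addresses are computable up front, before any layer executes...
    addresses = [slot(g) for g in ngrams]
    # Step 2: ...so host-memory reads can overlap with compute (simulated here
    # as one batch fetch) instead of stalling each layer on HBM capacity.
    prefetched = {a: host_table[a] for a in addresses}
    # Step 3: the compute path later consumes the rows without waiting.
    return [prefetched[a] for a in addresses]

rows = forward([5, 9, 9, 3])   # three 2-grams -> three prefetched rows
assert len(rows) == 3
```

Contrast this with attention-style access, where which memory is needed depends on intermediate activations and so cannot be fetched ahead of time.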
Behind SanDisk's Surge: Three Catalysts Resonate, NAND Becomes a "Necessity," and AI Revalues Storage
Hua Er Jie Jian Wen· 2026-01-23 03:41
Core Insights
- The storage sector is experiencing a "perfect storm," with SanDisk's stock price up over 100%, driven by a value reassessment triggered by advances in AI architecture [1][11]
- Storage is transitioning from a cost item to a core production element for AI, as evidenced by developments from NVIDIA and DeepSeek [1][10]

Group 1: Technological Developments
- NVIDIA CEO Jensen Huang introduced the concept of Inference Context Memory Storage (ICMS) at CES 2026, arguing that context, rather than computing power, is becoming the new bottleneck for AI [2][3]
- The new DGX Vera Rubin NVL72 SuperPOD architecture includes dedicated storage racks for inference context, significantly increasing NAND requirements [2][3]
- DeepSeek's Engram model allows NAND to be used as slow memory, enabling deterministic memory access and reducing latency issues compared to HBM [4][5][8]

Group 2: Market Implications
- The global NAND market, with annual demand of approximately 1.1–1.2 ZB, is expected to see nearly 10% structural growth driven by AI infrastructure rather than traditional consumer electronics [3][11]
- NAND's role is evolving from cold-data storage to a layer in a tiered memory system, acting as "slow RAM" for AI applications [8][9]
- The combination of BlueField DPU and NAND offers a cost-effective solution for the long-term memory needs of AI agents, decoupling storage demand from traditional computing resources [9][10]

Group 3: Strategic Value of NAND
- The strategic value of NAND is being re-evaluated as it becomes indispensable in AI architectures, pointing to a potential shift in pricing logic [11]
- Analysts suggest these developments represent a path to more efficient storage-compute collaboration that may be more cost-effective than simply expanding computing power [8][9][11]
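A back-of-envelope model shows why deterministic access makes NAND viable as "slow RAM": if the address is known ahead of time, the read can be issued early enough that even NAND-class latency hides behind compute. The latency and step-time figures below are rough, illustrative orders of magnitude, not vendor specifications.

```python
# Approximate per-read latencies in microseconds (illustrative only).
LATENCY_US = {"HBM": 0.001, "DRAM": 0.1, "NAND": 100.0}
COMPUTE_PER_STEP_US = 500.0   # hypothetical compute time per model step

def stall_us(tier: str, steps_ahead: int) -> float:
    """Pipeline stall if a fetch from `tier` is issued `steps_ahead` steps early."""
    hidden = steps_ahead * COMPUTE_PER_STEP_US
    return max(0.0, LATENCY_US[tier] - hidden)

# Fetched on demand, NAND stalls the pipeline by its full latency; issued one
# step ahead (possible only because addresses are deterministic), the stall
# disappears, so the slow tier behaves like memory rather than storage.
assert stall_us("NAND", 0) == 100.0
assert stall_us("NAND", 1) == 0.0
```

This is the sense in which "context, not compute" becomes the constraint: capacity of the slow tier matters, while its latency can be scheduled away.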
DeepSeek: Conditional Memory Based on Scalable Lookup, a New Dimension of Sparsity for Large Language Models (2026 Report)
Omega Future Research Institute 2025· 2026-01-15 00:29
Core Insights
- The article discusses a new architecture called "Engram," proposed by a research team from Peking University and DeepSeek-AI, which aims to enhance large language models (LLMs) by introducing a complementary dimension of "conditional memory" alongside existing mixture-of-experts (MoE) models [2][3]

Group 1: Model Architecture and Performance
- The report's core argument is that language modeling involves two distinct sub-tasks, combinatorial reasoning and knowledge retrieval, the latter often being static and local [3]
- The Engram architecture modernizes the N-gram concept into a "conditional memory" mechanism, allowing direct retrieval of static embeddings with O(1) time complexity and freeing computational resources for higher-order reasoning [3][4]
- A significant finding is the "sparsity distribution law": allocating approximately 20% to 25% of the sparse parameter budget to the Engram module significantly reduces validation loss while holding computational cost constant [4]

Group 2: Efficiency and Scalability
- The Engram model (Engram-27B) outperformed a baseline MoE model (MoE-27B) on various knowledge-intensive and logic-intensive tasks, demonstrating its effectiveness in enhancing model intelligence [4][5]
- Engram's deterministic retrieval mechanism allows large memory tables to be offloaded to host memory, significantly reducing dependence on GPU memory and enabling deployment of ultra-large models on limited hardware [6][7]
- A multi-level cache structure exploiting the Zipfian distribution of natural-language knowledge can greatly benefit cloud service providers and enterprises aiming to reduce deployment costs [7]

Group 3: Long Context Processing
- Engram shows structural advantages in handling long contexts by directly addressing many local dependencies, allowing the Transformer to focus on capturing global long-range dependencies [8]
- In long-text benchmarks, Engram-27B improved accuracy from 84.2% to 97.0% on multi-query retrieval tasks, indicating enhanced efficiency and optimized attention allocation [8]

Group 4: Future Implications
- The research signals a shift in large-model design philosophy from merely increasing computational depth to a dual-sparsity approach spanning both computation and memory [9]
- Conditional memory is expected to become a standard configuration for the next generation of sparse models, providing high-performance, low-cost solutions for trillion-parameter models [9]