Engram Model
AI Compute-Storage Decoupling Breakthrough Catalyzes DRAM Demand; Digital Economy ETF (560800) Up 1.27% Intraday
Xin Lang Cai Jing· 2026-01-28 02:44
In early trading on January 28, 2026, the semiconductor, chip, and wafer sectors rallied. As of 09:56, the CSI Digital Economy Theme Index was up a strong 1.23%; among constituents, SG Micro rose 6.71%, GigaDevice 6.11%, and Silan Microelectronics 5.81%, with OmniVision Group, ACM Research (Shanghai), and other names following. The Digital Economy ETF (560800) was up 1.27%. On liquidity, the ETF's intraday turnover rate was 0.95% on RMB 5.3951 million of trading value. Over a longer window, as of January 27 the ETF's average daily trading value over the past month was RMB 22.5568 million. (The stocks listed are index constituents, shown for illustration only and not as individual recommendations. Past holdings do not indicate the fund's future investment direction or constitute specific investment advice; the investment direction and the fund's holdings may change. Invest with caution.)

Risk notice: "The CSI Digital Economy Theme Index (931582) is compiled and calculated by China Securities Index Co., Ltd. ('CSI'), and its ownership belongs to CSI and/or its designated third parties. CSI makes no express or implied warranty as to the timeliness, accuracy, completeness, or fitness for a particular purpose of the underlying index, and accepts no liability to any person for any delay, omission, or error in the index (whether or not due to negligence). CSI does not guarantee, endorse, sell, or promote any product tracking the index, and assumes no liability in connection therewith." This fund is a passively managed exchange-traded open-end index fund that primarily uses full replication ...
DeepSeek Releases DeepSeek-OCR 2 Model; AI Artificial Intelligence ETF (512930) Opens Higher
Xin Lang Cai Jing· 2026-01-28 02:02
Data show that as of December 31, 2025, the top ten weighted constituents of the CSI Artificial Intelligence Theme Index (930713) were Zhongji Innolight, Eoptolink, Cambricon, Montage Technology, Sugon, iFlytek, Hikvision, OmniVision Group, Kingsoft Office, and Inspur Information, together accounting for 58.08% of the index. AI Artificial Intelligence ETF (512930); off-exchange feeder funds (Ping An CSI Artificial Intelligence Theme ETF Initiated Feeder A: 023384; Feeder C: 023385; Feeder E: 024610).

Risk notice: Funds carry risk; invest with caution. The fund manager undertakes to manage and apply fund assets under the principles of good faith, diligence, and due care, but does not guarantee that the fund will be profitable or achieve any minimum return. The fund manager reminds investors of the "caveat emptor" principle of fund investing: once an investment decision is made, the investment risk arising from changes in the fund's operations and net asset value is borne by the investor. A fund's past performance and the level of its net asset value do not predict its future performance, and the performance of other funds managed by the same manager does not guarantee this fund's performance. Investors who purchase a fund may share in the gains generated by its investments in proportion to their holdings, but may also bear the losses those investments incur. Investors should carefully read the Fund Contract, the Prospectus, and other legal fund documents, fully understand the fund's risk-return profile and product characteristics, and based on ...
DeepSeek Engram: Delegate "Recall" to Lookups, Reserve Compute for Reasoning
Haitong Securities International· 2026-01-27 08:50
Research Report, 27 Jan 2026 | China (Overseas) Technology | Barney Yao (barney.sq.yao@htisec.com), Xiaotong Lyu (xt.lyu@htisec.com)

AI infrastructure's "bottleneck position" may spill over further from HBM to DRAM, interconnect, and storage. Engram's system-level design uses its deterministic addressing mechanism to prefetch data from host memory while the GPU is computing, stripping massive static parameters out of expensive high-bandwidth memory (HBM/VRAM) and significantly easing VRAM capacity pressure. The paper's qualitative conclusion is that even with a memory parameter table as large as 100B offloaded to host memory, the added inference overhead can be kept within 3%. Viewed through the lens of infrastructure cost structure, we believe the impact of this technical path will likely show up mainly in the following three ...
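The report describes this mechanism only qualitatively. As a rough illustration under our own assumptions (all class and variable names are hypothetical, this is not DeepSeek's code, and a CUDA device is required), the PyTorch sketch below shows the core trick: because the table addresses are a deterministic function of the input tokens, the needed rows of a host-resident table can be copied to the GPU on a side stream while the default stream keeps computing.

```python
import torch

class HostOffloadedTable:
    """Large static parameter table kept in pinned host RAM instead of HBM."""

    def __init__(self, num_entries: int, dim: int):
        # Page-locked (pinned) memory is what makes H2D copies truly asynchronous.
        self.table = torch.empty(num_entries, dim).normal_(std=0.02).pin_memory()
        self.copy_stream = torch.cuda.Stream()  # side stream, overlaps with compute
        self._staging = None

    def prefetch(self, indices: torch.Tensor) -> torch.Tensor:
        # Deterministic addressing: the rows needed are a pure function of the
        # input tokens, so the gather + copy can start well before the layer
        # that consumes them runs.
        self._staging = self.table[indices.cpu()].pin_memory()  # keep alive during copy
        with torch.cuda.stream(self.copy_stream):
            return self._staging.to("cuda", non_blocking=True)

    def wait(self) -> None:
        # The default stream waits for the prefetch only at the point of use.
        torch.cuda.current_stream().wait_stream(self.copy_stream)

table = HostOffloadedTable(num_entries=100_000, dim=1024)  # ~0.4 GB host RAM, 0 bytes HBM
ids = torch.randint(0, 100_000, (4096,))

rows_gpu = table.prefetch(ids)   # copy starts here ...
# ... earlier transformer layers would run on the default stream meanwhile ...
table.wait()                     # ... and is awaited only when the rows are consumed
memory_out = rows_gpu.mean()     # stand-in for using the retrieved memory entries
```

If the lookup indices really are known ahead of the layer that uses them, the copy cost hides behind compute, which is consistent with the report's claim that host-memory offload adds only a small single-digit overhead.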
Behind SanDisk's Surge: Three Catalysts in Resonance, NAND Becomes a "Necessity" as AI Revalues Storage
Hua Er Jie Jian Wen· 2026-01-23 03:41
Core Insights
- The storage sector is experiencing a "perfect storm": SanDisk's stock price has risen over 100%, driven by a value reassessment triggered by advances in AI architecture [1][11]
- Storage is transitioning from a cost item to a core production element for AI, as evidenced by developments from NVIDIA and DeepSeek [1][10]

Group 1: Technological Developments
- NVIDIA CEO Jensen Huang introduced the concept of Inference Context Memory Storage (ICMS) at CES 2026, arguing that context, rather than compute, is becoming the new bottleneck for AI [2][3]
- The new DGX Vera Rubin NVL72 SuperPOD architecture includes dedicated storage racks for inference context, significantly increasing NAND requirements [2][3]
- DeepSeek's Engram model allows NAND to be used as slow memory, with deterministic memory access mitigating its latency gap relative to HBM [4][5][8]

Group 2: Market Implications
- The global NAND market, with annual demand of roughly 1.1–1.2 ZB, is expected to see nearly 10% structural growth driven by AI infrastructure rather than traditional consumer electronics [3][11]
- NAND's role is evolving from cold-data storage to a layer in a tiered memory system, acting as "slow RAM" for AI applications (a toy sketch of such a tiered store follows this digest) [8][9]
- The combination of BlueField DPU and NAND offers a cost-effective solution for the long-term memory needs of AI agents, decoupling storage demand from traditional computing resources [9][10]

Group 3: Strategic Value of NAND
- The strategic value of NAND is being re-evaluated as it becomes indispensable in AI architectures, pointing to a potential shift in pricing logic [11]
- Analysts suggest these developments represent a path to more efficient storage-compute collaboration that may be more cost-effective than simply expanding compute [8][9][11]
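The "slow RAM" idea above is only described qualitatively in the article. As a hedged illustration under our own assumptions (names and tier sizes are invented; real systems would key on context blocks and pay millisecond-class NAND latencies), a minimal two-tier store might look like this, with a small fast tier standing in for HBM/DRAM and a large slow tier standing in for NAND:

```python
from collections import OrderedDict

class TieredStore:
    """Toy two-tier store: a small fast tier over a large slow backing tier."""

    def __init__(self, fast_capacity: int, slow_tier: dict):
        self.fast = OrderedDict()   # stands in for HBM/DRAM
        self.capacity = fast_capacity
        self.slow = slow_tier       # stands in for NAND-backed "slow RAM"

    def get(self, key):
        if key in self.fast:                 # fast-tier hit
            self.fast.move_to_end(key)       # mark as most recently used
            return self.fast[key]
        value = self.slow[key]               # slow-tier read (orders of magnitude
                                             # higher latency on real hardware)
        self.fast[key] = value               # promote the hot entry
        if len(self.fast) > self.capacity:
            self.fast.popitem(last=False)    # evict the least recently used entry
        return value

# Usage: inference-context blocks live on the slow tier; hot ones migrate up.
store = TieredStore(fast_capacity=2,
                    slow_tier={f"ctx{i}": f"block-{i}" for i in range(10)})
store.get("ctx3")   # first access: slow tier, then promoted
store.get("ctx3")   # second access: served from the fast tier
```

The economics follow from the hit rate: if most accesses land in the fast tier, the bulk of capacity can sit on cheap NAND without the average latency blowing up.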
DeepSeek: Conditional Memory via Scalable Lookup, a New Dimension of Sparsity for Large Language Models (2026 Report)
Omega Future Research Institute 2025· 2026-01-15 00:29
Core Insights
- The article discusses a new architecture called "Engram," proposed by a research team from Peking University and DeepSeek-AI, which aims to enhance large language models (LLMs) by introducing a complementary dimension of "conditional memory" alongside existing mixture-of-experts (MoE) models [2][3]

Group 1: Model Architecture and Performance
- The report's core argument is that language modeling comprises two distinct sub-tasks, combinatorial reasoning and knowledge retrieval, with the latter often static and local [3]
- The Engram architecture modernizes the N-gram concept into a "conditional memory" mechanism that retrieves static embeddings directly with O(1) time complexity, freeing computational resources for higher-order reasoning (a minimal sketch of such a lookup follows this digest) [3][4]
- A significant finding is a "sparsity allocation law": assigning roughly 20% to 25% of the sparse parameter budget to the Engram module significantly reduces validation loss while holding computational cost constant [4]

Group 2: Efficiency and Scalability
- The Engram model (Engram-27B) outperformed a baseline MoE model (MoE-27B) on a range of knowledge-intensive and logic-intensive tasks, demonstrating its effectiveness in enhancing model intelligence [4][5]
- Engram's deterministic retrieval mechanism allows large memory tables to be offloaded to host memory, sharply reducing dependence on GPU memory and enabling deployment of ultra-large models on limited hardware [6][7]
- A multi-level cache structure exploiting the Zipfian distribution of natural-language knowledge can substantially cut deployment costs for cloud service providers and enterprises [7]

Group 3: Long Context Processing
- Engram shows structural advantages on long contexts: many local dependencies are resolved by direct addressing, letting the Transformer focus its attention on global long-range dependencies [8]
- In long-text benchmarks, Engram-27B improved accuracy on multi-query retrieval tasks from 84.2% to 97.0%, indicating greater efficiency and better-allocated attention [8]

Group 4: Future Implications
- The research marks a shift in large-model design philosophy from merely increasing computational depth to a dual-sparsity approach spanning both computation and memory [9]
- Conditional memory is expected to become a standard component of next-generation sparse models, offering high-performance, low-cost paths to trillion-parameter models [9]
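The O(1) lookup in Group 1 is easy to make concrete. The PyTorch sketch below is our own minimal construction, not the paper's code: the table size, the polynomial hash constant, and the module name are all illustrative assumptions. Each token's trailing n-gram is hashed deterministically into a static embedding table, and the retrieved entry is added to the residual stream.

```python
import torch
import torch.nn as nn

class NGramMemory(nn.Module):
    """Hashed n-gram -> static embedding: an O(1) lookup per token."""

    def __init__(self, table_size: int, dim: int, n: int = 2):
        super().__init__()
        self.n = n
        self.table_size = table_size
        self.table = nn.Embedding(table_size, dim)  # the "conditional memory" entries

    def ngram_keys(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Deterministic polynomial hash of each trailing n-gram; there is no
        # learned routing, so the address is a pure function of the tokens.
        key = token_ids.clone()
        for k in range(1, self.n):
            prev = torch.roll(token_ids, shifts=k, dims=-1)
            prev[..., :k] = 0                        # pad positions before sequence start
            key = key * 1_000_003 + prev             # illustrative hash constant
        return key % self.table_size

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # The retrieved static memory is simply added to the residual stream,
        # leaving attention/FFN compute free for higher-order reasoning.
        return hidden + self.table(self.ngram_keys(token_ids))

mem = NGramMemory(table_size=1_000_000, dim=64, n=2)
ids = torch.randint(0, 50_000, (1, 16))             # (batch, seq) token ids
out = mem(ids, torch.zeros(1, 16, 64))              # output shape: (1, 16, 64)
```

Because the index is a pure function of the token ids, the lookup needs no learned router, which is also what makes the host-memory offloading described in Group 2 feasible: the addresses can be computed, and the rows prefetched, before the layer that consumes them runs.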