DeepSeek
New DeepSeek Paper Co-Authored by Liang Wenfeng Released, Targeting Large Models' "Memory" Weakness
Bei Ke Cai Jing· 2026-01-13 04:41
Core Insights
- The paper published by DeepSeek addresses the memory limitations of current large language models and introduces the concept of "conditional memory" [2]
- DeepSeek proposes a module named Engram, which breaks language modeling down into two branches: "static pattern retrieval" for quick access to deterministic knowledge and "dynamic combinatorial reasoning" for complex logical operations (see the illustrative sketch after this summary) [2]
- The paper suggests that conditional memory is an essential modeling primitive for the next generation of sparse models, and there is speculation that DeepSeek's next model may be released before the Spring Festival [3]

Group 1
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", was co-authored by Peking University and DeepSeek [1]
- The introduction of "conditional memory" aims to enhance the memory capabilities of large language models [2]
- The Engram module is designed to improve efficiency in language modeling by separating tasks into static and dynamic components [2]

Group 2
- The paper emphasizes the importance of conditional memory for future sparse model development [3]
- There is speculation that DeepSeek's next-generation model will be released around the Spring Festival, potentially replicating the success of previous launches [3]
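The two-branch split described above (cheap static lookup versus expensive dynamic computation) can be made concrete with a minimal sketch. Everything below is hypothetical: the class name, the hashed-bigram addressing, the gating, and the shapes are assumptions chosen for illustration, not the actual Engram design, which these summaries do not describe in detail.

```python
# Illustrative sketch only; module names, the hashed-bigram key, and the gating
# are assumptions, not the Engram design from the paper.
import torch
import torch.nn as nn


class ConditionalMemorySketch(nn.Module):
    """Toy two-branch block: static lookup memory next to a dense compute path."""

    def __init__(self, d_model: int, num_slots: int = 1 << 16):
        super().__init__()
        self.num_slots = num_slots
        # "Static pattern retrieval": a large table addressed by a cheap hash
        # of the local context (here, the current and previous token ids).
        self.memory = nn.Embedding(num_slots, d_model)
        # "Dynamic combinatorial reasoning": stand-in dense branch; a real
        # system would place an MoE feed-forward block here.
        self.compute = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Per-token gate deciding how much retrieved memory to mix in.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, hidden: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model); token_ids: (batch, seq) integer ids
        prev_ids = torch.roll(token_ids, shifts=1, dims=1)
        slot_ids = (token_ids * 1_000_003 + prev_ids) % self.num_slots  # hashed bigram
        retrieved = self.memory(slot_ids)      # one table read per token, no big matmul
        computed = self.compute(hidden)        # the expensive computation path
        g = torch.sigmoid(self.gate(hidden))   # (batch, seq, 1) mixing weight
        return hidden + g * retrieved + (1.0 - g) * computed
```

The point of the sketch is the asymmetry between the branches: the memory path is a single table read per token, while the compute path (an MoE block in a real model) performs the heavy matrix multiplications.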
Is the DeepSeek V4 Roadmap Emerging? Major Paper Co-Authored by Liang Wenfeng Released, Focusing on a Conditional Memory Module for Large Models
Jin Rong Jie· 2026-01-13 04:38
Core Insights
- DeepSeek has released a significant research paper focusing on the conditional memory module for large models, indicating it will be a core modeling primitive in the next generation of sparse large models [1][4]
- The upcoming flagship model V4 is expected to be unveiled around the Spring Festival, with the recent research results potentially outlining its core research roadmap [1][4]

Summary by Sections

Research Findings
- The paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" was co-authored by DeepSeek and Peking University, with DeepSeek's founder Liang Wenfeng among the authors [4]
- The core insight of the paper is that large models handle two distinct types of tasks: deep dynamic computation for combinatorial reasoning and static knowledge retrieval [4]
- Existing Transformer architectures lack a native knowledge retrieval mechanism, leading to inefficient computation when simulating retrieval processes [4]

Proposed Solutions
- To address these inefficiencies, DeepSeek proposes the use of conditional memory as a supplementary dimension of sparsity, implemented through a module called Engram [5]
- The team discovered a "U-shaped scaling law" (restated in notation after this summary), indicating that a mixed sparse capacity allocation between MoE experts and Engram memory significantly outperforms pure MoE baseline models [5]
- The Engram module is designed to optimize the balance between neural computation (MoE) and static memory, allowing for improved efficiency and performance in various domains, including general reasoning, coding, and mathematics [5]

Future Developments
- DeepSeek plans to release the next-generation flagship model V4 in February, with preliminary internal tests showing its programming capabilities surpass existing top models [6]
- The V4 model is anticipated to be a focal point in the industry, especially following the success of the V3 model released at the end of 2024, which outperformed OpenAI's GPT-5 and Google's Gemini 3.0 Pro in several benchmark tests [6]
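The "U-shaped scaling law" mentioned above can be restated compactly. The notation below is a sketch under assumptions, since the summaries do not reproduce the paper's formulation; P, ρ, and L(ρ) are symbols introduced here for illustration only.

```latex
% Illustrative notation, not the paper's own. Split a fixed sparse-parameter
% budget P between MoE experts and Engram memory by a fraction \rho:
\[
  P_{\mathrm{MoE}} = (1-\rho)\,P, \qquad P_{\mathrm{mem}} = \rho\,P .
\]
% A U-shaped scaling law then says that, at fixed P and fixed activated
% compute, the loss L(\rho) is minimized at an interior allocation:
\[
  \rho^{*} = \arg\min_{0 \le \rho \le 1} L(\rho), \qquad 0 < \rho^{*} < 1,
  \qquad L(\rho^{*}) < \min\{\, L(0),\; L(1) \,\},
\]
% i.e. a mixed allocation beats both the pure-MoE endpoint (\rho = 0) and the
% pure-memory endpoint (\rho = 1), which is the sense in which the mix
% "significantly outperforms pure MoE baselines."
```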
DeepSeek Releases New Paper Co-Authored by Liang Wenfeng
Xin Hua Wang Cai Jing· 2026-01-13 03:52
Core Insights
- DeepSeek released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" on the evening of the 12th, co-authored with Peking University and with Liang Wenfeng listed among the authors [1]
- The paper introduces conditional memory, which significantly enhances model performance in knowledge retrieval, reasoning, coding, and mathematical tasks under equal parameter and computational conditions [1]
- DeepSeek has also open-sourced a related memory module called Engram [1]
Co-Authored by Liang Wenfeng: DeepSeek Releases a New Paper
Di Yi Cai Jing Zi Xun· 2026-01-13 03:41
Core Insights
- DeepSeek has released a new paper focusing on the conditional memory module of large models, suggesting it will be a core modeling primitive in the next generation of sparse large models [2][5][7]

Group 1: Research and Development
- The new paper, co-authored with Peking University, is titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" [5]
- The research identifies two distinct tasks within large models: deep dynamic computation for combinatorial reasoning and static knowledge retrieval, highlighting inefficiencies in the current Transformer architecture [5][6]
- DeepSeek introduces conditional memory as a supplementary sparse dimension to optimize the balance between neural computation (MoE) and static memory (Engram) [6][7]

Group 2: Performance and Implications
- The team discovered a U-shaped scaling law indicating that a mixed sparse capacity allocation between MoE experts and Engram memory significantly outperforms pure MoE baseline models [6]
- The introduction of the memory module not only aids knowledge retrieval but also shows significant improvements in general reasoning, coding, and mathematical tasks [6][7]
- The paper essentially proposes a "division of labor" optimization for large models, allowing specialized modules to handle specific tasks more efficiently [6][7]

Group 3: Future Developments
- Industry speculation suggests that the proposed conditional memory may be part of the technical architecture of DeepSeek's upcoming flagship model, DeepSeek V4, expected to be released around February [7]
- Initial tests indicate that V4 may surpass other leading models in programming capabilities, with the previous V3 model having already outperformed OpenAI's GPT-5 and Google's Gemini 3.0 Pro in various benchmarks [7]
A New DeepSeek Paper! Next-Generation Large Models Get "Memory Separation"; Is V4 Near?
Di Yi Cai Jing Zi Xun· 2026-01-13 03:32
Core Insights
- DeepSeek has released a new paper focusing on the conditional memory module of large models, suggesting it will be a core modeling primitive in the next generation of sparse large models [1][4]

Group 1: Research Findings
- The new paper, co-authored with Peking University, is titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" and highlights the need for a native knowledge retrieval mechanism in existing Transformer architectures [4]
- The research identifies two distinct tasks in large models: deep dynamic computation for combinatorial reasoning and static knowledge retrieval, indicating that current models inefficiently simulate retrieval processes [4][5]
- DeepSeek introduces conditional memory as a supplementary dimension of sparsity, optimizing the trade-off between mixture of experts (MoE) and static memory (Engram) [4][6]

Group 2: Performance Improvements
- The team discovered a U-shaped scaling law, showing that a mixed sparse capacity allocation between MoE experts and Engram memory significantly outperforms pure MoE baseline models [5]
- The introduction of the memory module not only aids knowledge retrieval but also yields notable improvements in general reasoning, coding, and mathematical tasks [5][6]
- The paper essentially proposes a "division of labor" optimization for large models, allowing specialized modules to handle specific tasks and thereby enhancing efficiency and resource allocation (see the usage example after this summary) [6]

Group 3: Future Developments
- Industry speculation suggests that the proposed conditional memory may be integral to the architecture of DeepSeek's upcoming flagship model, DeepSeek V4, expected to be released around February [6]
- Initial tests indicate that V4 may surpass other leading models in programming capabilities, with the previous model, V3, having already outperformed OpenAI's GPT-5 and Google's Gemini 3.0 Pro in various benchmarks [6]
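To illustrate the "division of labor" point above, the hypothetical ConditionalMemorySketch from the earlier code sketch can be run on dummy inputs. The numbers are arbitrary; the takeaway is only that almost all parameters live in the lookup table while each token reads a single slot, which is one plausible reading of how a memory branch adds capacity without adding much per-token compute.

```python
# Usage of the hypothetical ConditionalMemorySketch defined earlier; all sizes
# and numbers are arbitrary examples, not figures from the paper.
import torch

torch.manual_seed(0)
model = ConditionalMemorySketch(d_model=256, num_slots=1 << 16)
hidden = torch.randn(2, 8, 256)               # (batch, seq, d_model) activations
token_ids = torch.randint(0, 50_000, (2, 8))  # (batch, seq) token ids
out = model(hidden, token_ids)
print(out.shape)                              # torch.Size([2, 8, 256])

# Parameter split: the table dominates the count, yet each token touches one slot.
mem_params = sum(p.numel() for p in model.memory.parameters())
total_params = sum(p.numel() for p in model.parameters())
print(f"memory share of parameters: {mem_params / total_params:.1%}")
```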
DeepSeek Releases New Paper Co-Authored by Liang Wenfeng
Zheng Quan Shi Bao· 2026-01-13 03:27
Core Viewpoint
- DeepSeek released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," which introduces conditional memory to enhance model performance in various tasks under equal parameters and computational conditions [1]

Group 1
- The paper was co-authored by Peking University and DeepSeek, with Liang Wenfeng listed as a co-author [1]
- Conditional memory is proposed to significantly improve model performance in knowledge retrieval, reasoning, coding, and mathematical tasks [1]
- DeepSeek has open-sourced a related memory module called Engram [1]
DeepSeek Releases New Paper Co-Authored by Liang Wenfeng
Zheng Quan Shi Bao· 2026-01-13 03:02
Core Insights
- DeepSeek released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" on the evening of the 12th [1]
- The paper was co-authored by Peking University and DeepSeek, with Liang Wenfeng listed as a co-author [1]
- The concept of conditional memory is introduced, which significantly enhances model performance in knowledge retrieval, reasoning, coding, and mathematical tasks under equal parameters and computational conditions [1]
- DeepSeek has also open-sourced a related memory module named Engram [1]

Company and Industry Summary
- The collaboration between DeepSeek and Peking University highlights the growing trend of partnerships between academia and industry in advancing AI technologies [1]
- The introduction of scalable lookup structures in large language models represents a significant innovation in the field, potentially leading to improved efficiency and effectiveness in AI applications [1]
- The open-sourcing of the Engram memory module may encourage further research and development in conditional memory systems, fostering a more collaborative environment in AI advancements [1]
AI Applications Stay Hot! Yonyou Network Hits the Daily Limit; Software 50 ETF (159590) Stages a Deep-V Rebound of Over 1.3%, Draws Net Subscriptions of Over 120 Million in Morning Trading After Taking In 143 Million Yuan Yesterday! Institutions: 2026 Is the First Year of AI Application Investment
Sou Hu Cai Jing· 2026-01-13 02:49
Group 1
- The core viewpoint of the news highlights the ongoing enthusiasm in the A-share software sector, with significant inflows into the Software 50 ETF, which rebounded over 1.3%, attracted 143 million yuan in funds yesterday, and drew net subscriptions of over 120 million in this morning's session [1][4]
- The AGI-Next summit discussed the shift in large model competition from the "Chat" phase to the "Agent" phase, emphasizing the importance of executing complex tasks in real environments, with 2026 predicted to be the year of commercial value realization [4][6]
- Major holdings within the Software 50 ETF showed strong performance, with notable gains such as Zhongke Xingtou rising over 10%, Yonyou Network hitting the daily limit, and Wanxing Technology increasing over 9% [4][5]

Group 2
- The AI industry is experiencing a surge in interest, with significant developments in capital, applications, and technology, as evidenced by the strong market performance of leading general large model companies that recently went public [6]
- The upcoming release of DeepSeek's flagship model V4 is expected to intensify competition among first-tier general large models, with a focus on AI-assisted programming tools that are anticipated to achieve large-scale commercialization [6][7]
- Analysts predict that 2026 will be the first year of AI application investment, citing continuous improvements in model capabilities, decreasing computing costs, and accelerating monetization of AI applications [7][8]
Wang Xing, Zhang Yiming, and Liang Wenfeng Share One Common Trait
Sou Hu Cai Jing· 2026-01-13 02:48
Group 1
- DeepSeek has launched a new open-source architecture module called Engram, which is speculated to be the core technology for its next-generation model V4 [2]
- Founder Liang Wenfeng maintains a low-profile approach, focusing on product and technology rather than public appearances [2]
- Liang Wenfeng is compared to other successful tech entrepreneurs like Wang Xing and Zhang Yiming, who also exhibit a humble demeanor despite their achievements [2][4]

Group 2
- Wang Xing, the leader of Meituan, does not have an independent office and prefers to work alongside employees, reflecting a down-to-earth attitude [4]
- Zhang Yiming, despite being based in Singapore, remains engaged with AI research and maintains a student-like curiosity towards technology [6]
- The article highlights the common trait among these young entrepreneurs of staying grounded and practical in their respective fields, showing resilience against competition [6]
The New "Yi Zhong Tian" Trio Stays Strong: Yidian Tianxia and Tianlong Group Notch a Third Consecutive Limit-Up, Chinese Online Up Over 15%
Ge Long Hui· 2026-01-13 01:53
Group 1
- The A-share market's AI application sector continues to perform strongly, with companies such as Di'an Diagnostics, Yidian Tianxia, and Tianlong Group hitting the 20% daily limit up and marking their third consecutive limit-up session [1]
- The AGI-Next summit, initiated by a key laboratory at Tsinghua University, highlights a shift in large model competition from "Chat" to "Agent," focusing on executing complex tasks in real environments [1]
- The adoption of AI in the healthcare sector is accelerating, with Ant Group's "Antifufu" transforming into an AI health partner and quickly entering the top 3 of the Apple App Store, indicating strong consumer demand for integrated healthcare services [1]

Group 2
- Citic Construction Investment Securities emphasizes that as model capabilities improve and the costs of reasoning and long-context tasks decrease, downstream AI application scenarios are rapidly entering the commercialization verification phase, particularly in search & marketing, coding, multimodal, Agent, and AI for Science [2]
- Key AI-sector stocks show significant year-to-date gains, with Yidian Tianxia up 87.36%, Di'an Diagnostics up 97.29%, and Tianlong Group up 82.26% [3]