Core Insights
- DeepSeek has generated significant buzz in the AI community with the unexpected exposure of a new model named Model1 during a code update, suggesting a potential new technological path distinct from the existing V3 series [1][6][8]
- Speculation is rife that DeepSeek is preparing to launch its next-generation AI model, V4, around mid-February, following a year of iterative improvements to the V3 model [3][8]

Model Development Timeline
- On March 25, 2025, DeepSeek released V3-0324, enhancing code generation usability and surpassing GPT-4.5 in mathematical and coding capabilities [4]
- On May 29, 2025, the R1 model underwent a minor upgrade, improving performance in mathematics, programming, and general logic, with hallucination rates reduced by 45-50% [4]
- On August 21, 2025, DeepSeek V3.1 was launched, offering faster response times and stronger agent capabilities, along with support for Anthropic's API [4]
- On September 22, 2025, the V3.1-Terminus version was released, addressing issues with mixed-language inputs and enhancing the performance of Code and Search Agents [4]
- On September 29, 2025, the V3.2-Exp version introduced a new attention mechanism, with updated API pricing structures [4]
- On December 1, 2025, the official V3.2 version was released, achieving inference capabilities comparable to GPT-5 and integrating thinking modes for tool usage [4][9]

Research Contributions
- Two papers authored by Liang Wenfeng were published between late December 2025 and early January 2026, addressing training stability and knowledge retrieval efficiency in large model architectures [5][10]
- The first paper proposed a manifold-constrained hyper-connections framework to enhance training stability by constraining residual connections within a specific manifold [10][11]
- The second paper introduced a conditional memory module that improves inference and knowledge task performance by decoupling knowledge storage from neural computation [10][11]

Market Expectations
- The AI community is eagerly anticipating whether DeepSeek will unveil the new Model1 or V4 during the upcoming Spring Festival, with expectations of a significant impact on the global AI landscape [6][8]
Rumor: DeepSeek Reveals a New Model. Is Liang Wenfeng Playing Another "Trump Card"?