New DeepSeek Model Revealed? "MODEL1" Appears in the Open-Source Community

Core Insights
- DeepSeek has updated its FlashMLA code on GitHub, revealing the previously undisclosed "MODEL1" identifier, which may point to a new model distinct from the existing "V32" [3][4]
- The company launched an "open source week" in February 2025, releasing five codebases over the course of the week, with FlashMLA as the first project [4]
- FlashMLA optimizes memory access and computation on Hopper GPUs, significantly improving the efficiency of variable-length sequence processing, particularly for large language model inference [4]

Company Developments
- DeepSeek's upcoming AI model, DeepSeek V4, is expected to be released around the Lunar New Year in February 2026, although the timeline may shift [4]
- The V4 model iterates on the V3 model released in December 2024 and is reported to have programming capabilities surpassing current leading models such as Anthropic's Claude and OpenAI's GPT series [5]
- Since January 2026, DeepSeek has published two technical papers, one introducing a new training method called "optimized residual connections" (mHC) and the other a biologically inspired "AI memory module" (Engram) [5]

Industry Context
- The Engram module aims to improve knowledge retrieval and general reasoning, addressing inefficiencies in the Transformer architecture [5]
- Backing from Liang Wenfeng's private equity firm, which posted a 56.55% average return in 2025, has bolstered DeepSeek's research and development efforts [5]
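The article does not show FlashMLA's actual interface, but the efficiency gain it describes for variable-length sequences comes largely from avoiding padding. As a rough, hypothetical illustration (not DeepSeek's code), the sketch below shows the packed-sequence layout with cumulative offsets (often called `cu_seqlens`) that variable-length attention kernels commonly use: all sequences share one flat buffer, and an offset array locates each sequence, so no compute or memory is spent on pad tokens.

```python
import numpy as np

def pack_sequences(seqs):
    """Pack variable-length sequences into one flat buffer.

    Returns the packed array and cu_seqlens, the cumulative
    sequence-length offsets a varlen kernel would use to find
    each sequence without any padding.
    """
    cu_seqlens = np.zeros(len(seqs) + 1, dtype=np.int64)
    cu_seqlens[1:] = np.cumsum([len(s) for s in seqs])
    packed = np.concatenate(seqs)
    return packed, cu_seqlens

def unpack(packed, cu_seqlens, i):
    """Recover sequence i from the packed buffer."""
    return packed[cu_seqlens[i]:cu_seqlens[i + 1]]

# Three sequences of lengths 3, 5, and 2: padding all of them to the
# max length (5) would allocate 15 slots; packing uses exactly 10.
seqs = [np.arange(3), np.arange(5), np.arange(2)]
packed, cu = pack_sequences(seqs)
print(cu.tolist())   # [0, 3, 8, 10]
print(len(packed))   # 10
```

This is only a data-layout sketch; FlashMLA's contribution is the Hopper-specific kernel that consumes such a layout efficiently during multi-head latent attention decoding.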
