DeepSeek's New AI Model Revealed: Built on the All-New MODEL1 Architecture, Launching as Early as February

Core Insights

- DeepSeek plans to launch its next-generation flagship AI model, DeepSeek V4, around mid-February, during the Lunar New Year. The model is expected to significantly improve coding capabilities and is drawing industry attention [1][2]

Group 1: Model Development

- The release of DeepSeek V4 follows the one-year anniversary of the DeepSeek-R1 model. Developers found FlashMLA-related updates spanning 114 files, including 28 references to an unknown "MODEL1" identifier, which likely points to a new AI model with a different architecture [1][2]
- The new architecture optimizes key technical aspects such as key-value (KV) cache layout, sparsity handling, and decoding support for the FP8 data format. These changes target memory usage and computational efficiency, laying the groundwork for performance improvements [3]

Group 2: Research Innovations

- DeepSeek's research team has previously published two technical papers introducing new methods: "optimized residual connections (mHC)" for training and a biologically inspired "AI memory module (Engram)." This suggests that DeepSeek V4 may integrate these research findings to improve its handling of complex tasks [3]
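
To make the memory-usage point about the KV cache and FP8 concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a plain attention layout that caches one key tensor and one value tensor per layer; the function name `kv_cache_bytes` and every dimension below are hypothetical, chosen only for illustration, and do not describe MODEL1, DeepSeek V4, or FlashMLA.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem):
    """Approximate bytes needed to cache keys and values for plain attention."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


# Hypothetical configuration (an assumption, not a real model's dimensions).
cfg = dict(layers=32, kv_heads=8, head_dim=128, seq_len=4096, batch=1)

bf16 = kv_cache_bytes(**cfg, bytes_per_elem=2)  # 16-bit storage
fp8 = kv_cache_bytes(**cfg, bytes_per_elem=1)   # 8-bit storage

print(f"BF16 KV cache: {bf16 / 2**20:.0f} MiB")
print(f"FP8 KV cache:  {fp8 / 2**20:.0f} MiB")
```

Under these assumptions, storing the cache in an 8-bit format halves its footprint compared with a 16-bit format. Real FP8 caches usually also store scaling factors, so the actual saving would likely be slightly smaller.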
