DeepSeek's New Model "MODEL1" Comes to Light
Jin Rong Jie · 2026-01-20 23:59
Core Insights
- A new DeepSeek model identified as "MODEL1" came to light on the first anniversary of DeepSeek-R1's release [1]
- DeepSeek updated its FlashMLA code on GitHub; MODEL1 is mentioned 28 times across 114 files, which suggests it is a distinct architecture from "V32", the identifier used for DeepSeek-V3.2 [1]
- The code differs between the two in KV cache layout, sparsity handling, and FP8 decoding, pointing to optimizations in memory usage [1]
- DeepSeek is reported to be planning the release of its next-generation flagship model around mid-February, coinciding with the Chinese New Year [1]