DeepSeek Flagship Model

Mei Ri Jing Ji Xin Wen · 2026-02-12 22:23

Core Insights
- DeepSeek has begun gray (limited-rollout) testing of its flagship model, which now supports a context length of up to 1 million tokens, up from 128K tokens in V3.1, released in August 2025 [1]
- User reactions to the update are mixed. Some users are dissatisfied with the model's changed tone and interaction style, and its perceived coldness has become a trending topic on social media [1][4]

Group 1: Model Updates and Features
- The latest version can process extremely long texts, as demonstrated by its handling of a document of more than 240,000 tokens [1]
- DeepSeek V4 is expected to be released in mid-February 2026; the version currently in testing is a speed-optimized variant that sacrifices some quality for performance testing [6]
- DeepSeek's V series models are designed for optimal performance, and V3 marked a significant milestone because of its efficient MoE (Mixture-of-Experts) architecture [6]

Group 2: User Feedback and Reactions
- Users have criticized the new version as impersonal: it addresses them as "users" rather than by personalized nicknames, which makes the model feel less engaging [4]
- Some users describe the updated model as overly simplistic and lacking emotional depth, comparing its output unfavorably to older literary styles [4]
- Conversely, a segment of users appreciates the model's newfound objectivity and rationality, noting that it appears more attuned to the psychological state of the questioner [5]

Group 3: Technical Innovations
- DeepSeek has introduced two new architectures. mHC optimizes information flow in deep Transformers, improving stability and scalability without increasing computational load. Engram decouples static knowledge from dynamic computation [7]
- These innovations aim to significantly reduce the cost of long-context reasoning while maintaining performance [7]
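
To put the context-length figures reported above side by side, here is a minimal sketch of the arithmetic. It is not DeepSeek's API or tooling. The function name `fits` and the `reserve_for_output` parameter are hypothetical, "128K" is assumed to mean 128,000 tokens (the conclusion is the same if it means 131,072), and 240,000 is used as a lower bound for the "over 240,000 tokens" document.

```python
# Illustrative arithmetic only; the window sizes are the figures quoted in the article.
CONTEXT_WINDOWS = {
    "V3.1 (128K)": 128_000,            # assumption: "128K" read as 128,000 tokens
    "gray-test build (1M)": 1_000_000,
}

def fits(doc_tokens: int, window: int, reserve_for_output: int = 0) -> bool:
    """Return True if the document plus any tokens reserved for the reply fits in the window."""
    return doc_tokens + reserve_for_output <= window

doc_tokens = 240_000  # lower bound for the long document mentioned in the article

for name, window in CONTEXT_WINDOWS.items():
    verdict = "fits" if fits(doc_tokens, window) else "does not fit"
    share = doc_tokens / window  # fraction of the window the document would occupy
    print(f"{name}: {verdict} (document is {share:.0%} of the window)")
```

Under these assumptions the document is nearly twice the size of the older window but occupies roughly a quarter of the 1-million-token window, which is consistent with the report that only the new version can take it in a single pass.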