Is DeepSeek's New Model Here?

Core Insights
- DeepSeek is running a grayscale (limited-rollout) test of a new model version, potentially the last one before the official V4 launch [1]
- V4 is expected to be released in mid-February 2026, and the report does not expect it to trigger the kind of global panic over AI computing demand seen at the V3 launch [2]
- The core value of V4 lies in driving the commercialization of AI applications through innovations in the underlying architecture, rather than in disrupting the existing AI value chain [2]

Model Enhancements
- The model's context length has been expanded from 128K to 1M tokens, roughly an eightfold increase, and its knowledge cutoff has been updated to May 2025 [1]
- V4 is expected to introduce two new technologies, mHC and Engram, which aim to overcome compute-chip and memory bottlenecks [2][8]
- Initial internal tests indicate that V4 outperforms models such as Anthropic's Claude and OpenAI's GPT series on programming tasks [2]

Technical Innovations
- mHC (Manifold-Constrained Hyper-Connections) addresses information-flow bottlenecks and training instability in deep Transformer models, making communication between neural network layers richer and more flexible [4]
- Engram is a "conditional memory" module that decouples memory from computation: static knowledge is stored in a sparse memory table, which frees expensive GPU memory for dynamic computation [6]

Cost Efficiency and Market Impact
- mHC and Engram are expected to significantly reduce training and inference costs, stimulating downstream application demand and starting a new cycle of AI infrastructure build-out [8]
- The report suggests that Chinese AI hardware manufacturers may benefit from the increased demand and investment that these cost optimizations bring [8]

Market Dynamics
- The market has shifted from a single dominant player to more fragmented competition, with DeepSeek's market share declining as more players enter the field [9][11]
- DeepSeek's gains in compute efficiency and model performance are accelerating the development of Chinese large language models and applications, altering the global competitive landscape [11]

Opportunities for Software Companies
- Major global cloud service providers are actively pursuing artificial general intelligence (AGI), and the capital expenditure race continues [12]
- If V4 can maintain high performance while significantly lowering training and inference costs, it will help developers convert technology into revenue more quickly, easing pressure on profits [12]
- V4's enhanced capabilities are expected to enable more powerful AI agents, turning them from mere conversational tools into capable assistants that can handle complex tasks [12]
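
The "conditional memory" idea attributed to Engram under Technical Innovations can be illustrated with a toy lookup table. This is only a sketch under stated assumptions: the class `SparseMemoryTable`, the n-gram hashing scheme, the `augment` function, and the additive way retrieved vectors are combined with hidden states are all hypothetical illustrations, not DeepSeek's implementation, which the report does not describe.

```python
from typing import Dict, List, Sequence, Tuple

DIM = 4  # toy vector width (hypothetical)


class SparseMemoryTable:
    """Toy stand-in for a static-knowledge table kept outside GPU memory."""

    def __init__(self, num_slots: int) -> None:
        self.num_slots = num_slots
        # Only slots that were written are materialised, hence "sparse".
        self.rows: Dict[int, List[float]] = {}

    def _slot(self, ngram: Tuple[int, ...]) -> int:
        # Deterministic hash of a token-id n-gram into a slot index.
        h = 0
        for tok in ngram:
            h = (h * 1_000_003 + tok) % self.num_slots
        return h

    def write(self, ngram: Tuple[int, ...], vector: Sequence[float]) -> None:
        self.rows[self._slot(ngram)] = list(vector)

    def read(self, ngram: Tuple[int, ...]) -> List[float]:
        # Unwritten slots return zeros, so they contribute nothing.
        return self.rows.get(self._slot(ngram), [0.0] * DIM)


def augment(
    hidden: Sequence[Sequence[float]],
    tokens: Sequence[int],
    table: SparseMemoryTable,
    n: int = 2,
) -> List[List[float]]:
    """Add looked-up static vectors to the per-token hidden states."""
    out = []
    for i, h in enumerate(hidden):
        # Key the lookup on the n-gram of token ids ending at position i.
        ngram = tuple(tokens[max(0, i - n + 1): i + 1])
        mem = table.read(ngram)
        out.append([a + b for a, b in zip(h, mem)])
    return out
```

The point of the sketch is the split of responsibilities: the table is a pure lookup that needs no matrix multiplication, so in principle it can sit in cheaper memory, while the hidden states stand in for the dynamic computation that stays on the accelerator.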
