If DeepSeek Doesn't Release V4, the "Six Little Dragons" Won't Dare Celebrate the New Year
36Ke· 2026-02-12 00:26
Core Insights
- DeepSeek is evolving beyond being just a "chatbot" base and is optimizing its large model's energy efficiency through architectural innovations, as evidenced by the recent release of new models and frameworks [1][3]
- The competitive landscape is intensifying, and DeepSeek's new models are crucial for maintaining its industry position against major players such as Google and OpenAI [1][2]

Group 1: Technological Developments
- In January 2024, DeepSeek released the Engram architecture, which separates "conditional memory" from computation, aiming to reduce errors and save computational power [3]
- The new model, referred to as MODEL1, is speculated to be either a lightweight model suitable for edge devices or a "long-sequence expert" designed for processing lengthy documents or code [3]
- DeepSeek's commitment to cost-effective AI is evident in its push to lower token costs, making AI development accessible to a broader range of developers [4]

Group 2: Market Position and Competition
- Releasing new models is seen as essential for DeepSeek to avoid falling behind competitors such as Gemini 3 and GPT-5, which have demonstrated superior performance across benchmarks [7][8]
- Despite DeepSeek's strong position in the open-source community, the company faces pressure from the rapid advances of closed-source models, which could erode developer loyalty [10][11]
- Competitive dynamics are shifting as major internet companies increase their AI investments, potentially affecting DeepSeek's market share and the broader landscape for domestic AI companies [13][14]

Group 3: Ecosystem and Community Impact
- DeepSeek's open-source models, such as DeepSeek-V3 and R1, have gained significant traction, accounting for over half of open-source token throughput within a short period [8][9]
- The company has built a decentralized, pragmatic technical ecosystem that attracts developers interested in self-controlled and private deployments [4][6]
- Ongoing developments in the open-source AI community are reshaping the narrative around Chinese AI capabilities, with DeepSeek playing a pivotal role in this transformation [5][6]
SASAC Pushes Central SOEs to Expand Effective Computing-Power Investment; DeepSeek Model Updated
Xin Lang Cai Jing· 2026-02-11 23:57
Market Dynamics
- The State Council, led by Premier Li Qiang, emphasized cultivating new productive forces through the integration of artificial intelligence across industries, aiming for high-quality development [1]
- The State-owned Assets Supervision and Administration Commission (SASAC) urged central enterprises to expand effective investment in computing power and promote synergy between computing power and electricity, strengthening the foundation of the AI industry [1]

Company Developments
- Meta Platforms Inc. plans to invest over $10 billion to build a data center in Lebanon, Indiana, covering 4 million square feet and expected to be operational by late 2027 or early 2028, creating 300 long-term jobs and supporting over 4,000 construction workers [3]
- Shanghai Suiruan Technology Co., Ltd., which has focused on cloud AI chip design since its founding in 2018, has had its IPO review status on the Sci-Tech Innovation Board changed to "inquired" [5]
- NetEase reported stable performance for 2025, with Q4 revenue of 27.5 billion yuan, total annual revenue of 112.6 billion yuan, and an operating profit of 35.8 billion yuan, a 21% year-on-year increase [6]
- Newray Co., Ltd. plans to acquire a 70% stake in PCB tool company Huilian Electronics for no more than 700 million yuan, aiming to strengthen its competitiveness in the PCB tool sector [8]
- Zhongji Xuchuang clarified that CSP customers place orders directly with the company, so the company is not bypassed in the supply chain [7]
- Zhongwei Semiconductor plans to establish an IPM production line project in Ziyang, Sichuan, using 121 million yuan of surplus fundraising [10]
It's Here! DeepSeek's New Model | Trial Access Included
Xin Lang Cai Jing· 2026-02-11 13:22
Core Insights
- DeepSeek has released an updated model, significantly enhancing its capabilities [1][3]

Model Enhancements
- Context capacity has been upgraded from 128,000 tokens to 1 million, allowing the model to process content as extensive as the entire "Three-Body Problem" trilogy [9][11]
- The knowledge base has been updated to May 2025, indicating a new foundational model, potentially to be called DeepSeek V4 [9][14]

Performance Improvements
- Frontend and coding capabilities have improved substantially and are now comparable to top competitors such as Gemini 3 Pro and K2.5 [10][12]
- The language style has become livelier and more authentic, reducing inaccuracies and improving user interaction [10][13]

Limitations
- The model remains a pure text model and does not support visual understanding, handling only text and voice inputs [14][15]
DeepSeek Releases Model Update, Supporting Context Lengths of Up to 1 Million (1M) Tokens
Xin Lang Cai Jing· 2026-02-11 11:35
Core Viewpoint
- DeepSeek has released a version update that supports a maximum context length of 1 million tokens, but multimodal capabilities are not yet enabled [1][2]

Group 1: Version Update
- The update, rolled out to both the web and app versions of DeepSeek, allows a context length of up to 1 million tokens [1][2]
- As of now, the updated version does not support multimodal capabilities [1][2]

Group 2: Future Developments
- Reports suggest a minor update to the V3 series model is expected around the Spring Festival [1][2]
- DeepSeek's next flagship model is anticipated to be a trillion-parameter foundational model, but the significant increase in scale has slowed training and delayed the release [1][2]
DeepSeek Appears to Have Been Updated: Context Surges to 1 Million, Knowledge Base
Guan Cha Zhe Wang· 2026-02-11 11:24
Group 1
- The article highlights significant updates to the DeepSeek AI model, particularly its context processing capability and knowledge-base freshness [1][3]
- After updating to version 1.7.4, DeepSeek claims a context processing capacity of 1M tokens, enough to handle the entire "Three-Body" trilogy [1]
- The latest released version, DeepSeek V3.2 (December 1, 2025), showed an 8-fold increase in context capability, to 128K, compared with its predecessor [3]

Group 2
- DeepSeek's knowledge base has been updated to reflect information through May 2025, improving its coverage of significant events and technological advancements from late 2024 and early 2025 [3]
- DeepSeek currently does not support multimodal capabilities, a potential area for future development [3]
- The official DeepSeek team has made no public announcement or response regarding these updates [3]
Is DeepSeek's New Model Here?
Hua Er Jie Jian Wen· 2026-02-11 11:21
Core Insights
- DeepSeek is advancing its new model version through a grayscale test, potentially the final version before the official V4 launch [1]
- The V4 model is expected to be released in mid-February 2026, and it is not expected to replicate the global AI computing-demand panic seen during the V3 launch [2]
- The core value of V4 lies in driving the commercialization of AI applications through underlying architectural innovation rather than disrupting the existing AI value chain [2]

Model Enhancements
- The context length has been expanded from 128K to 1M, nearly a tenfold increase, and the knowledge base has been updated to May 2025 [1]
- V4 is expected to introduce two innovative technologies, mHC and Engram, which aim to overcome computing-chip and memory bottlenecks [2][8]
- Initial internal tests indicate that V4 outperforms models such as Anthropic's Claude and OpenAI's GPT series on programming tasks [2]

Technical Innovations
- mHC (Manifold Constraint Hyperconnection) addresses bottlenecks in information flow and training instability in deep Transformer models, enriching and adding flexibility to the communication between neural network layers [4]
- Engram is a "conditional memory" module that decouples memory from computation, allowing static knowledge to be stored in a sparse memory table and freeing expensive GPU memory for dynamic calculations [6]

Cost Efficiency and Market Impact
- The introduction of mHC and Engram is expected to significantly reduce training and inference costs, stimulating downstream application demand and initiating a new cycle of AI infrastructure development [8]
- The report suggests Chinese AI hardware manufacturers may benefit from increased demand and investment driven by these cost optimizations [8]

Market Dynamics
- The market has shifted from a single dominant player to more fragmented competition, with DeepSeek's market share declining as more players enter the field [9][11]
- DeepSeek's efficiency in computing management and its performance improvements are accelerating the development of Chinese large language models and applications, altering the global competitive landscape [11]

Opportunities for Software Companies
- Major global cloud service providers are actively pursuing general artificial intelligence, and the capital expenditure race continues [12]
- If V4 maintains high performance while significantly lowering training and inference costs, it will help developers convert technology into revenue more quickly, easing profit pressures [12]
- V4's enhanced capabilities are expected to enable more powerful AI agents, transforming them from mere conversational tools into capable assistants that handle complex tasks [12]
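The "conditional memory" idea attributed to Engram, where static knowledge lives in a cheap sparse table while the accelerator does only dynamic computation, can be illustrated with a toy sketch. This is purely an illustrative assumption, not DeepSeek's actual design: all class and variable names here are hypothetical.

```python
import numpy as np

# Toy sketch of decoupling "conditional memory" from computation
# (an illustrative assumption, not DeepSeek's actual Engram implementation).

class SparseMemoryTable:
    """Static knowledge kept in a large, cheap memory pool (e.g. CPU RAM)."""
    def __init__(self, num_entries, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.standard_normal((num_entries, dim)).astype(np.float32)

    def lookup(self, ids):
        # Only the handful of rows needed per step are fetched,
        # so the accelerator never holds the full table.
        return self.table[ids]

class TinyModel:
    """Dynamic computation: one linear layer standing in for the network."""
    def __init__(self, dim, seed=1):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((dim, dim)).astype(np.float32)

    def forward(self, hidden, memory_rows):
        # Retrieved static knowledge is mixed into the activations and
        # transformed; compute stays small while memory stays external.
        return (hidden + memory_rows.mean(axis=0)) @ self.w

memory = SparseMemoryTable(num_entries=100_000, dim=64)
model = TinyModel(dim=64)
hidden = np.zeros(64, dtype=np.float32)
out = model.forward(hidden, memory.lookup([3, 17, 42]))
print(out.shape)  # (64,)
```

The point of the sketch is the separation of concerns: the lookup cost scales with the few rows retrieved, not with the table size, which is the intuition behind freeing GPU memory for dynamic calculations.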
DeepSeek Updates Its Model, Able to Process Ultra-Long Texts in One Pass
Xin Lang Cai Jing· 2026-02-11 11:13
Core Insights
- DeepSeek has updated its web and app versions to support a maximum context length of 1 million tokens, significantly enhancing its ability to process long texts [1][2]
- The previous version, DeepSeek V3.1, had a context length of 128,000 tokens, making the latest update a substantial improvement [1]
- DeepSeek successfully processed a document of more than 240,000 tokens, demonstrating its ability to recognize and handle extensive content [2]
- A minor update to the V3 series was reportedly expected around the Spring Festival, but the major advancements are still forthcoming [2]
- DeepSeek's next flagship model is anticipated to be a trillion-parameter foundational model, although the increase in scale has slowed training and delayed the release timeline [2]
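The 240,000-token document test above hinges on knowing roughly how many tokens a text consumes relative to the window size. As a hedged sketch, the following uses a crude characters-per-token heuristic (an assumption for ballpark sizing only, not DeepSeek's actual tokenizer) to pre-check whether a document fits the old 128K or the new 1M window:

```python
# Rough pre-flight check for a long-context request.
# The chars-per-token ratio is a coarse heuristic (an assumption,
# not DeepSeek's tokenizer); real token counts vary by language.

CONTEXT_LIMIT = 1_000_000   # newly claimed window
OLD_LIMIT = 128_000         # previous V3.1 window

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate based on character count."""
    return int(len(text) / chars_per_token)

def fits(text: str, limit: int = CONTEXT_LIMIT) -> bool:
    """True if the estimated token count fits within the given window."""
    return estimate_tokens(text) <= limit

doc = "x" * 960_000  # ~240k estimated tokens, like the tested document
print(estimate_tokens(doc), fits(doc), fits(doc, OLD_LIMIT))
# → 240000 True False
```

A document of that size would thus overflow the old 128K window by nearly 2x while using under a quarter of the new 1M window, which matches the scale of improvement the article describes.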
DeepSeek Suddenly Tests a New Model; Is a Big Spring Festival Release Coming?
Feng Huang Wang· 2026-02-11 10:52
Core Insights
- The recent DeepSeek upgrade does not include multimodal visual understanding, focusing instead on pure text and voice interaction [2]
- The core context window has been increased from 128K to 1M tokens, allowing the model to process long texts equivalent to the "Three-Body Problem" trilogy in a single pass and positioning it against international competitors such as GPT-5 and Gemini 3 Pro [2]
- The current model's knowledge base can accurately describe news events through April 2025, with the knowledge cutoff now set to May 2025 [2]

User Experience and Development
- Feedback from developers and early users indicates that the new model's language style has become "enthusiastic and nuanced," with front-end response quality rated comparable to Claude 3.5 Sonnet, suggesting a focus on the user interaction experience [5]
- The company has been actively hiring for multiple core technical positions, including deep learning researchers and engineers, indicating a commitment to advancing its large language model (LLM) capabilities [5]
- Industry speculation holds that the current version may correspond to the rumored "DeepSeek V4" or an enhanced V3.2 series, although the official version name has not yet been disclosed [5]
DeepSeek Suddenly Tests a New Model, With Context Now at the Million-Token Level
Feng Huang Wang· 2026-02-11 10:37
Core Insights
- DeepSeek has rolled out a key update with a significant architectural enhancement, moving from a 128K context window to 1M tokens and enabling it to process long texts on par with international products such as GPT-5 and Gemini 3 Pro [1]
- The model's knowledge base has been updated to include information up to May 2025, and it can accurately describe news events through April 2025 [1]
- User feedback indicates the new model has a more "enthusiastic and nuanced" language style, enhancing the interaction experience [1]

Group 1
- DeepSeek has begun gray testing the updated model on both web and app platforms [1]
- The new context window allows the model to handle the entire "Three-Body" trilogy in a single pass [1]
- The upgrade does not include multimodal visual understanding, focusing instead on text and voice interaction [1]

Group 2
- DeepSeek has been actively hiring for multiple core technical positions, including deep learning researchers and engineers, signaling a focus on advancing its large language model (LLM) capabilities [2]
- The company is recruiting through various channels, including campus recruitment and internships [2]
- There is speculation that the version under test may correspond to the previously rumored "DeepSeek V4" or an enhanced V3.2 [2]
Huawei Cloud's "CodeArts" (码道) Code Agent Enters Public Beta, Supporting GLM-4.7 and DeepSeek-V3.2
Xin Lang Cai Jing· 2026-02-11 10:32
Core Insights
- Huawei Cloud officially launched "CodeArts," an AI-powered coding assistant, in January 2023; it integrates an IDE, an autonomous development mode, and code-library indexing, and is currently in a public beta open to 10,000 users [1][8]
- The personal version of "CodeArts" is free for developers, while an enterprise version will be announced later [1][8]
- The product is built on the GLM-4.7 and DeepSeek-V3.2 models and supports the JetBrains series and Visual Studio Code IDEs [1][8]

Product Features
- "CodeArts" combines core programming capabilities such as project-level code generation, code continuation, research knowledge Q&A, and unit test case generation, significantly boosting developer productivity and the coding experience [2][9]
- Users can input requirements and have the AI generate code directly [3][11]

Copyright and Usage
- Huawei Cloud states that copyright in AI-generated code belongs to the user, emphasizing that "CodeArts" is a tool responding to user input without creative autonomy [5][13]