Open-Source Models
DeepSeek Fights Its Way Through: The Breakout of Domestic Large Models Is Not Luck
36Kr· 2025-12-03 03:21
As 2025 draws to a close, the technical spotlight of the global large-model race has been largely reclaimed by Google. Gemini 3 Pro burst onto the scene, surpassing every open-source model on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. Doubts quickly resurfaced across the industry: have open-source models hit their ceiling? Has the Scaling Law really run into a wall? A mood of stagnation spread through the open-source community.

But at precisely this moment, DeepSeek chose not to stay silent. On December 1, it released two heavyweight models in one stroke: DeepSeek-V3.2, whose reasoning performance benchmarks against GPT-5, and the Speciale version, which is exceptionally strong in math, logic, and multi-turn tool calling. This was not only a concentrated display of technical capability but also a direct answer to the closed-source camp's "new ceiling," mounted without any advantage in compute resources.

This is no routine model update. DeepSeek is trying to chart a new path for the post-scaling era: How can architectural redesign make up for gaps in pre-training? How can a "chain of thought during tool use" deliver agentic performance with fewer tokens and higher efficiency? And more critically, why have agents gone from an add-on feature to the core engine of model-capability leaps?

This article analyzes three threads: How did DeepSeek break through under technical bottlenecks? Why was it the first in the open-source camp to bet heavily on agents? And does this mean open-source models still have a path through the closed-source moat? Behind all this ...
Strongest Open Source! "Punching GPT-5" and "Kicking Gemini-3.0": Why Has DeepSeek V3.2 Improved So Much?
Hua Er Jie Jian Wen· 2025-12-02 04:21
Core Insights
- DeepSeek has released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with the former achieving performance comparable to GPT-5 and the latter winning gold medals in four international competitions [1][3].

Model Performance
- DeepSeek-V3.2 has reached the highest level of tool-invocation capability among current open-source models, significantly narrowing the gap with closed-source models [2].
- In benchmark tests, DeepSeek-V3.2 achieved a 93.1% pass rate on AIME 2025, closely trailing GPT-5's 94.6% and Gemini-3.0-Pro's 95.0% [20].

Training Strategy
- The model's significant improvement is attributed to a fundamental change in training strategy, moving from simple "direct tool invocation" to a more sophisticated "thinking + tool invocation" mechanism [9][11].
- DeepSeek has constructed a new large-scale data-synthesis pipeline, generating over 1,800 environments and 85,000 complex instructions specifically for reinforcement learning [12].

Architectural Innovations
- The DeepSeek Sparse Attention (DSA) mechanism addresses efficiency bottlenecks in traditional attention, reducing complexity from O(L²) to O(Lk) while maintaining model performance [6][7].
- The architecture allows better context management, retaining relevant reasoning content across tool-related messages and thus avoiding inefficient repeated reasoning [14].

Competitive Landscape
- The release of DeepSeek-V3.2 signals a shift in the competitive landscape: the absolute technical monopoly of closed-source models is being challenged as open-source models gain first-tier competitiveness [20][22].
- This development has three implications: lower costs and greater customization for developers, reduced reliance on overseas APIs for enterprises, and an industry shift from "who has the largest parameters" to "who has the strongest methods" [22].
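The O(L²)→O(Lk) reduction described above comes from each query attending to only the k most relevant keys rather than to all L positions. The following is a minimal single-query NumPy sketch of the general top-k idea only; it is not DeepSeek's actual DSA implementation, whose key-selection mechanism the article does not detail.

```python
import numpy as np

def sparse_attention(q, K, V, k_top):
    """Single-query top-k sparse attention (illustrative sketch).

    Instead of softmax-weighting all L values (O(L^2) over a full
    sequence of queries), each query keeps only its k_top
    highest-scoring keys, giving O(L * k) overall.
    """
    scores = K @ q                                   # (L,) attention logits
    idx = np.argpartition(scores, -k_top)[-k_top:]   # indices of the k largest
    sel = scores[idx]
    w = np.exp(sel - sel.max())                      # stable softmax over
    w /= w.sum()                                     # selected keys only
    return w @ V[idx]                                # mix k values, not L
```

With `k_top` equal to the sequence length this reduces to ordinary dense attention, which is a convenient sanity check when experimenting with the pattern.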
DeepSeek Ships Again! Going Head-to-Head with Google While Admitting the Gap Between Open and Closed Source Has Widened
Di Yi Cai Jing· 2025-12-01 23:13
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, positioned to compete with leading proprietary models like GPT-5 and Gemini 3.0 and showcasing significant advances in reasoning capability [1][4].

Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability and output length, making it suitable for everyday applications such as Q&A and general agent tasks. It has achieved performance comparable to GPT-5 and sits slightly below Google's Gemini 3 Pro in public reasoning tests [4].
- DeepSeek-V3.2-Speciale is designed to push the limits of reasoning, integrating enhanced long-thinking features and theorem-proving abilities from DeepSeek-Math-V2. It has surpassed Gemini 3 Pro on several reasoning benchmarks, including prestigious math competitions [4][5].

Benchmark Performance
- AIME 2025: DeepSeek-V3.2 scored 93.1, while GPT-5 and Gemini-3.0 scored 94.6 and 95.0 respectively [5].
- Harvard-MIT Math Tournament: DeepSeek-V3.2-Speciale scored 92.5, versus Gemini 3 Pro's 97.5 [5].
- International Mathematical Olympiad: DeepSeek-V3.2-Speciale scored 78.3, close to Gemini 3 Pro's 83.3 [5].

Limitations and Future Plans
- Despite these achievements, DeepSeek acknowledges limitations compared to proprietary models, including narrower world knowledge and lower token efficiency. The team plans to enhance pre-training and optimize reasoning chains to improve model performance [6][7].
- DeepSeek has identified three key areas where open-source models lag proprietary ones: reliance on standard attention mechanisms, insufficient computational resources during post-training, and gaps in generalization and instruction following [7].

Technological Innovations
- DeepSeek has introduced a sparse attention mechanism (DSA) to reduce computational complexity without sacrificing long-context performance. This innovation has been integrated into the new models, contributing to significant performance improvements [7].

Availability
- The official website, app, and API have been updated to DeepSeek-V3.2, while the enhanced Speciale version is currently available only through a temporary API for community evaluation [8].

Community Reception
- The release has been positively received on social media, with users noting that DeepSeek's models have effectively matched the capabilities of GPT-5 and Gemini 3 Pro, highlighting the importance of rigorous engineering design over sheer parameter size [9].
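The context-management point above, keeping the model's reasoning trace in context while tool messages come and go so nothing has to be re-derived, can be sketched as a simple message-loop pattern. In this sketch, `think` and `call_tool` are hypothetical stand-ins for the model and the tool runtime; it illustrates the general pattern the article describes, not DeepSeek's actual API or message schema.

```python
def run_agent_turn(messages, think, call_tool):
    """One 'thinking + tool invocation' turn (illustrative sketch).

    The model emits a reasoning trace plus a tool call; the reasoning
    is retained in the message history alongside the call, so when the
    tool result arrives the model does not repeat that reasoning.
    """
    thought, tool_name, tool_args = think(messages)
    # Keep the reasoning in context instead of dropping it at the call.
    messages.append({
        "role": "assistant",
        "thinking": thought,
        "tool_call": {"name": tool_name, "args": tool_args},
    })
    result = call_tool(tool_name, tool_args)
    messages.append({"role": "tool", "content": result})
    return messages
```

The alternative (discarding `thought` before appending the tool call) forces the model to reconstruct its plan on every tool round-trip, which is the "inefficient repeated reasoning" the summary refers to.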
Strongest Open Source! "Punching GPT-5" and "Kicking Gemini-3.0": Why Has DeepSeek V3.2 Improved So Much?
Mei Gu IPO· 2025-12-01 22:29
V3.2's tool-calling capability has reached the highest level among current open-source models, sharply narrowing the gap between open- and closed-source models. As DeepSeek's first model to fold thinking into tool use, V3.2 still supports tool calls in "thinking mode." Using a large-scale synthesis method for agent training data, the company constructed reinforcement-learning tasks spanning more than 1,800 environments and 85,000 complex instructions, substantially improving the model's performance on agent evaluations.

As the large-model race shifts from a "parameter contest" to a "capability contest," a striking change is underway: open-source models are beginning to approach, and even challenge, top closed-source models along more and more key capability dimensions.

On December 1, DeepSeek simultaneously released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. The former reaches GPT-5's level on reasoning tests, only slightly below Gemini-3.0-Pro, while the latter won gold medals in four top international competitions including IMO 2025. According to the company ...
DeepSeek's Major Release
Shang Hai Zheng Quan Bao· 2025-12-01 13:57
Core Insights
- DeepSeek has officially released two models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with updates available on the official website, app, and API [1]
- DeepSeek-V3.2 aims to balance reasoning capability and output length, making it suitable for everyday use cases such as Q&A and general agent tasks [1]
- DeepSeek-V3.2-Speciale is designed to push the reasoning capabilities of open-source models to the limit, enhancing long-thinking abilities and incorporating theorem-proving capabilities from DeepSeek-Math-V2 [1]

Model Performance
- The V3.2-Speciale model exhibits excellent instruction following, rigorous mathematical proof, and logical verification, performing comparably to leading international models on mainstream reasoning benchmarks [1]
- Notably, V3.2-Speciale has achieved gold-medal results in several prestigious competitions, including IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025 [1]
- In the ICPC and IOI competitions, the model's performance matched the second-place and tenth-place human competitors, respectively [1]
DeepSeek Ships Again! Going Head-to-Head with Google While Admitting the Gap Between Open and Closed Source Has Widened
Di Yi Cai Jing· 2025-12-01 13:31
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which are among the global leaders in reasoning capability [1][3].

Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability and output length, suitable for everyday use such as Q&A and general agent tasks. It has reached GPT-5's level in public reasoning tests, slightly below Google's Gemini 3 Pro [3].
- DeepSeek-V3.2-Speciale is designed to push the reasoning capabilities of open-source models to the extreme, combining theorem-proving features from DeepSeek-Math-V2, and excels at instruction following and logical verification [3][4].

Performance Metrics
- Speciale has surpassed Google's Gemini 3 Pro on several reasoning benchmarks, including the American Invitational Mathematics Examination (AIME), the Harvard-MIT Mathematics Tournament, and the International Mathematical Olympiad [4].
- Across benchmarks, DeepSeek's performance is competitive, with specific scores noted in a comparative table against GPT-5 and Gemini-3.0 [5].

Technical Limitations
- Despite these achievements, DeepSeek acknowledges limitations relative to proprietary models like Gemini 3 Pro, particularly in breadth of knowledge and token efficiency [6].
- The company plans to scale up pre-training computation and optimize reasoning chains to improve model efficiency and capability [6][7].

Mechanism Innovations
- DeepSeek introduced a Sparse Attention Mechanism (DSA) to reduce computational complexity, which has proven effective at enhancing performance without sacrificing long-context capability [7][8].
- Both new models incorporate this mechanism, making DeepSeek-V3.2 a cost-effective alternative that narrows the performance gap with proprietary models [8].

Community Reception
- The release has been positively received in the community, with users noting that DeepSeek's models are now comparable to GPT-5 and Gemini 3 Pro, marking a significant achievement in open-source model development [8].
DeepSeek Releases Two Official Models
Zheng Quan Shi Bao Wang· 2025-12-01 11:18
Core Insights
- DeepSeek has released two official model versions: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale [1]
- The main goal of DeepSeek-V3.2 is to balance reasoning capability with output length, making it suitable for everyday use cases such as Q&A and general agent tasks [1]
- The DeepSeek-V3.2-Speciale version aims to push the reasoning capabilities of open-source models to the extreme, exploring the boundaries of model capability [1]

Summary by Categories
- **Product Launch**
  - DeepSeek has updated its official website, app, and API to the official version of DeepSeek-V3.2 [1]
  - The Speciale version is currently available only as a temporary API service for community evaluation and research [1]
- **Model Objectives**
  - DeepSeek-V3.2 is designed for daily applications, focusing on practical scenarios like Q&A and general agent tasks [1]
  - DeepSeek-V3.2-Speciale is focused on maximizing the model's reasoning capabilities, aiming to explore its limits [1]
US Media: A Growing Number of Silicon Valley Companies Are Building on Chinese Open-Source AI Models; "The Chinese Are the Real Innovators in AI"
Huan Qiu Wang· 2025-12-01 02:47
[Huanqiu Wang report, by reporter Li Ziyu] According to a November 30 report by NBC, a growing number of Silicon Valley companies are building on Chinese open-source artificial intelligence (AI) models. Nathan Lambert, a machine-learning researcher at the Allen Institute for AI, said plainly that "the Chinese are the real innovators in AI."

The report says that Laskin, a theoretical physicist and machine-learning engineer who helped create some of Google's most powerful AI models, has found that US AI companies are increasingly adopting free, customizable, and ever more capable open-source AI models, most of which come from China and are rapidly catching up with their American competitors. Laskin said it is "surprising" how close these models are to the frontier, and that the products now emerging are clearly "very close" to it.

NBC noted that this growing acceptance could spell trouble for the US AI industry. Investors have poured tens of billions of dollars into OpenAI and Anthropic, betting that leading US AI companies will dominate the global AI market; yet US companies' growing use of free Chinese models is prompting questions about "whether America's pursuit of closed-source models has been entirely wrong."

The report added that beyond better performance, stronger privacy, and lower costs, open-source models keep gaining influence through their ecosystem advantages, and many Chinese companies ship new products faster than their US peers. Nathan Lambert said the recent progress of Chinese models is no accident ...
"The Balance of Power Has Shifted: Chinese AI Is Increasingly Becoming a Technological Cornerstone for Silicon Valley"
Guan Cha Zhe Wang· 2025-12-01 00:19
Core Viewpoint
- The article discusses the increasing adoption of Chinese open-source AI models by Silicon Valley startups, highlighting their competitive advantages over traditional closed-source models from American companies like OpenAI and Anthropic. This shift raises questions about the sustainability of the closed-source approach in the U.S. AI industry [1][4][10].

Group 1: Adoption of Chinese AI Models
- Many U.S. AI startups are increasingly using Chinese open-source AI models because of their lower costs, greater customizability, and strong privacy protection, with some models performing comparably to leading American models [1][4][6].
- Reflection AI, a startup founded by Misha Laskin, aims to provide American alternatives to these high-performance Chinese models, reflecting a growing trend in the industry [2][4].
- The acceptance of Chinese models is seen as a potential challenge to the U.S. AI industry: investors have heavily backed American companies, and doubts are growing about the actual advantage of U.S. models [4][10].

Group 2: Performance and Cost Efficiency
- Chinese models like DeepSeek and Alibaba's Tongyi Qianwen have made significant technological advances, closing the performance gap with American closed-source models [5][9].
- Companies like Exa have reported that running Chinese models on their own hardware can be faster and cheaper than using models from OpenAI or Google [4][5].
- The cost-effectiveness of open-source models is crucial for startups, and some users prefer local processing for privacy reasons, further driving adoption of Chinese models [6][7].

Group 3: Ecosystem and Community Support
- The growing ecosystem around Chinese open-source models is attracting more developers, as these models often come with extensive training resources and community support [7][8].
- Platforms like Kilo Code show a preference for Chinese models among developers, indicating a shift in the default starting point for model customization [8][9].
- The rapid release cycle of Chinese models, with Alibaba launching new models roughly every 20 days, contrasts with the slower pace of American companies, highlighting a competitive edge [9][10].

Group 4: U.S. Response and Future Outlook
- The U.S. government has recognized the need to encourage development of open-source AI models, as evidenced by the release of the AI Action Plan and new open-source initiatives from companies like OpenAI and the Allen Institute [12][13].
- The ATOM initiative aims to reclaim U.S. leadership in open-source models, emphasizing the importance of maintaining a competitive edge in the AI landscape [13].
Looking Ahead to 2026: What Innovation Opportunities Does the AI Industry Hold?
36Kr· 2025-11-28 08:37
Core Insights
- The AI industry is entering a rapid change cycle, with 2025 a pivotal year for large-model development, particularly with the emergence of DeepSeek, which is reshaping the global landscape and promoting open-source initiatives [1][10][18]
- AI development is driven by a dual core, the United States and China, each following distinct paths, with key technologies accelerating toward engineering applications [1][10][11]
- Despite advances in model capability, real-world application challenges remain prevalent, indicating a shift in focus from "large models" to "AI+" [1][10][19]

Group 1: Global Large Model Landscape
- Global large-model development follows a dual-core pattern, with the U.S. leading in closed-source models and China focusing on open-source models [10][11][13]
- OpenAI, Anthropic, and Google form the leading trio in the large-model arena, each adopting a differentiated strategic path [17]
- DeepSeek's emergence marks a significant breakthrough for China's large-model development, showcasing the potential of open-source models [18][19]

Group 2: Key Technological Evolution
- The evolution of large models is marked by four major trends: native multimodal integration, reasoning capability, long-context memory, and agentic AI [22][24]
- Native multimodal architectures are replacing text-centric models, allowing seamless integration of multiple modalities [23]
- Reasoning is becoming a core feature of advanced models, enabling them to demonstrate their thought processes [24][26]

Group 3: Industry Chain and Infrastructure
- AI infrastructure is still dominated by Nvidia, with a slow transition toward a multi-polar ecosystem despite alternatives such as Google's TPU and AMD's chips [47][48]
- The AI industry is shifting from reliance on a few cloud providers to a more collaborative funding model, with Nvidia and OpenAI acting as the dual cores driving the ecosystem [51][52]

Group 4: Application Layer Opportunities
- Large-model companies are positioning themselves as "super assistants" while also aiming to control user entry points through various products and services [53][54]
- Independent application companies can find opportunities in vertical markets that require deep industry understanding and complex workflow integration [55][56]
- AI applications are evolving toward intelligent agents capable of autonomous operation, indicating a significant shift in application-development paradigms [61][62]