DeepSeek
DeepSeek ships again: new models take on Google, while acknowledging a widening gap between open source and closed source
Di Yi Cai Jing· 2025-12-01 23:13
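[Related reading] Strongest open-source model! Punching GPT 5, kicking Gemini-3.0: why did DeepSeek V3.2 improve so much? · Paper co-signed by Liang Wenfeng: DeepSeek's strongest open-source Agent model makes a splash · DeepSeek releases its strongest open-source product, targets the all-purpose Agent, and throws down the gauntlet to GPT-5 and Gemini 3
Source: Di Yi Cai Jing
On the evening of December 1, DeepSeek released two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with world-leading reasoning capabilities.
The two models are positioned differently. DeepSeek-V3.2 aims to balance reasoning capability against output length and is suited to everyday use, such as Q&A scenarios and general agent tasks. DeepSeek released the experimental V3.2-Exp at the end of September; this is the official release. In public reasoning benchmarks, V3.2 reached the level of GPT-5 and was only slightly below Google's Gemini 3 Pro.
DeepSeek-V3.2-Speciale is the centerpiece of this release. Its goal is to "push the reasoning capability of open-source models to the extreme and explore the boundaries of model capability." According to the company, Speciale is a long-thinking enhanced version of V3.2 that also incorporates the theorem-proving capability of DeepSeek-Math-V2, and the model offers excellent instruction following, rigorous mathematical proof, and logical verification. According to DeepSe ...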
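[Morning Briefing] Popular Chinese concept stocks rise against the trend, silver sets another record high; DeepSeek releases two new models
Cai Lian She· 2025-12-01 23:08
Morning Briefing Highlights · Macro News · Industry News
1. At the invitation of President Xi Jinping, French President Emmanuel Macron will pay a state visit to China from December 3 to 5.
2. Chinese citizens can travel to Russia visa-free until September 14, 2026.
3. DeepSeek releases two new models.
4. CATL raises the base salary of employees at job grades 1-6 by 150 yuan.
5. Spot silver closed up 2.85% at $57.987 per ounce, continuing to set record highs.
1. A Foreign Ministry spokesperson announced that, at the invitation of President Xi Jinping, French President Emmanuel Macron will pay a state visit to China from December 3 to 5.
2. Foreign Ministry spokesperson Lin Jian said on December 1 that Japan has been evasive and perfunctory in its words while going its own way in its actions, and that China will never accept this. "On major issues of principle, Japan should not imagine that it can muddle through. We urge the Japanese side to draw lessons from history, reflect deeply, take China's demands seriously, honestly retract its erroneous remarks, and honor its political commitments to China with concrete actions."
3. It was learned on December 1 local time that, under an order signed that day by Russian President Vladimir Putin, Chinese citizens may travel to Russia visa-free for tourism and business purposes until September 14, 2026 (inclusive). The visa-free stay is 30 days.
4. The Federal Reserve said that, out of concern over "elevated interest rates, tighter underwriting standards, and falling commercial real estate values," it is closely monitoring the related portfolios of community and regional banks, because these factors could affect borrowers' ability to refinance or repay their loans ...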
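Strongest open-source model! "Punching GPT 5," "kicking Gemini-3.0": why did DeepSeek V3.2 improve so much?
Mei Gu IPO· 2025-12-01 22:29
V3.2 reaches the highest level among current open-source models in tool calling, substantially narrowing the gap between open-source and closed-source models. As DeepSeek's first model to integrate thinking into tool use, V3.2 still supports tool calls in "thinking mode." Using a large-scale Agent training-data synthesis method, the company built reinforcement learning tasks spanning more than 1,800 environments and more than 85,000 complex instructions, substantially improving the model's performance on agent benchmarks.
As the large-model race shifts from a "parameter race" to a "capability race," a notable change is under way: open-source models are beginning to approach, and even challenge, top closed-source models on a growing number of key capability dimensions.
On December 1, DeepSeek simultaneously released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. The former reaches GPT-5 level in reasoning tests, only slightly below Gemini-3.0-Pro, while the latter won gold medals in four top international competitions, including IMO 2025.
According to the company, V3.2 is DeepSeek's first model to integrate thinking into tool use, and it still supports tool calls in "thinking mode." Using a large-scale Agent training-data synthesis method, the company built more than 1,800 environments and more than 85,000 complex instructions for ...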
Tencent Research Institute AI Express 20251202
Tencent Research Institute· 2025-12-01 16:03
Group 1: Generative AI Developments
- DeepSeek has officially released versions V3.2 and V3.2-Speciale, with V3.2 achieving reasoning capabilities at GPT-5 level and significantly reduced output length suitable for daily use and general agent tasks [1]
- V3.2-Speciale is an enhanced version for long reasoning, successfully winning gold medals in IMO 2025, CMO 2025, ICPC, and IOI 2025 by integrating theorem proving capabilities [1]
- The new versions incorporate thinking into tool calls, constructing over 1,800 environments and 85,000 complex instructions through large-scale agent training data synthesis, greatly enhancing generalization capabilities [1]

Group 2: Image Generation Technology
- Vidu has launched the Vidu Q2 image generation suite, with upgraded features including text-to-image and image editing capabilities, producing results in as fast as 5 seconds and ranking in the top four of the global image editing leaderboard [2]
- The Q2 suite allows for location referencing, action replication, instruction following, and scene switching while maintaining high consistency, supporting 4K output and arbitrary aspect ratio generation [2]
- Memberships are available for free until December 31, with standard and professional members receiving a monthly limit of 300 images, while flagship members enjoy unlimited generation privileges [2]

Group 3: ByteDance's New Assistant
- ByteDance has released a preview version of the Doubao mobile assistant, aimed at smartphone manufacturers, capable of executing complex operations across applications such as price comparison for food delivery and auto-replying to messages [3]
- The assistant features a dedicated physical button and voice activation, with screen awareness capabilities to automatically read chat context and generate replies [3]
- ByteDance is in talks with multiple smartphone manufacturers, with a device featuring the Doubao assistant already launched at a price of 3,499 yuan [3]

Group 4: Advertising in AI Applications
- Developers discovered multiple advertising-related references in the ChatGPT Android app's beta code, including terms like "ads feature" and "search ads carousel" [4]
- OpenAI's stance on advertising has shifted three times in a year, from viewing it as a "last resort" to a more accepting attitude [4]
- HSBC estimates that OpenAI's operational costs for maintaining computational infrastructure could reach several hundred billion dollars annually, predicting continued losses exceeding 100 billion dollars by 2029 [4]

Group 5: AI in Mathematics
- The AI mathematician "Aristotle," developed by HarmonicMath, independently solved a simplified version of the Erdős problem 124 in just 6 hours, with verification in the Lean proof system taking only 1 minute [5][6]
- This AI combines reinforcement learning, Monte Carlo tree search, and Lean formal language to explore millions of proof strategies, outputting 100% verifiable theorems, outperforming ChatGPT and Gemini [6]
- Mathematician Terence Tao noted that AI is currently addressing the "low-hanging fruit" in mathematics, allowing human mathematicians to focus on more significant challenges [6]

Group 6: Automation and Workforce Impact
- A McKinsey report indicates that existing technology could theoretically automate 57% of work hours in the U.S., with agents taking 44% and robots handling 13% [7]
- The report categorizes jobs into seven archetypes, predicting that 25% to 33% of the most sought-after skills will be automated in the future [7]
- By 2030, redesigning workflows to allow agents to handle cognitive tasks and robots to manage physical tasks could release approximately 2.9 trillion dollars in economic value annually in the U.S. [7]

Group 7: AI Companies' Pricing Strategies
- Stripe's analysis reveals that about 80% of the top 10% fastest-growing AI companies utilize tiered pricing, with a likelihood of usage-based pricing nearly double that of other companies [8]
- High-growth companies often offer at least 10 SKU product units, actively expanding into global markets and supporting local currency transactions to enhance conversion rates [8]
- These companies are quick to respond to market demand changes, offering situational discounts and flexibly adjusting monetization models and pricing strategies based on user preferences [8]

Group 8: Evolution of AI Technology
- Since its launch on December 1, 2022, ChatGPT has evolved from an initial phase of wonder and hallucination to a period of multimodal capabilities and application explosion, significantly altering human production relationships [9]
- The release of Google's Gemini 3 has shifted the competitive landscape, with Gemini's mobile app monthly active users increasing from 400 million to 650 million, surpassing ChatGPT in user engagement [9]
- OpenAI's partners are shouldering nearly 100 billion dollars in debt, while OpenAI itself reportedly has minimal liabilities [9]
X @Bloomberg
Bloomberg· 2025-12-01 15:12
China’s DeepSeek unveiled two new versions of an experimental AI model it released weeks ago, adding fresh capabilities the startup said would help with combining reasoning and executing certain actions autonomously https://t.co/I1LvHWK1NX ...
DeepSeek makes a major release
Zheng Quan Shi Bao· 2025-12-01 15:04
Core Insights
- DeepSeek has released two official model versions: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. The former is available on the official website, app, and API, while the latter is currently accessible only through a temporary API for community evaluation [1][3].

Model Performance
- DeepSeek-V3.2 aims to balance reasoning capability and output length, making it suitable for daily use. In benchmark tests, it achieved performance comparable to GPT-5 and slightly below Gemini-3.0-Pro, with a significant reduction in output length compared to Kimi-K2-Thinking, leading to lower computational costs and reduced user wait times [3][4].
- DeepSeek-V3.2-Speciale is designed to push the limits of reasoning capabilities. It is an enhanced version of DeepSeek-V3.2 that incorporates theorem-proving abilities from DeepSeek-Math-V2. It performed comparably to Gemini-3.0-Pro in mainstream reasoning benchmarks and won gold medals in several prestigious competitions, including IMO 2025 and ICPC World Finals 2025, with its ICPC and IOI results reaching the level of second and tenth place among human competitors, respectively [3][4].

Benchmark Comparisons
- In various benchmark tests, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale demonstrated competitive performance:
  - AIME 2025: DeepSeek-V3.2 scored 93.1, while DeepSeek-V3.2-Speciale scored 96.0 [4].
  - HMMT Feb 2025: DeepSeek-V3.2 scored 92.5, and DeepSeek-V3.2-Speciale scored 99.2 [4].
  - IMOAnswerBench: DeepSeek-V3.2 scored 78.3, and DeepSeek-V3.2-Speciale scored 84.5 [4].
  - CodeForces: DeepSeek-V3.2 scored 2386, while DeepSeek-V3.2-Speciale scored 2701 [4].

Cost Efficiency
- DeepSeek-V3.2-Exp, which is based on V3.1-Terminus and adds a new attention mechanism (DSA), significantly improved training and reasoning efficiency and thereby notably reduced model costs. The lower cost improves the model's cost-effectiveness and its potential for broader application [4].
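To make the benchmark comparison easier to read, the four score pairs quoted above can be turned into absolute and relative gains of Speciale over the standard model. The following is only an illustrative sketch: the names `scores` and `speciale_gains` are hypothetical, the numbers are copied from the summary above, and the CodeForces figure is a rating rather than a percentage, so its relative gain is not directly comparable to the others.

```python
# Scores as quoted in the summary above: (DeepSeek-V3.2, DeepSeek-V3.2-Speciale).
# The first three are benchmark scores; CodeForces is a rating.
scores = {
    "AIME 2025": (93.1, 96.0),
    "HMMT Feb 2025": (92.5, 99.2),
    "IMOAnswerBench": (78.3, 84.5),
    "CodeForces": (2386, 2701),
}


def speciale_gains(table):
    """Return {benchmark: (absolute gain, relative gain in percent)}."""
    gains = {}
    for name, (base, speciale) in table.items():
        delta = speciale - base
        # Round to one decimal place to hide floating-point noise.
        gains[name] = (round(delta, 1), round(100 * delta / base, 1))
    return gains


if __name__ == "__main__":
    for name, (delta, pct) in speciale_gains(scores).items():
        print(f"{name:16s} +{delta:g} ({pct:+.1f}%)")
```

On these figures the largest relative improvement is on CodeForces and the smallest on AIME 2025, where the standard model already scores above 93.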
DeepSeek launches new models
Zhong Guo Zheng Quan Bao· 2025-12-01 15:04
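On December 1, DeepSeek announced on its WeChat official account that it was releasing two official models that day: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale.
[Figure: scores of DeepSeek-V3.2 and other models on math, code, and general-domain benchmarks (approximate total Tokens consumed in parentheses). Image source: DeepSeek WeChat official account]
The data show that on highly complex tasks the Speciale model substantially outperforms the standard version, but it also consumes significantly more Tokens and costs more. DeepSeek said that DeepSeek-V3.2-Speciale is currently for research use only, does not support tool calls, and has not yet been specifically optimized for everyday conversation and writing tasks.
In terms of usage, unlike previous versions, which could not call tools in thinking mode, DeepSeek-V3.2 is DeepSeek's first model to integrate thinking into tool use, and it supports tool calls in both thinking and non-thinking modes. By proposing a large-scale Agent training-data synthesis method, the company constructed a large number of reinforcement learning tasks that are hard to solve but easy to verify, improving the model's generalization ability.
The company said that DeepSeek-V3.2's thinking mode adds support for Claude Code but has not been fully adapted to components that use non-standard tool calls, such as Cline and RooCode, so it recommends that, when using such components, users continue ...
DeepSeek releases its strongest open-source product, targets the all-purpose Agent, and throws down the gauntlet to GPT-5 and Gemini 3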
Tai Mei Ti APP· 2025-12-01 15:03
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, marking a significant advancement in AI capabilities, particularly in reasoning and output efficiency [2][3]
- The V3.2 model is positioned as the strongest open-source large model, outperforming competitors in various benchmarks while significantly reducing output length and computational costs [3][4]
- The V3.2 model integrates a new sparse attention mechanism (DSA) to enhance performance in long-context scenarios, while also improving the model's ability to follow instructions and generalize in complex environments [8][9]

Model Performance
- In benchmark tests, DeepSeek-V3.2 achieved competitive scores against models like GPT-5, Claude 4.5, and Gemini 3 Pro, with notable strengths in specific areas [4][5]
- The V3.2 model demonstrated superior performance in question-and-answer scenarios, providing detailed and accurate travel recommendations through advanced tool usage [5][6]
- The V3.2 Speciale model focuses on maximizing reasoning capabilities, achieving results comparable to Gemini 3.0 Pro in mainstream reasoning benchmarks, although it requires a higher token cost and is not designed for everyday use [9][10]

Development Focus
- DeepSeek emphasizes practical usability and generalization in its models, aiming to overcome common pitfalls in AI interactions, such as making basic common-sense errors [6][8]
- The company is committed to enhancing the reasoning abilities of its models, as evidenced by the integration of advanced mathematical reasoning capabilities from the recently released DeepSeek-Math-V2 [9][10]
- The competitive landscape for large models is intensifying, with major players like GPT-5 and Gemini 3 pushing the boundaries of AI capabilities, suggesting a dynamic future for AI development [10]
DeepSeek releases the official version of V3.2
Xin Jing Bao· 2025-12-01 15:01
Core Insights
- DeepSeek announced the release of two official model versions: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale [1]

Model Overview
- DeepSeek-V3.2 aims to balance reasoning capability and output length, making it suitable for everyday use, such as Q&A scenarios and general agent tasks [1]
- In benchmark tests for reasoning, DeepSeek-V3.2 achieved performance comparable to GPT-5, slightly below Gemini-3.0-Pro [1]
- Compared to Kimi-K2-Thinking, V3.2 significantly reduced output length, leading to lower computational costs and reduced user wait times [1]

Special Features
- DeepSeek-V3.2-Speciale is designed to push the reasoning capabilities of open-source models to the limit, exploring the boundaries of model performance [1]
- This version is an enhanced long-thinking variant of DeepSeek-V3.2, incorporating theorem-proving capabilities from DeepSeek-Math-V2 [1]
- The model exhibits excellent instruction-following, rigorous mathematical proof, and logical verification abilities, performing comparably to Gemini-3.0-Pro in mainstream reasoning benchmark tests [1]
DeepSeek: new release
Zhong Guo Zheng Quan Bao· 2025-12-01 14:48
Core Insights
- DeepSeek has released two new models: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, aimed at enhancing reasoning capabilities and output length for various applications [1][2].

Model Performance
- DeepSeek-V3.2 achieved performance comparable to GPT-5 and slightly below Gemini-3.0-Pro in public reasoning benchmarks, while significantly reducing output length compared to Kimi-K2-Thinking, thus lowering computational costs and user wait times [1][3].
- The DeepSeek-V3.2-Speciale model demonstrated exceptional instruction-following, rigorous mathematical proof, and logical validation capabilities, achieving gold medal-level performance in major competitions such as IMO 2025 and ICPC World Finals 2025 [2][3].

Benchmark Comparisons
- In various benchmark tests, DeepSeek-V3.2-Speciale outperformed standard versions and other models, with notable scores in AIME 2025 (96.0) and HMMT Feb 2025 (99.2), while also achieving high rankings in IMOAnswerBench and LiveCodeBench [3].
- The performance of DeepSeek-V3.2-Speciale in complex tasks was significantly better than the standard version, although it required more tokens, indicating higher operational costs [3].

Model Features
- DeepSeek-V3.2 is the first model to integrate reasoning with tool usage, supporting both reasoning and non-reasoning modes for tool invocation, enhancing its versatility [4].
- The model has improved generalization capabilities through a novel large-scale agent training data synthesis method, allowing it to perform well in real-world applications [4].