DeepSeek Launches Again! New Models Go Head-to-Head with Google

Core Viewpoint
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which rank among the global leaders in reasoning capability [3].
Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability against output length, making it suitable for everyday use such as Q&A and general agent tasks. On public reasoning benchmarks it performs on par with GPT-5 and slightly below Google's Gemini 3 Pro [5].
- DeepSeek-V3.2-Speciale is designed to push the reasoning capabilities of open-source models to the extreme. It combines enhanced long-form thinking with the theorem-proving abilities of DeepSeek-Math-V2 [5].
Performance Metrics
- Speciale surpassed Google's Gemini 3 Pro on several reasoning benchmarks, including the American Invitational Mathematics Examination (AIME), the Harvard-MIT Mathematics Tournament (HMMT), and the International Mathematical Olympiad (IMO) [6].
- On the AIME 2025 benchmark, Speciale scored 96.0, while Gemini 3 Pro scored 95.0 [7].
- Speciale achieved gold-medal results at the IMO, the ICPC World Finals, and the IOI. Its ICPC and IOI scores would have placed it second and tenth among human competitors, respectively [6].
Limitations and Future Plans
- DeepSeek acknowledges limitations compared with proprietary models such as Gemini 3 Pro, including a narrower breadth of world knowledge and lower token efficiency [8].
- The company plans to increase pre-training compute and to optimize the model's reasoning chains in order to improve efficiency and fill knowledge gaps [8].
Industry Context
- The gap between open-source and closed-source models is widening, with proprietary systems showing stronger performance on complex tasks [10].
- DeepSeek has introduced a sparse attention mechanism (DSA) that reduces computational complexity without sacrificing long-context performance [11].
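
To make the idea of sparse attention concrete, here is a minimal sketch of a generic top-k variant, in which each query position attends only to its `k` highest-scoring earlier positions instead of all of them. This is an illustration under stated assumptions, not DeepSeek's DSA implementation: the function name `topk_sparse_attention`, the selection rule, and the pure-Python data layout are all hypothetical choices made for readability.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(queries, keys, values, k):
    """Causal attention where each query mixes only its k best-scoring keys.

    Hypothetical illustration: the selection rule here is plain top-k on the
    scaled dot product, which is an assumption, not the published DSA design.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for t, q in enumerate(queries):
        # Score every visible (causal) position. This toy still scores all
        # positions; a practical design would use a cheaper selection pass.
        scores = [dot(q, keys[s]) * scale for s in range(t + 1)]
        # Keep only the k highest-scoring positions.
        keep = sorted(range(t + 1), key=lambda s: scores[s], reverse=True)[:k]
        # Softmax and value mixing run over at most k entries per query.
        weights = softmax([scores[s] for s in keep])
        out = [0.0] * len(values[0])
        for w, s in zip(weights, keep):
            for d, v in enumerate(values[s]):
                out[d] += w * v
        outputs.append(out)
    return outputs
```

In this sketch the softmax and the weighted sum over values touch at most `k` positions per query, so that part of the cost grows with sequence length times `k` rather than with the square of the sequence length. Setting `k` to the full sequence length recovers ordinary dense causal attention.
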
Community Reception
- The release has been positively received on overseas social media, with commenters highlighting that an open-source model now matches GPT-5 and Gemini 3 Pro [11].