On ChatGPT's Third Anniversary, DeepSeek Strikes Back: A 23-Page Technical Report Holds All the Secrets of Open Source's Rise to the Top
36Kr · 2025-12-02 00:16

Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which aim to enhance AI's reasoning capabilities and tool usage, rivaling models like GPT-5 and Gemini-3.0-Pro [1][5][11].

Model Features
- DeepSeek-V3.2 focuses on cost-effectiveness and everyday use, achieving reasoning comparable to GPT-5, while DeepSeek-V3.2-Speciale targets high-performance tasks and matches Gemini-3.0-Pro [5][11].
- Both models use a new sparse attention mechanism (DSA) that improves processing speed and efficiency, particularly on long documents, by attending only to the relevant parts of the text [4][7].

Training Innovations
- DeepSeek has invested more than 10% of its pre-training budget in post-training resources, improving model stability and scalability through a robust reinforcement learning framework [8][10].
- The training process includes "expert distillation": specialized models are first built in various domains and then used to generate training data for the final model [10][11].

Performance Metrics
- In benchmark tests, DeepSeek-V3.2 is competitive with GPT-5 and Kimi-K2-Thinking across multiple metrics, while the Speciale version outperforms Gemini-3.0-Pro on specific tasks [20][22][24].
- The models have achieved notable results in prestigious competitions, with Speciale ranking 2nd at ICPC and 10th at IOI, demonstrating high-level reasoning and problem-solving capabilities [25][26].

Self-Training Mechanism
- DeepSeek has developed a self-training pipeline with over 18,000 tasks, allowing the AI to autonomously generate, validate, and improve its own training data, enhancing its reasoning abilities [17][19].
- This approach shifts the paradigm from human-led training to AI-driven self-improvement, fostering a new level of model evolution [19][32].
Future Directions
- Despite these advances, DeepSeek acknowledges that V3.2 still trails top proprietary models, particularly in knowledge coverage and token efficiency, and signals plans for future enhancements [30][32].
