A New Breakthrough! DeepSeek Launches a New Model
Xinhua Net Finance · 2025-11-28 01:15

Core Insights

- DeepSeek launched a new mathematical reasoning model, DeepSeekMath-V2, on HuggingFace; it is trained with a self-verifying framework [2]
- The model is built on DeepSeek-V3.2-Exp-Base and employs an LLM verifier to automatically review generated mathematical proofs, continuously improving performance on high-difficulty samples [3]
- DeepSeekMath-V2 reached gold-medal level in both the 2025 International Mathematical Olympiad (IMO) and the 2024 Chinese Mathematical Olympiad (CMO), and scored 118/120 in the 2024 Putnam Mathematical Competition [3][4]

Performance Metrics

- IMO 2025: 83.3% across problems P1 to P5 [4]
- CMO 2024: 73.8% on problems P1, P2, P4, P5, and P6 [4]
- Putnam 2024: 98.3% on problems A1 to B6 [4]

Model Architecture

- The core of DeepSeekMath-V2 is a self-driven verification-generation loop: one LLM acts as a "reviewer" that verifies proofs and another as a "creator" that generates them, with the two trained to cooperate via reinforcement learning [5]
- A "meta-verification" layer is added to suppress model hallucinations [5]

Competitive Edge

- On a self-constructed test of 91 CNML-level problems, DeepSeekMath-V2 outperformed GPT-5-Thinking-High and Gemini 2.5-Pro across all categories: algebra, geometry, number theory, combinatorics, and inequalities [7]
- On the IMO-ProofBench benchmark, it surpassed DeepMind's DeepThink at the IMO gold-medal level on the basic set and remained competitive on the harder advanced set [8]

Future Directions

- The DeepSeek team notes that while significant work remains, these results suggest self-verifying mathematical reasoning is a viable research direction that could help build more powerful mathematical AI systems [10]
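As a quick consistency check on the reported figures, the Putnam percentage follows directly from the raw score of 118 out of 120 points:

```python
# Consistency check: the article's Putnam percentage follows from the raw score.
putnam_pct = round(118 / 120 * 100, 1)
print(putnam_pct)  # → 98.3
```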
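The verification-generation loop described above can be sketched in a few lines of Python. This is an illustrative toy only: the function names, scoring scale, and retention of low-scoring proofs as training samples are assumptions for exposition, not DeepSeek's actual implementation.

```python
# Toy sketch of a self-verifying generation loop (illustrative assumptions only).

def generate_proof(problem: str) -> str:
    # Stand-in for the "creator" LLM producing a candidate proof.
    return f"proof of {problem}"

def verify_proof(problem: str, proof: str) -> float:
    # Stand-in for the "reviewer" LLM scoring a proof in [0, 1].
    return 1.0 if proof.startswith("proof of") else 0.0

def meta_verify(problem: str, proof: str, score: float) -> float:
    # Meta-verification layer: sanity-check the reviewer's judgment
    # to suppress hallucinated approvals (here, a trivial placeholder).
    return score if proof else 0.0

def self_verifying_loop(problem: str, max_rounds: int = 3,
                        threshold: float = 0.9):
    """Generate and verify proofs; keep rejected attempts as hard samples."""
    hard_samples = []  # low-scoring proofs retained for further training
    for _ in range(max_rounds):
        proof = generate_proof(problem)
        score = meta_verify(problem, proof, verify_proof(problem, proof))
        if score >= threshold:
            return proof, hard_samples
        hard_samples.append((problem, proof, score))
    return None, hard_samples

proof, hard = self_verifying_loop("IMO 2025 P1")
```

In the real system, the reviewer's scores would serve as a reinforcement-learning reward for the creator, and the rejected "hard samples" would feed back into training.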