DeepSeek Releases a New Model at "Math Olympiad Gold Medal Level"

Core Insights

- DeepSeek has released a new model, DeepSeek-Math-V2, the first open-source model to achieve International Mathematical Olympiad (IMO) gold medal level performance [3][5]
- The model outperforms Google's Gemini DeepThink on certain benchmarks, showcasing its capabilities in mathematical reasoning [5][9]

Performance Metrics

- DeepSeek-Math-V2 achieved 83.3% on IMO 2025 and 73.8% on CMO 2024, while scoring 98.3% on the Putnam 2024 competition [4]
- On the Basic benchmark, Math-V2 scored nearly 99%, significantly higher than Gemini DeepThink's 89%; on the Advanced subset, Math-V2 scored 61.9%, slightly lower than Gemini's 65.7% [5]

Research Implications

- The paper, titled "DeepSeek Math-V2: Towards Self-Validating Mathematical Reasoning," emphasizes the importance of rigorous mathematical proof processes rather than just correct answers [8]
- DeepSeek advocates self-validation in mathematical reasoning to support the development of more powerful AI systems [8]

Industry Reactions

- The release of Math-V2 has generated excitement in the industry, with comments highlighting its unexpected success over Google's model [9]
- The competitive landscape is evolving, with other major players like OpenAI and Google releasing new models, raising anticipation for DeepSeek's next moves [10]