Core Viewpoint
- DeepSeek has released an open-source model, DeepSeek-Math-V2, the first model to achieve IMO gold medal level in mathematics; it outperforms Google's Gemini DeepThink on certain benchmarks [3][5].

Group 1: Model Performance
- DeepSeek-Math-V2 scored nearly 99% on the Basic benchmark, significantly outperforming Gemini DeepThink's 89% [5].
- On the more challenging Advanced subset, Math-V2 scored 61.9%, slightly below Gemini DeepThink's 65.7% [5].
- The model has demonstrated gold medal-level performance on IMO 2025 and CMO 2024, and a near-perfect score on the Putnam 2024 exam (118/120) [8].

Group 2: Research and Development Insights
- DeepSeek emphasizes verifying mathematical reasoning comprehensively and rigorously, moving from a result-oriented approach to a process-oriented one [8].
- The model is designed to review proof processes like a mathematician, enhancing its ability to solve complex mathematical proofs without human intervention [8].

Group 3: Industry Reactions and Expectations
- The release of Math-V2 has generated excitement in the industry, with observers noting that DeepSeek surpassed expectations by beating Google's IMO gold model by a 10% margin [9].
- There is anticipation about DeepSeek's next moves, particularly updates to its flagship models, as the industry awaits further developments [9].
DeepSeek ships a new model at "Math Olympiad gold medal level"
Yicai (第一财经) · 2025-11-28 00:35