GPT-5 in Trouble: DeepSeek Open-Sources the World's First Math Olympiad Gold-Medal AI, Taking On Google Head-On
36 Ke· 2025-11-28 01:55
Core Insights
- DeepSeek has launched a new model, DeepSeekMath-V2, which reached gold-medal level on the IMO 2025 problems, with capabilities that rival or even surpass Google's IMO gold-medal model [1][3][22]
- It is the first open-source model to reach IMO gold-medal level, a significant advance for open-source AI [1][24]

Model Performance
- DeepSeekMath-V2 demonstrated strong theorem-proving ability, solving 5 of the 6 IMO 2025 problems, which corresponds to gold-medal level [3][4]
- It also reached gold-medal level on CMO 2024, and on Putnam 2024 it scored 118 out of 120, surpassing the highest human score of 90 [3][4]

Comparison with Competitors
- DeepSeekMath-V2 outperformed Google's Gemini Deep Think on the ProofBench-Basic tests and followed it closely on the ProofBench-Advanced tests [5][22]
- Its results indicate a significant lead over existing models such as OpenAI's GPT-5 and Gemini 2.5-Pro [26][28]

Self-Verification Mechanism
- A key breakthrough of DeepSeekMath-V2 is self-verification: the model can assess its own proofs and improve them [12][36]
- The model uses a "three-in-one" system consisting of a Generator, a Verifier, and a Meta-Verifier to raise proof quality [15][16]

Training Methodology
- Training involved a high-compute search strategy that generates many candidate proofs and validates them rigorously [32][35]
- The model's ability to correct and refine its proofs over multiple iterations significantly improved its performance [38]

Implications for AI Development
- The success of DeepSeekMath-V2 suggests a shift in AI from merely mimicking human responses to emulating human thought processes, and underlines the importance of self-reflection in achieving advanced AI [36][37]
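The Generator, Verifier, and Meta-Verifier loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, not DeepSeek's implementation: the names `Review`, `refine_proof`, `generate`, `verify`, and `meta_verify`, the 0-to-1 score scale, and the round limit are all assumptions, and the three roles are passed in as plain callables standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    score: float       # hypothetical verifier rating in [0, 1]
    issues: List[str]  # problems the verifier claims to have found


def refine_proof(
    problem: str,
    generate: Callable[[str, List[str]], str],
    verify: Callable[[str, str], Review],
    meta_verify: Callable[[str, str, str], bool],
    max_rounds: int = 4,
) -> str:
    """Draft a proof, then revise it until no confirmed issue remains."""
    proof = generate(problem, [])
    for _ in range(max_rounds):
        review = verify(problem, proof)
        # Meta-verifier step: keep only the issues it confirms as real.
        confirmed = [i for i in review.issues if meta_verify(problem, proof, i)]
        if review.score >= 1.0 and not confirmed:
            return proof
        proof = generate(problem, confirmed)
    return proof  # best effort once the round limit is reached


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_generate(problem: str, feedback: List[str]) -> str:
    return "draft+fixed" if feedback else "draft"


def toy_verify(problem: str, proof: str) -> Review:
    if "fixed" in proof:
        return Review(1.0, [])
    return Review(0.5, ["gap in step 2", "spurious complaint"])


def toy_meta_verify(problem: str, proof: str, issue: str) -> bool:
    return issue.startswith("gap")


final_proof = refine_proof("toy problem", toy_generate, toy_verify, toy_meta_verify)
print(final_proof)
```

In the toy run, the meta-verifier discards the spurious complaint, so only the confirmed gap is fed back to the generator for the next draft.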
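The First Open-Source Model to Win a Math Olympiad Gold Medal! DeepSeek's New Model Wins Praise Online: Publishing the Technical Documentation Is Remarkable!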
Hua Er Jie Jian Wen· 2025-11-28 00:46
Core Insights
- DeepSeek has launched its latest open-source mathematical reasoning model, DeepSeekMath-V2, which reached gold-medal level on the International Mathematical Olympiad (IMO) 2025, a significant breakthrough for open-source AI in complex reasoning [1][3].

Group 1: Model Performance
- DeepSeekMath-V2 solved 5 of 6 problems in a simulated IMO 2025, becoming the first open-source model to reach gold-medal level in the competition [1].
- The model also performed at the top tier in other demanding mathematics competitions, reaching gold-medal level in the Chinese Mathematical Olympiad (CMO) and scoring 118 out of 120 on the Putnam Mathematics Competition 2024, surpassing the highest human score of 90 [3].

Group 2: Innovation in Training Framework
- The model uses a self-verification training framework with a dedicated verifier that assesses the quality of the proof process rather than only the correctness of the final answer [2][11].
- To prevent overfitting, DeepSeek applies a dynamic evolution strategy that scales up compute and automatically labels difficult proofs, so that the verifier and the generator evolve in step [12].

Group 3: Open Source and Community Impact
- DeepSeekMath-V2's weights are publicly available under the Apache 2.0 license, so researchers and developers can download and use the model freely, which is seen as a significant step toward the democratization of AI [2][4].
- The release has sparked discussion about how open-source models may affect the commercial viability of closed-source products, particularly for major players such as NVIDIA [2].
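One way to picture automatic labeling of difficult proofs with extra compute is to run the verifier several times on the same proof and assign a label only when the runs agree. The sketch below is a guess at the general shape, not the published procedure: `auto_label`, `replay`, `n_samples`, `min_agree`, and the 0-to-1 score scale are hypothetical.

```python
from typing import Callable, List, Optional


def auto_label(
    proof: str,
    verify: Callable[[str], float],
    n_samples: int = 8,
    min_agree: int = 3,
) -> Optional[float]:
    """Label a proof from repeated verifier runs, or return None if unclear."""
    scores: List[float] = [verify(proof) for _ in range(n_samples)]
    flagged = [s for s in scores if s < 1.0]
    if not flagged:
        return 1.0           # no run found a problem: label as correct
    if len(flagged) >= min_agree:
        return min(flagged)  # enough runs agree: keep the harshest score
    return None              # too few flags: leave for another review pass


def replay(scores: List[float]) -> Callable[[str], float]:
    """Toy verifier (assumption) that replays a fixed list of scores."""
    it = iter(scores)
    return lambda proof: next(it)


clean = auto_label("proof A", replay([1.0] * 8))
flawed = auto_label("proof B", replay([1.0, 0.5, 1.0, 0.0, 1.0, 0.5, 1.0, 1.0]))
unclear = auto_label("proof C", replay([1.0, 1.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0]))
print(clean, flawed, unclear)
```

Spending more verifier runs per proof is what makes the label usable without a human grader; the thresholds here are placeholders.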
DeepSeek Releases a New Model at "Math Olympiad Gold-Medal Level"
Di Yi Cai Jing· 2025-11-28 00:40
Core Insights
- DeepSeek has released a new model, DeepSeekMath-V2, the first open-source model to reach International Mathematical Olympiad (IMO) gold-medal level [3][5]
- The model outperforms Google's Gemini Deep Think on certain benchmarks, demonstrating its mathematical reasoning ability [5][9]

Performance Metrics
- DeepSeekMath-V2 scored 83.3% on IMO 2025, 73.8% on CMO 2024, and 98.3% on Putnam 2024 [4]
- On the Basic benchmark, Math-V2 scored nearly 99%, well above Gemini Deep Think's 89%; on the Advanced subset, Math-V2 scored 61.9%, slightly below Gemini's 65.7% [5]

Research Implications
- The paper, "DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning", emphasizes rigorous proof processes rather than just correct final answers [8]
- DeepSeek argues that self-verification in mathematical reasoning can help build more powerful AI systems [8]

Industry Reactions
- The release of Math-V2 has generated excitement in the industry, with commenters highlighting its unexpected success over Google's model [9]
- The competitive landscape is shifting as other major players such as OpenAI and Google release new models, raising anticipation for DeepSeek's next moves [10]
DeepSeek Launches a New Model! The First Model at Math Olympiad Gold-Medal Level Is Here
Di Yi Cai Jing· 2025-11-28 00:22
Core Insights
- DeepSeek has released a new model, DeepSeekMath-V2, the first open-source model to reach International Mathematical Olympiad (IMO) gold-medal level [1]
- The model outperforms Google's Gemini Deep Think on certain benchmarks, demonstrating its mathematical reasoning ability [1][5]

Performance Metrics
- DeepSeekMath-V2 scored 83.3% on the IMO 2025 problems and 73.8% on the CMO 2024 problems [4]
- On Putnam 2024 it scored 98.3% [4]
- On the Basic benchmark, Math-V2 scored nearly 99%, while Gemini Deep Think scored 89% [5]
- On the Advanced subset, Math-V2 scored 61.9%, slightly below Gemini Deep Think's 65.7% [5]

Research and Development Focus
- The model emphasizes self-verification in mathematical reasoning, moving from a result-oriented approach to a process-oriented one [8]
- DeepSeek aims to improve the rigor and completeness of mathematical proofs, which is crucial for solving open problems [8]
- The research indicates that self-verifying mathematical reasoning is a viable direction for building more powerful AI systems [8]

Industry Reaction
- The release has drawn significant interest, with commenters highlighting DeepSeek's edge over Google's model [9]
- The industry is waiting for further developments from DeepSeek, especially updates to its flagship model [10]
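The IMO and Putnam percentages match the raw results quoted in other coverage of the same release (5 of 6 problems, and 118 out of 120). A quick check, assuming each percentage is simply points earned over points available; the helper name `percent` is ours, and the CMO figure is not reproduced because no raw score is given for it:

```python
def percent(earned: float, available: float) -> float:
    """Share of available points earned, as a percentage rounded to one decimal."""
    return round(100 * earned / available, 1)


imo_2025 = percent(5, 6)         # 5 of 6 problems fully solved
putnam_2024 = percent(118, 120)  # 118 of 120 points

print(imo_2025, putnam_2024)
```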
DeepSeek Makes a Strong Comeback, Open-Sourcing an IMO Gold-Medal-Level Math Model
36 Ke· 2025-11-27 23:34
Core Insights
- DeepSeek has introduced a new model, DeepSeekMath-V2, which aims to strengthen self-verifiable mathematical reasoning in AI [1][2]
- The model reportedly outperforms Gemini Deep Think and reaches gold-medal level in mathematical competitions [3]

Model Development
- DeepSeekMath-V2 follows the earlier DeepSeekMath-7B, which used 7 billion parameters to match the performance of GPT-4 and Gemini-Ultra [4]
- The new model addresses a limitation of current AI mathematical reasoning by focusing on the rigor of the reasoning process rather than only the accuracy of final answers [5][6]

Self-Verification Mechanism
- The model incorporates a self-verification system made up of a proof verifier, a meta-verification layer, and a self-evaluating generator [7][11]
- The verification system assesses the reasoning process in detail and gives feedback similar to that of a human expert [8][10]

Training and Evaluation
- Training uses an "honest reward" mechanism: the model is rewarded for assessing its own work accurately and for identifying its own errors [11][15]
- The model has posted strong results in several mathematical competitions, with high scores on IMO 2025, CMO 2024, and Putnam 2024 [16][17]

Performance Metrics
- On the IMO-ProofBench benchmark, DeepSeekMath-V2 reached nearly 99% on the basic problems and was competitive on the advanced problems [18]
- The mutual improvement cycle between the verifier and the generator significantly reduces hallucinations in large models [20]

Future Implications
- DeepSeek emphasizes that self-verifiable mathematical reasoning is a promising research direction that could lead to more powerful mathematical AI systems [20]
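The idea behind an honest reward can be illustrated with a toy formula: part of the reward comes from the quality of the proof, and part from how closely the model's self-assessment matches the verifier's score. The function `honest_reward`, the 0.75 and 0.25 weights, and the 0-to-1 score scale are assumptions made for illustration and are not taken from the article:

```python
def honest_reward(
    proof_score: float,
    self_score: float,
    weight_proof: float = 0.75,
    weight_honesty: float = 0.25,
) -> float:
    """Combine proof quality with the accuracy of the model's self-assessment.

    Both scores are assumed to lie in [0, 1]; the weights are hypothetical.
    """
    honesty = 1.0 - abs(self_score - proof_score)
    return weight_proof * proof_score + weight_honesty * honesty


candid = honest_reward(proof_score=0.5, self_score=0.5)         # admits the flaw
overconfident = honest_reward(proof_score=0.5, self_score=1.0)  # claims perfection

print(candid, overconfident)
```

Under these placeholder weights, a flawed proof that is honestly rated earns more than the same proof presented as perfect, which is the incentive the article describes.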
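Affecting Hundreds of Millions of Insured People! State Council Executive Meeting Announces Major Measures; DeepSeek Launches New Model | Nan Cai Morning News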
Company Movements
- Wahaha Group has completed a change in core personnel: Zong Fuli has officially stepped down as legal representative, chairman, and general manager, and is succeeded by Xu Simin [8]
- DeepSeek launched a new mathematical reasoning model, DeepSeekMath-V2, which uses a self-verifying training framework and continuously optimizes performance with high-difficulty samples [8]
- After 12 years of listing, Joy City Property officially delisted from the Hong Kong Stock Exchange on November 27 under its privatization plan [8]
- Jinfutech announced that its existing business and Blue Origin Technology's business belong to different industries, which presents certain industry integration risks [8]
- Toyota reported a 2.1% year-on-year increase in global sales for October, to 922,700 units, with a 26.4% increase in U.S. sales, while sales in China declined by 6.6% to 160,900 units [8]
- Avita Technology (Chongqing) Co., Ltd. submitted a listing application to the Hong Kong Stock Exchange, with CITIC Securities and CICC as joint sponsors. The prospectus shows that Avita's revenue for the first half of this year was 12.208 billion yuan, a year-on-year increase of 98.52%, with vehicle sales revenue of 11.49 billion yuan [8]

Investment News
- On November 27, the A-share market experienced a pullback, with the Shanghai Composite Index closing up 0.29% at 3,875.26 points, while the Shenzhen Component Index and the ChiNext Index fell by 0.25% and 0.44%, respectively, with a market turnover of 1.72 trillion yuan [7]
- Vanke's stock and bonds fell further: "21 Vanke 02" closed down over 57%, "21 Vanke 06" down over 46%, and "22 Vanke 02" down over 42%, and six Vanke bonds were temporarily suspended because of the steep declines. Vanke's H-shares fell nearly 8% to a historical low, while Vanke A shares dropped over 7% to an 11-year low [7]
- JPMorgan has upgraded its investment rating for the Chinese stock market to "overweight," suggesting a greater likelihood of substantial gains next year due to multiple supporting factors, including the implementation of AI applications, consumer stimulus measures, and governance reforms [7]
- The Asset Management Association of China reported that by the end of October, the scale of private equity funds reached 22.05 trillion yuan, an increase of 1.31 trillion yuan from the end of September and a historical high. In October, 1,389 new private equity funds were registered, with a new registered scale of 67.01 billion yuan [7]
Avita Files for Hong Kong IPO; DeepSeek Launches New Model | Mei Jing Morning Briefing
Mei Ri Jing Ji Xin Wen· 2025-11-27 22:19
Group 1
- The third New Quality Productivity Automotive Conference will be held from November 28 to 30, 2025 [3]
- Huawei's Mate80 and Mate80 Pro series will officially go on sale on November 28 [3]
- The first batch of seven dual-innovation artificial intelligence ETFs will launch together on November 28 [3]

Group 2
- The fire in Tai Po, Hong Kong, has resulted in 83 fatalities as of November 28 [6]
- The Hong Kong government will provide emergency relief of 10,000 HKD per household affected by the fire [6]
- More than 600 million HKD in donations has been pledged by various enterprises and organizations for disaster relief and recovery efforts [13][14]

Group 3
- The Ministry of Commerce of China held a video conference with the German Federal Minister of Economics and Energy to discuss issues related to Nexperia [6]
- The Chinese government is taking targeted measures to enhance credit repair, simplifying application materials and improving efficiency [9]

Group 4
- Japan plans to issue approximately 11.7 trillion JPY (about 529.9 billion CNY) in government bonds to fund a new economic stimulus plan [11]
- The former President of Peru, Pedro Castillo, has been sentenced to over 11 years in prison for conspiracy to commit rebellion [11]

Group 5
- Anta Sports has responded to rumors regarding a potential bid for Puma, stating that it does not comment on market speculation [20]
- The leadership change at Wahaha Group may lead to strategic adjustments that could affect the competitive landscape [21]

Group 6
- Joy City Property has officially delisted from the Hong Kong Stock Exchange after 12 years, following a privatization plan [23]
- Avita Technology has submitted its IPO application to the Hong Kong Stock Exchange, a significant move for a state-owned enterprise in the new energy vehicle sector [27]

Group 7
- China's share of open-source AI model downloads has surpassed that of the United States, indicating a significant advance in AI technology [31]
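Nord Stream Explosion Suspect to Stand Trial in Germany; Hong Kong Tai Po Fire Kills 83; Foreign Ministry: China Will Never Accept Japan's Self-Serving Account; Avita Files for Hong Kong IPO; DeepSeek Launches New Model | Mei Jing Morning Briefing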
Mei Ri Jing Ji Xin Wen· 2025-11-27 22:00
Group 1
- The fire in Tai Po, Hong Kong, has resulted in 83 fatalities, prompting the government to provide emergency relief of 10,000 HKD per household and to establish a 300 million HKD aid fund [6][13][14]
- More than 40 companies and organizations have pledged donations exceeding 600 million HKD for rescue and recovery efforts following the fire [13][14][15][16][17]

Group 2
- The Chinese Ministry of Commerce held a video conference with Germany's Federal Minister of Economics to discuss semiconductor supply chain issues, emphasizing the need for constructive solutions to stabilize the global semiconductor market [5][8]
- The National Development and Reform Commission announced measures to enhance credit repair, including simplifying application processes and improving efficiency [9]

Group 3
- Anta Sports is rumored to be considering a bid for Puma, possibly together with a private equity firm, reflecting ongoing merger and acquisition activity in the industry [21]
- The resignation of Zong Fuli as chairman of Wahaha Group may lead to strategic adjustments within the company, affecting the competitive landscape of the industry [22]

Group 4
- Joy City Property officially delisted from the Hong Kong Stock Exchange after 12 years, as part of a privatization plan valued at approximately 2.932 billion HKD [24]
- Avita Technology has submitted an IPO application to the Hong Kong Stock Exchange, a significant move for a state-owned enterprise in the new energy vehicle sector [28]

Group 5
- The release of China's white paper on arms control and disarmament reflects the country's commitment to global security governance and multilateral arms control processes [7]
- China's open-source AI model downloads have recently surpassed those of the US, indicating a significant advance in China's AI capabilities [32]
New Breakthrough! DeepSeek Launches New Model
Core Insights
- The DeepSeek team has developed a new model, DeepSeekMath-V2, which shows significant advances in mathematical reasoning, reaching gold-medal level on major competitions such as IMO 2025 and CMO 2024 and scoring 118/120 on Putnam 2024 [2][4][6].

Model Performance
- DeepSeekMath-V2 scored 83.3% on IMO 2025, 73.8% on CMO 2024, and 98.3% on Putnam 2024 [3].
- On an in-house test set of 91 CNML-level problems, DeepSeekMath-V2 outperformed both GPT-5-Thinking-High and Gemini 2.5-Pro in every category: algebra, geometry, number theory, combinatorics, and inequalities [6].

Validation Mechanism
- The model uses a self-driven verification-generation loop: one large language model (LLM) acts as a "reviewer" that validates proofs and another acts as an "author" that generates them, with reinforcement learning strengthening both [4].
- A "meta-verification" layer is introduced to suppress model hallucinations, addressing the critical problem of ensuring correct reasoning in mathematical tasks [4].

Benchmark Testing
- On the IMO-ProofBench benchmark, DeepSeekMath-V2 outperformed DeepMind's IMO gold-medal-level Deep Think on the basic set and remained strongly competitive on the more challenging advanced set, while significantly surpassing all other benchmarked models [8].

Future Directions
- The DeepSeek team acknowledges that substantial work remains, but says the results indicate that self-verifying mathematical reasoning is a viable research direction that could lead to more powerful mathematical AI systems [11].
Major News! DeepSeek Launches the DeepSeekMath-V2 Model
Mei Ri Jing Ji Xin Wen· 2025-11-27 14:46
Core Insights
- DeepSeek launched a new mathematical reasoning model, DeepSeekMath-V2, on Hugging Face, featuring a self-verifying training framework [1]
- The model is built on DeepSeek-V3.2-Exp-Base and uses an LLM verifier to automatically review generated mathematical proofs, continuously optimizing performance with high-difficulty samples [1]
- It reached gold-medal level on IMO 2025 and CMO 2024 and scored 118/120 on Putnam 2024, supporting the feasibility of self-verifying reasoning [1]
- The model's code and weights have been open-sourced and are available on Hugging Face and GitHub [1]