Gemini 2.5 Deep Think
A half-century-old problem cracked in 48 hours! Terence Tao and his team turn AI mathematics into a monster-slaying game
量子位· 2025-12-13 04:34
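Xifeng and Luyu, reporting from Aofeisi. QbitAI | WeChat official account QbitAI

Terence Tao, working with several other mathematicians, has just formally closed out Erdős #1026. With that, a problem that sat unresolved for 50 years is completely solved. The key point is that AI again played a major role: with the help of several AI tools, the whole solution process took only 48 hours. Pooling the strengths of many contributors and of AI is becoming the key to solving such problems. As Tao himself put it: with traditional methods, one or two mathematicians using simple programming and literature-search tools could eventually have done the same, but it would probably have taken weeks or months. Tao then personally wrote up and published the full account of how the problem was solved. After the news broke, commenters online called it "so cool". So how did they do it?

Solving Erdős #1026 in 48 hours

Erdős #1026 was first posed in 1975. Because the original statement was quite vague, mathematician Desmond Weisenberg proposed studying the smallest possible value of the function instead, introducing c(n), the largest constant such that $$S(x_{1},\ldots,x_{n})\geq c(n)\sum_{i=1}^{n}x_{i}$$ holds for every sequence $x_{1},\ldots,x_{n}$ of n distinct real numbers. In game-theoretic terms, the problem reads: suppose Alice has N ...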
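To make the inequality above concrete, here is a minimal Python sketch. It rests on an assumption that the excerpt itself does not spell out: that S(x_1, ..., x_n) is the largest sum of a monotone (increasing or decreasing) subsequence, which is how Erdős #1026 is usually stated, and that the entries are positive. The function name `max_monotone_subsequence_sum` is hypothetical and chosen only for illustration.

```python
def max_monotone_subsequence_sum(xs):
    """Largest sum of a monotone subsequence of xs.

    Assumption: S in the inequality is this quantity, and all entries
    of xs are distinct positive reals.
    """
    def best_increasing(seq):
        # best[i] = largest sum of an increasing subsequence ending at seq[i]
        best = []
        for i, x in enumerate(seq):
            prev = max((best[j] for j in range(i) if seq[j] < x), default=0)
            best.append(x + prev)
        return max(best)

    # A decreasing subsequence of xs is an increasing subsequence of xs reversed.
    return max(best_increasing(xs), best_increasing(xs[::-1]))


xs = [2, 1, 4, 3]
s = max_monotone_subsequence_sum(xs)  # best choice is the decreasing pair 4, 3
print(s / sum(xs))                    # ratio S / (x_1 + ... + x_n) for this sequence
```

Under this reading, c(n) is the smallest value this ratio can take over all admissible sequences of length n, so any single sequence only gives an upper bound on c(n).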
Terence Tao's hands-on test: I used Gemini to settle, in ten minutes, a problem that had troubled the field for years
量子位· 2025-11-24 07:30
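Luyu, reporting from Aofeisi. QbitAI | WeChat official account QbitAI

Terence Tao teamed up with Gemini to crack a math problem in ten minutes, and it was a problem in his own specialty: multiplicative number theory on the structure of consecutive integers. The problem rested on an incomplete earlier disproof, and Tao used Gemini Deep Think to complete the proof. The whole process took only ten minutes, in which Gemini went from analyzing the argument to confirming the conclusion. The verification is walked through in detail below. This is not an isolated case. Tao has found that similar things happen regularly on the Erdős Problems website: recently, 6 Erdős problems that had troubled mathematicians for years were solved one after another with AI-assisted methods. Many researchers are also systematically using AI tools to search the related literature and leaving the results in the comment sections as leads toward solutions. The model Tao used this time, Gemini 2.5 Deep Think, will be familiar to most readers: it is the earlier IMO gold medalist, and on the latest FrontierMath test its mathematical ability also far exceeds that of models such as GPT-5 (high).

Gemini completes the verification in ten minutes

First, back to the problem itself. This is problem #367, posed by Paul Erdős. The problem takes the 2-full part of an integer n, that is, n divided by the product of the prime factors that appear in n with exponent 1. Put simply, this removes from n every prime factor that occurs only once and keeps only ...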
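The 2-full part described above can be illustrated with a short Python sketch. It uses plain trial division, which is fine for small inputs but is not meant as an efficient method, and the function name `two_full_part` is a hypothetical choice, not notation taken from the article.

```python
def two_full_part(n: int) -> int:
    """Return n with every prime factor of exponent exactly 1 removed."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            exponent = 0
            while remaining % p == 0:
                remaining //= p
                exponent += 1
            # Keep the prime power only if the prime occurs at least twice.
            if exponent >= 2:
                result *= p ** exponent
        p += 1
    # Whatever is left is 1 or a single prime with exponent 1, so it is dropped.
    return result


print(two_full_part(12))   # 12 = 2^2 * 3: the factor 3 is removed
print(two_full_part(360))  # 360 = 2^3 * 3^2 * 5: the factor 5 is removed
```

For example, 30 = 2 · 3 · 5 has no repeated prime factor, so its 2-full part is 1, while 72 = 2^3 · 3^2 is left unchanged.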
Results in ten minutes: Terence Tao uses Gemini Deepthink to help human mathematicians complete the argument for an Erdős problem
机器之心· 2025-11-23 04:06
Core Viewpoint
- The article discusses the Erdős Problems website, which focuses on mathematical research and problem-solving, particularly related to the famous mathematician Paul Erdős. It serves as a platform for researchers and enthusiasts to propose, discuss, and solve various mathematical problems across different fields such as number theory, combinatorics, and graph theory [1].

Group 1
- The Erdős Problems website collects various mathematical problems proposed by Erdős, covering diverse areas like number theory, combinatorics, and graph theory [1].
- Independent researcher Wouter van Doorn provided a counterexample to Erdős Problem 367, relying on a congruence identity he believes to be valid [5].
- The problem was later submitted to Gemini 2.5 Deep Think by renowned mathematician Terence Tao, who received a complete proof from the AI in about ten minutes [9].

Group 2
- Terence Tao manually converted the AI-generated proof into a more basic form within half an hour, indicating that the proof could be formalized and verified in Lean [11].
- Two days later, mathematician Boris Alexeev used the Harmonic Aristotle tool to complete the Lean formalization of the problem, taking two to three hours for the process [12].
- Terence Tao has been exploring the application of AI tools in mathematics, contributing to various research and proofs, including a recent paper on the topic [13].
Terence Tao's hands-on test: GPT-5 Pro cracks a three-year-old problem in 40 minutes and tops the hardest math exam
36Kr · 2025-10-13 00:31
Core Insights
- The article discusses the capabilities and limitations of AI, specifically GPT-5 Pro, in solving complex mathematical problems, highlighting the distinction between computational ability and true understanding [1][2][34].

Group 1: AI Performance in Mathematics
- GPT-5 Pro achieved a score of 13% on the challenging FrontierMath test set, indicating strong computational skills but limited understanding of deeper mathematical concepts [2][32].
- The AI demonstrated proficiency in handling structured and symbolic problems but struggled with geometric constructions and problems requiring intuition [40][41].

Group 2: Real-World Testing by Mathematician
- Mathematician Terence Tao tested GPT-5 Pro with an unsolved problem in differential geometry, seeking to explore the AI's ability to generate new ideas in unfamiliar areas [5][6][7].
- The AI successfully generated a reasoning chain for simpler cases but failed to maintain accuracy when the problem was slightly altered, revealing its tendency to reinforce incorrect paths [14][15].

Group 3: Insights Gained from AI Interaction
- Tao noted that the AI's performance helped him understand the problem better, not because it solved it, but because it illuminated the reasons for its failure [16][17].
- The experiment highlighted the importance of human intuition and situational awareness in research, suggesting that while AI can assist in calculations, it lacks the ability to grasp the broader context [44][45].

Group 4: Implications for Future Research
- The article emphasizes the need for a balance between automation and human oversight in research, as excessive reliance on AI could lead to a decline in critical thinking and understanding [38][39].
- The distinction between AI's linear intelligence and humans' topological understanding suggests a new division of labor in mathematics, where AI serves as a computational engine while humans focus on structural design and meaning [45][46].
Google and OpenAI both win ICPC 2025 gold! GPT-5 takes first place with a perfect score, and Gemini cracks a problem no human team solved
AI科技大本营· 2025-09-19 10:36
Core Viewpoint
- The participation of AI models GPT-5 and Gemini 2.5 Deep Think in the International Collegiate Programming Contest (ICPC) marks a significant milestone, showcasing their ability to compete at a level comparable to top human teams in a highly challenging algorithmic competition [1][7].

Summary by Sections

ICPC Overview
- The ICPC is recognized as the "Olympics" of computer programming, gathering top algorithmic talents from universities worldwide since the 1970s [5].
- This year's finals featured teams from 103 countries and 139 universities, with each team consisting of three students tasked with solving 12 algorithmic problems in 5 hours [5].

AI Performance
- GPT-5 achieved a perfect score by solving all 12 problems, while Gemini 2.5 Deep Think solved 10 out of 12 within 677 minutes, both reaching gold medal standards [2][8].
- Notably, no human team managed to solve all problems, with the best human team solving 11 out of 12 [2][8].

Significance of AI Participation
- The entry of AI into ICPC is particularly noteworthy as it places AI in one of the most rigorous algorithmic competitions, demonstrating its advanced capabilities [7].
- GPT-5's performance included solving 11 problems on the first attempt, with the final problem solved on the ninth submission, highlighting its efficiency [9].

Unique Problem Solving
- Gemini 2.5 Deep Think's approach to a complex problem involving a network of reservoirs showcased its innovative algorithmic thinking, which was not based on standard solutions [12].
- The problem required finding an optimal configuration for filling reservoirs in the shortest time, demonstrating Gemini's ability to create original solutions rather than relying solely on memorized data [12].

Broader Implications
- The success of GPT-5 and Gemini 2.5 Deep Think in ICPC indicates that AI has developed capabilities for on-the-spot reasoning, abstract modeling, and creative problem-solving, surpassing previous concerns about AI merely memorizing training data [14].
- This event is seen as a pivotal moment in the evolution of AI, suggesting that AI can now compete directly with human intelligence in complex problem-solving scenarios [14].
OpenAI tops the ICPC 2025 programming contest with a perfect score; Gemini also reaches gold-medal level
36Kr · 2025-09-18 09:50
Core Insights
- OpenAI and Gemini both achieved gold medal levels at the ICPC 2025, with OpenAI solving all 12 problems in 5 hours, outperforming all human teams [1][6]
- Gemini solved 10 out of 12 problems in 677 minutes, ranking second among human teams [3][20]

Group 1: Competition Overview
- The ICPC World Finals took place on September 4 in Baku, Azerbaijan, featuring top teams from early competition stages [6]
- A total of 139 teams participated, with only the top four teams receiving gold medals based on perfect solutions and time efficiency [6]

Group 2: Performance Comparison
- The top human team, from St. Petersburg State University, solved 11 problems in 1478 minutes, while OpenAI solved all 12 in 300 minutes [5][7]
- Gemini's performance included solving 8 problems in 45 minutes and the remaining 2 in the following 3 hours [20]

Group 3: AI Capabilities
- OpenAI's AI system, comprising a general reasoning model, solved 11 problems accurately on the first attempt, with the final problem requiring 9 attempts [12][7]
- Gemini utilized advanced data structures and algorithms to solve problems, demonstrating its capability in complex reasoning tasks [20][28]

Group 4: Implications for AI
- The success of AI in ICPC highlights its potential to provide innovative solutions and assist in complex reasoning, marking a shift from mere information processing to problem-solving capabilities [35]
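The comparison above follows the usual ICPC ordering: more problems solved ranks higher, and ties are broken by lower total time. The Python sketch below applies that ordering to the three figures quoted in this summary. It is an illustration only: the `Entry` class and `rank` function are hypothetical names, penalty minutes for rejected submissions are not modeled, and the 300 minutes for OpenAI is the contest duration as reported here rather than an official penalty time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    solved: int   # number of problems solved
    minutes: int  # total time as reported in the summary above


def rank(entries):
    # More problems solved first; ties broken by lower total time.
    return sorted(entries, key=lambda e: (-e.solved, e.minutes))


entries = [
    Entry("Gemini 2.5 Deep Think", 10, 677),
    Entry("St. Petersburg State University", 11, 1478),
    Entry("OpenAI", 12, 300),
]

ranking = [e.name for e in rank(entries)]
for place, name in enumerate(ranking, start=1):
    print(place, name)
```

With these inputs St. Petersburg State University places ahead of Gemini despite its larger total time, because the number of solved problems is compared first. That matches the summary's statement that Gemini would rank second among the human teams.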
Just in: OpenAI tops the ICPC 2025 programming contest with a perfect score; Gemini also reaches gold-medal level
机器之心· 2025-09-18 04:32
Core Insights
- OpenAI and Gemini have both achieved gold medal levels in the ICPC 2025 competition, showcasing significant advancements in AI capabilities in competitive programming [1][26][46]

Group 1: OpenAI's Performance
- OpenAI solved all 12 problems in 5 hours, outperforming all human teams and achieving the highest rank [1][10]
- The AI system submitted correct answers for 11 problems on the first attempt, with the most challenging problem solved after 9 attempts [10][11]
- OpenAI's participation utilized a "general reasoning model ensemble" without any specific optimizations for the ICPC competition [15]

Group 2: Gemini's Performance
- Gemini solved 10 out of 12 problems in 677 minutes, ranking second among human teams [3][28]
- The AI began its competition 10 minutes late but still achieved gold-level performance [28]
- Gemini demonstrated advanced problem-solving capabilities, including solving a problem that no human team could [33][38]

Group 3: Competition Context
- The ICPC is recognized as the largest and most prestigious university-level programming competition, attracting participants from nearly 3,000 universities across 103 countries [6][46]
- The competition emphasizes the importance of perfect solutions and time management, with only the top four teams receiving gold medals [6][46]

Group 4: Implications for AI
- The success of AI in the ICPC highlights its potential to provide innovative solutions and complement human expertise in complex problem-solving scenarios [46]
- AI is transitioning from a mere information processing tool to a key player in assisting with intricate reasoning tasks [46]
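AI dominates the ICPC World Finals: the GPT-5 combined system gets all 12 problems right to take first place, leaving humans to fight for third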
36Kr · 2025-09-18 01:56
Core Insights
- The 2025 International Collegiate Programming Contest (ICPC) World Finals showcased the impressive capabilities of AI systems: OpenAI's system solved all 12 problems, which would have ranked first had it been an official contestant [1][6]
- Google's Gemini 2.5 Deep Think model achieved gold-level performance by solving 10 out of 12 problems, a result that would have ranked second [1][12]
- The event featured 139 top teams from nearly 3,000 universities across 103 countries, highlighting the global competitiveness of the contest [3]

AI Performance
- OpenAI's system, a combination of GPT-5 and an experimental reasoning model, solved all problems within five hours, with GPT-5 independently completing the first 11 problems [6][11]
- The most challenging problem, "Problem C," was solved by both AI models, while no university team managed to solve it [4][7]
- Google's Gemini model started 10 minutes late but still managed to solve 10 problems, with a total time of 677 minutes, placing it second among university teams [12][16]

Problem-Solving Techniques
- Gemini's approach to "Problem C" involved assigning priority values to storage units and using dynamic programming to find the optimal configuration for a network of interconnected pipes [14][16]
- The success of Gemini was attributed to advancements in pre-training, post-training, reinforcement learning, multi-step reasoning, and parallel thinking [16]

Future Directions
- OpenAI's research vice president indicated that after ICPC, the focus may shift to applying various scientific and engineering skills to real-world problems, suggesting a new frontier for AI applications [19][20]
- The AI's performance in prestigious competitions like ICPC, IMO, and IOI demonstrates its growing capabilities in complex problem-solving [19]
Just in: OpenAI and Gemini both take ICPC 2025 gold, with OpenAI sweeping the field on a perfect score
36Kr · 2025-09-18 01:55
Core Insights
- Google and OpenAI both achieved gold medals at the ICPC, with OpenAI scoring a perfect score [1][3]
- This event marks a historic moment where AI has surpassed human capabilities in a top-level programming competition [4][36]

Group 1: Performance Highlights
- Gemini solved 10 out of 12 problems, earning a gold medal, while OpenAI solved all problems correctly for a perfect score [1][3][4]
- Among 139 human teams, only 3 matched Gemini's score, and no human team achieved a perfect score [4][36]
- Gemini successfully solved problem C, which no human team could solve, demonstrating its advanced problem-solving capabilities [7][10]

Group 2: Technical Achievements
- Gemini utilized a dynamic programming algorithm to find the optimal configuration for problem C, which involved complex liquid distribution through interconnected pipes [10][9]
- The model was enhanced for the competition, allowing it to think continuously over the five-hour contest [9][18]

Group 3: Implications for AI and Software Development
- The success of AI in ICPC signifies its potential as a genuine problem-solving partner for programmers [36][38]
- This achievement indicates a shift in AI capabilities from mere information processing to solving complex reasoning problems across various fields [38][36]

Group 4: Team and Research Background
- OpenAI's team included several ICPC champions, contributing to the development of their advanced models [34][32]
- The simultaneous announcement of results by both companies highlights the competitive landscape in AI development [35][36]
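AI dominates the ICPC World Finals! The GPT-5 combined system gets all 12 problems right to take first place, leaving humans to fight for third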
量子位· 2025-09-18 00:51
Core Insights
- The article discusses the impressive performance of AI systems in the 2025 International Collegiate Programming Contest (ICPC), highlighting the dominance of OpenAI's GPT-5 and Google's Gemini 2.5 models in solving complex programming problems [2][9][18].

Group 1: AI Performance in ICPC
- OpenAI's system, utilizing GPT-5 and an experimental reasoning model, solved all 12 problems in under five hours, achieving a perfect score [9][10].
- Google's Gemini 2.5 Deep Think model solved 10 out of 12 problems, reaching gold medal level, and ranked second overall [3][18].
- The competition featured 139 top teams from nearly 3,000 universities across 103 countries [5].

Group 2: Problem-Solving Challenges
- A particularly difficult problem, "Problem C," was unsolved by any university team, while both Gemini and OpenAI's models successfully tackled it [7][20].
- Gemini's approach involved assigning priority values to storage units and using dynamic programming to find optimal configurations for liquid distribution [25][26].

Group 3: Technological Advancements
- The advancements in AI models, particularly in reasoning capabilities, have significantly improved over the past year, making them smarter, faster, and more cost-effective [17].
- Gemini's success is attributed to a combination of pre-training, post-training, novel reinforcement learning techniques, and multi-step reasoning [27][28].

Group 4: Future Directions
- OpenAI's research vice president indicated that after ICPC, the focus may shift to applying AI in real-world scientific and engineering problems, suggesting a new frontier for AI applications [30][32].