Gödel Test
GPT-5 passes the "Gödel Test"! It solves, with originality, open math problems that would take PhD students days
量子位 (QbitAI) · 2025-09-25 13:00
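henry, reporting from Aofeisi
量子位 | WeChat official account QbitAI

GPT-5, you rascal! Just what else is there that I don't know about?

In a recent paper, researchers had it take on five unsolved optimization conjectures. It ended up solving three of them!

More surprising still, for one of the problems it produced a proof that was entirely different from what the researchers expected, yet equally valid.

This is no "dim" graduate student, but a "smart" PhD student capable of originality.

Sebastien Bubeck, formerly a vice president of research at Microsoft and now a scientist at OpenAI, said that unlike International Mathematical Olympiad (IMO) problems, which are written for "gifted human high schoolers," these test problems take a PhD-level researcher several days to complete.

In the paper, the researchers also pointedly "challenge" Terence Tao's impression of the mathematical ability of large language models. This means that GPT-5 can solve some genuinely open mathematical problems.

Next, let's look at how this AI math prodigy was forged.

The "Gödel" Test

As noted above, what GPT-5 took on this time was not olympiad problems but simple conjectures in advanced mathematics.

Solving such problems takes not only arithmetic skill but also a fairly strong mathematical background and logical reasoning ability.

The researchers call their test the Gödel Test.

Problems in the Gödel Test require a trained person to think for themselves in order to solve them, and no ready-made answer can be found in the existing literature. ( ...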
Just now: GPT-5 passes the "Gödel Test" for the first time, cracking three major mathematical conjectures
36Kr · 2025-09-25 07:36
Core Insights
- GPT-5 has passed the Gödel test by solving three combinatorial optimization conjectures, a significant advance in AI's mathematical capabilities [1][8].

Group 1: Breakthrough Achievements
- GPT-5 independently overturned an existing conjecture and supplied a new, valid solution, which astonished OpenAI researchers and was described as a historic moment for AI [1][8].
- The model produced nearly correct solutions to three relatively simple problems, demonstrating strong logical reasoning [4][8].

Group 2: Research Context
- The research, led by the University of Haifa and Cisco, aimed to challenge AI with open mathematical conjectures, a task that typically takes top PhD students days to solve [3][14].
- The study focused on combinatorial optimization, selecting problems that are specific and clearly motivated while remaining within the scope of mathematical reasoning [14][15].

Group 3: Problem-Solving Methodology
- Five conjectures were designed for the AI to tackle, each given with a minimal description and 1-2 reference papers for context [15][16].
- The difficulty was set so that excellent undergraduates or graduate students could solve all the problems within a day, and most problems had a clear conjecture and a known path to a solution [16].

Group 4: Specific Conjectures Solved
- Conjecture 1 involved maximizing a submodular function under convex constraints; GPT-5 applied a continuous Frank-Wolfe approach to derive a solution [20][22].
- Conjecture 2 concerned a bicriteria ("dual-index") algorithm under a p-system constraint; GPT-5 proposed a simple yet effective greedy selection process that achieves near-optimal value [25][31].
- Conjecture 3 dealt with maximizing a γ-weakly DR-submodular function under convex constraints; GPT-5 used the Frank-Wolfe method to improve the approximation ratio [32][36].
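
For readers unfamiliar with the Frank-Wolfe approach mentioned for Conjectures 1 and 3, the following is a minimal Python sketch of the textbook Frank-Wolfe-style "continuous greedy" iteration for a monotone DR-submodular objective. It is not the algorithm from the paper or GPT-5's proof. The toy quadratic objective (`B`, `H`), the budget constraint, and all function names are assumptions chosen for illustration only.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical toy objective F(x) = b.x - 0.5 * x^T H x.
# H is symmetric with nonnegative entries, so F is DR-submodular;
# b dominates the row sums of H, so F is monotone on [0, 1]^n.
B: Vector = [3.0, 2.5, 2.0, 1.0]
H: List[Vector] = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.5, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]


def objective(x: Vector) -> float:
    n = len(x)
    linear = sum(B[i] * x[i] for i in range(n))
    quadratic = sum(H[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return linear - 0.5 * quadratic


def gradient(x: Vector) -> Vector:
    n = len(x)
    return [B[i] - sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]


def linear_maximizer(grad: Vector, budget: int) -> Vector:
    """Maximize <v, grad> over P = {v in [0,1]^n : sum(v) <= budget}.

    An optimum puts weight 1 on the `budget` largest positive entries.
    """
    order = sorted(range(len(grad)), key=lambda i: grad[i], reverse=True)
    v = [0.0] * len(grad)
    for i in order[:budget]:
        if grad[i] > 0:
            v[i] = 1.0
    return v


def frank_wolfe(grad_fn: Callable[[Vector], Vector], n: int,
                budget: int, steps: int = 100) -> Vector:
    """Start at 0 and move 1/steps toward the best feasible direction each round."""
    x = [0.0] * n
    for _ in range(steps):
        v = linear_maximizer(grad_fn(x), budget)
        x = [xi + vi / steps for xi, vi in zip(x, v)]
    return x
```

Calling `frank_wolfe(gradient, n=4, budget=2)` returns a fractional point in the feasible region, since it is an average of feasible directions. In a combinatorial setting such a point would still need a rounding step, which this sketch leaves out.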

Group 5: Performance Evaluation
- GPT-5 performed well when a problem had a single, clear line of reasoning, providing nearly correct proofs for three of the five conjectures [41].
- However, it struggled to integrate different proofs, indicating a lack of comprehensive reasoning ability, which remains a significant shortcoming [44].