Just In: GPT-5 Passes the "Gödel Test" for the First Time, Cracking Three Major Mathematical Conjectures
36Kr · 2025-09-25 07:36

Core Insights
- GPT-5 has passed the Gödel test by solving three major combinatorial optimization conjectures, a significant advance in AI's mathematical capabilities [1][8].

Group 1: Breakthrough Achievements
- GPT-5's ability to independently overturn existing conjectures and provide new, effective solutions astonished OpenAI researchers, marking a historic moment for AI [1][8].
- The model produced near-perfect solutions to three relatively simple problems, demonstrating strong logical reasoning skills [4][8].

Group 2: Research Context
- The research, led by the University of Haifa and Cisco, set out to challenge AI with open mathematical conjectures, a task that typically takes top PhD students days to solve [3][14].
- The study focused on combinatorial optimization, selecting problems that are specific and clearly motivated while remaining within the scope of mathematical reasoning [14][15].

Group 3: Problem-Solving Methodology
- Five conjectures were designed for the AI to tackle, each given with a minimal description and 1-2 reference papers for context [15][16].
- The difficulty was set so that excellent undergraduates or graduate students could solve all of the problems within a day, and so that most problems had a clear conjecture and a known solution path [16].

Group 4: Specific Conjectures Solved
- Conjecture 1 involved maximizing a submodular function under convex constraints; GPT-5 applied a continuous Frank-Wolfe approach to derive a solution [20][22].
- Conjecture 2 concerned a bicriteria ("dual-index") algorithm under a p-system constraint; GPT-5 proposed a simple yet effective greedy selection process that achieves a near-optimal value [25][31].
- Conjecture 3 dealt with maximizing a γ-weakly DR-submodular function under convex constraints; GPT-5 used the Frank-Wolfe method to improve the approximation ratio [32][36].
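
For readers unfamiliar with the Frank-Wolfe approach mentioned for Conjectures 1 and 3, the following is a minimal sketch of the classical Frank-Wolfe-style "continuous greedy" scheme for a monotone DR-submodular objective. It is not GPT-5's proof and not the algorithm from the study. The toy weighted-coverage objective, the budget polytope, and every function name here are assumptions chosen only for illustration.

```python
from math import prod


def coverage_value(x, weights, groups):
    """Toy objective (assumed): F(x) = sum_j w_j * (1 - prod_{i in S_j} (1 - x_i)).
    It is monotone and DR-submodular on [0, 1]^n."""
    return sum(w * (1.0 - prod(1.0 - x[i] for i in g))
               for w, g in zip(weights, groups))


def coverage_gradient(x, weights, groups):
    """Exact gradient of coverage_value at x."""
    grad = [0.0] * len(x)
    for w, g in zip(weights, groups):
        for i in g:
            grad[i] += w * prod(1.0 - x[k] for k in g if k != i)
    return grad


def budget_lmo(grad, costs, budget):
    """Linear maximization oracle over the assumed convex set
    {v in [0, 1]^n : sum_i costs[i] * v[i] <= budget}.
    This is a fractional knapsack: fill coordinates by gradient/cost ratio."""
    v = [0.0] * len(grad)
    remaining = budget
    order = sorted(range(len(grad)), key=lambda i: grad[i] / costs[i], reverse=True)
    for i in order:
        if grad[i] <= 0 or remaining <= 0:
            break
        v[i] = min(1.0, remaining / costs[i])
        remaining -= v[i] * costs[i]
    return v


def frank_wolfe_continuous_greedy(grad_fn, lmo, n, steps=100):
    """Start at 0 and take `steps` moves of size 1/steps toward the feasible
    point that best aligns with the current gradient. The result is an average
    of feasible points, so it stays inside the convex set."""
    x = [0.0] * n
    for _ in range(steps):
        v = lmo(grad_fn(x))
        x = [xi + vi / steps for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    weights = [3.0, 2.0, 1.0]            # hypothetical group weights
    groups = [[0, 1], [1, 2], [2, 3]]    # hypothetical coverage structure
    costs = [1.0, 1.0, 1.0, 1.0]
    x = frank_wolfe_continuous_greedy(
        lambda z: coverage_gradient(z, weights, groups),
        lambda g: budget_lmo(g, costs, 2.0),
        n=4,
    )
    print(x, coverage_value(x, weights, groups))
```

Unlike standard Frank-Wolfe for convex problems, this variant always adds a fraction of the oracle's answer rather than moving toward it, which is what makes the usual analysis for monotone DR-submodular objectives work. For a γ-weakly DR-submodular objective the same loop can be run unchanged; only the guarantee one can prove depends on γ.
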
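
The greedy selection process mentioned for Conjecture 2 can be illustrated with the classical greedy algorithm for a monotone submodular function under an independence-system constraint. This is a sketch under stated assumptions, not the algorithm GPT-5 proposed and not the bicriteria variant studied in the research. The coverage data, the bipartite-matching constraint (a standard example of a 2-system), and all names below are hypothetical.

```python
def greedy_independence_system(ground_set, value, is_independent):
    """Repeatedly add the feasible element with the largest positive marginal
    gain; stop when no feasible element improves the value."""
    chosen = set()
    candidates = set(ground_set)
    while candidates:
        base = value(chosen)
        best, best_gain = None, 0.0
        for e in sorted(candidates):           # sorted for deterministic ties
            if not is_independent(chosen | {e}):
                continue
            gain = value(chosen | {e}) - base
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:
            break
        chosen.add(best)
        candidates.discard(best)
    return chosen


# Hypothetical instance: each (worker, task) edge covers a set of items.
COVERS = {
    ("a", "x"): {1, 2, 3},
    ("a", "y"): {3, 4},
    ("b", "x"): {4, 5},
    ("b", "y"): {6},
}


def coverage(edges):
    """Monotone submodular value: number of distinct items covered."""
    covered = set()
    for e in edges:
        covered |= COVERS[e]
    return len(covered)


def is_matching(edges):
    """Independence oracle: no two chosen edges share a worker or a task."""
    workers = [w for w, _ in edges]
    tasks = [t for _, t in edges]
    return len(set(workers)) == len(workers) and len(set(tasks)) == len(tasks)


if __name__ == "__main__":
    picked = greedy_independence_system(COVERS, coverage, is_matching)
    print(sorted(picked), coverage(picked))
```

The algorithm only needs two oracles, a value function and an independence test, so the same loop applies to any p-system. For monotone submodular objectives, this plain greedy is classically known to achieve a 1/(p+1) approximation.
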

Group 5: Performance Evaluation
- GPT-5 performed well when a problem had a single, clear reasoning path, providing nearly correct proofs for three of the five conjectures [41].
- However, it struggled to integrate different proofs, indicating a lack of comprehensive reasoning ability, which remains a significant shortcoming [44].