GPT-5 Passes the "Gödel Test"! It Solves, with Original Approaches, Open Math Problems That Would Take a PhD Student Days

Core Viewpoint
- GPT-5 solved three of the five mathematical optimization problems that researchers posed to it, demonstrating advanced mathematical reasoning capabilities [2][21].

Group 1: GPT-5's Performance
- In a recent study, GPT-5 was given five unsolved optimization conjectures and solved three of them [2][21].
- The problems called for the level of mathematical understanding expected of PhD-level researchers rather than of high school students [3][21].
- For one problem, GPT-5 produced a novel proof that differed from the one the researchers expected but was still valid [2][21].

Group 2: The Gödel Test
- The researchers called their assessment the "Gödel Test." It consists of problems that require deep reasoning and whose solutions cannot easily be found in existing literature [10][11].
- The problems focused primarily on submodular maximization, a topic in combinatorial optimization defined by the diminishing-returns property [12][13].

Group 3: Problem-Solving Details
- The first problem asked GPT-5 to maximize a function composed of both monotone and non-monotone submodular functions under specific constraints; it provided a performance guarantee [23][24].
- The second problem asked GPT-5 to maximize a monotone submodular function under complex constraints; its solution was more reasonable than the researchers had initially anticipated [39][40].
- The third problem involved maximizing a continuous monotone function under convex constraints; GPT-5's response was generally correct but contained minor issues [59][60].

Group 4: Limitations and Challenges
- GPT-5 struggled with the fourth and fifth problems, which required integrating insights from multiple sources, highlighting its limitations in comprehensive reasoning [26][73].
- On the fourth problem, GPT-5 failed to provide a valid solution and merely restated known information; on the fifth, its output was deemed unreliable and unusable [70][81].

Group 5: Overall Assessment
- Overall, GPT-5 showed significant improvement in basic mathematical capabilities over earlier models, particularly in combinatorial optimization [26][41].
- Its performance depended on the prompt: more detailed requests led to more complete and coherent answers [26][62].
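
To make the "diminishing returns" property mentioned under Group 2 concrete, here is a minimal Python sketch. It is not taken from the study and does not reproduce any of the five problems or GPT-5's proofs. The sensor data and all function names are hypothetical, chosen only for illustration. The sketch brute-force checks that a small coverage function is submodular (adding an item to a smaller set helps at least as much as adding it to a larger one) and then runs the classic greedy heuristic for a monotone submodular function under a simple size limit, which is a far simpler setting than the constraints described above.

```python
from itertools import combinations

# Hypothetical toy data: each "sensor" covers a set of locations.
SENSORS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}


def coverage(chosen):
    """f(S) = number of distinct locations covered by the sensors in S."""
    covered = set()
    for name in chosen:
        covered |= SENSORS[name]
    return len(covered)


def marginal_gain(f, chosen, item):
    """How much f increases when `item` is added to `chosen`."""
    return f(chosen | {item}) - f(chosen)


def has_diminishing_returns(f, ground):
    """Brute-force check of submodularity on a small ground set:
    for every A subset of B and every x outside B,
    gain(A, x) >= gain(B, x)."""
    ground = frozenset(ground)
    subsets = [
        frozenset(c)
        for r in range(len(ground) + 1)
        for c in combinations(sorted(ground), r)
    ]
    for a in subsets:
        for b in subsets:
            if a <= b:
                for x in ground - b:
                    if marginal_gain(f, a, x) < marginal_gain(f, b, x):
                        return False
    return True


def greedy_max(f, ground, k):
    """Pick up to k items, each time taking the largest marginal gain."""
    chosen = frozenset()
    for _ in range(k):
        candidates = sorted(set(ground) - chosen)
        if not candidates:
            break
        best = max(candidates, key=lambda x: marginal_gain(f, chosen, x))
        if marginal_gain(f, chosen, best) <= 0:
            break  # nothing left improves the objective
        chosen = chosen | {best}
    return chosen


if __name__ == "__main__":
    print(has_diminishing_returns(coverage, SENSORS))
    picked = greedy_max(coverage, SENSORS, 2)
    print(sorted(picked), coverage(picked))
```

The brute-force check enumerates all pairs of subsets, so it is only practical for a handful of items; it is meant to show what the property says, not to serve as a testing tool for real instances.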