Gemini 3 Deepthink
11 top mathematicians published a paper with no results, and Terence Tao recommends everyone pay attention
猿大侠· 2026-02-11 04:11
Core Insights
- The article discusses an AI experiment initiated by 11 leading mathematicians to test AI's ability to solve research-level mathematical problems, focusing on the intersection of AI and mathematics [1][6][29].

Group 1: Experiment Overview
- The experiment, named "First Proof," aims to evaluate whether current AI systems can independently solve complex mathematical problems [6][29].
- The mathematicians designed 10 research-level problems covering various branches of mathematics, including combinatorial algebra and algebraic topology, after filtering from an initial set of 20 problems [10][18].
- The problems are derived from the authors' own research and have not been published elsewhere, ensuring no data contamination [18][26].

Group 2: AI Capabilities and Limitations
- Initial tests with AI systems like GPT 5.2 Pro and Gemini 3 Deepthink showed that these systems struggled to solve most of the proposed problems in a single attempt [24].
- The mathematicians believe that allowing iterative dialogue between humans and AI could improve the quality of AI's responses [25].

Group 3: Future Directions
- The mathematicians plan to design a second set of problems in the coming months, aiming to refine the experimental design and expand the scope of testing [28].
- The ultimate goal is to develop "First Proof" into a reusable benchmark for assessing mathematical capabilities of AI, moving towards a collaborative future between mathematicians and AI [29][30].
11 top mathematicians published a paper with no results, and Terence Tao recommends everyone pay attention
量子位· 2026-02-08 04:46
Core Viewpoint
- A new AI experiment initiated by 11 top mathematicians aims to test AI's ability to solve research-level mathematical problems, exploring the boundaries of "AI + Mathematics" [1][6].

Group 1: Experiment Overview
- The experiment, named "First Proof," involves AI solving 10 research-level math problems that mathematicians have encountered in their work [6].
- The problems cover various branches of mathematics, including combinatorial algebra, graph theory, algebraic topology, stochastic analysis, and symplectic geometry [10].
- Initially, 20 problems were proposed, but only 10 were selected based on four criteria, ensuring AI can understand the problem statement and that there are no hidden answers [10][17].

Group 2: AI Capabilities and Limitations
- Current AI systems, when tested with a single attempt, struggled to solve most of the proposed problems [24].
- The mathematicians believe that allowing human-AI interaction could improve AI's performance in providing better answers [25].
- The experiment aims to assess AI's ability to complete rigorous mathematical proofs, rather than its capacity to generate new theories or definitions [23].

Group 3: Data Integrity and Future Plans
- To minimize data contamination, the experiment restricts data sharing options and ensures that the answers remain confidential during the testing phase [26][27].
- Future plans include designing a second set of problems and refining the experimental design to create a reusable and comparable benchmark for research-level mathematical capabilities [28].
- The ultimate goal is to foster human-AI collaboration in mathematics, rather than AI replacing mathematicians [29].