Don't rush to crown OpenAI just yet! Terence Tao: the value of this "gold medal" depends on the "rules of the competition"
机器之心 · 2025-07-20 03:11

Core Viewpoint
- OpenAI's new reasoning model achieved gold-medal-level performance at the International Mathematical Olympiad (IMO), solving five of six problems and scoring 35 out of 42 points, which has generated excitement in the AI community [2][6][10].

Group 1: Model Performance
- The model was tested under strict conditions mirroring those faced by human competitors, with no tools or internet assistance during the two 4.5-hour exam sessions [3][6].
- The announcement of OpenAI's success came after other AI models had performed poorly on the same problems: Gemini 2.5 Pro and OpenAI's o3 scored only 13 and 7 points, respectively [10].

Group 2: Expert Opinions
- Mathematician Terence Tao urged caution in interpreting AI models' IMO results, emphasizing that standardized testing conditions are needed before meaningful comparisons can be drawn between AI and human performance [11][15].
- Tao noted that an AI system's apparent capability can vary dramatically with the resources and methods used during testing, so the reported results may not reflect the model's true performance [15][18].

Group 3: Model Development and Future
- OpenAI's reasoning research lead, Noam Brown, acknowledged that there is still considerable room to improve the model's computational capability and test-time efficiency [34].
- The model that achieved the IMO gold-medal score is not GPT-5, and its release may still be several months away [34].

Group 4: Research Background
- Alexander Wei, who led the model's development, has a strong background in improving the reasoning capabilities of large language models, particularly in mathematical reasoning and natural-language proof generation [37][38].
- Wei previously earned recognition at the International Olympiad in Informatics and has contributed to AI systems that reached human-level performance in strategic games [40].