General Reasoning Capabilities

OpenAI Wins IOI Gold, Trailing Only the Top Five Human Contestants! Its Reasoning Model Had Just Won IMO Gold
创业邦· 2025-08-12 03:33
Core Viewpoint
- OpenAI's reasoning model achieved a gold medal score at the 2025 International Olympiad in Informatics (IOI), ranking first among AI participants and demonstrating significant advancements in general reasoning capabilities [2][9][16].

Group 1: Competition Performance
- OpenAI participated in the online AI track of IOI 2025, scoring just behind five human competitors among 330 participants, securing the top position among AI competitors [6][8].
- The model used by OpenAI was not specifically trained for IOI but was based on a general reasoning model that performed exceptionally well [8][14].
- Compared to last year's performance, OpenAI's score improved dramatically from the 49th percentile to the 98th percentile, showcasing a leap in capabilities [9].

Group 2: Model and Strategy
- OpenAI utilized the same model that won gold at the International Mathematical Olympiad (IMO) 2025 without any modifications for the IOI competition [14][15].
- The strategy involved sampling answers from different models and using a heuristic method to select submissions, which contributed to the successful outcome [14].

Group 3: Community Reaction and Future Implications
- The achievement has sparked excitement in the community, highlighting the growing strength of general reasoning abilities without specialized training [16].
- There is anticipation for OpenAI to release a public version of the technology that led to the gold medal performance, indicating potential for further advancements in AI capabilities [18].
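The summary above describes the strategy only as sampling answers from different models and selecting submissions with a heuristic; it does not say what the heuristic was. The following is a minimal sketch of that general sample-then-select pattern, not OpenAI's method. Everything in it is an assumption made for illustration: the `Candidate` type, the model names, the toy task, and the choice of "local test cases passed" as the scoring heuristic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a generated program: maps an input to an output.
Solver = Callable[[int], int]


@dataclass
class Candidate:
    source: str    # name of the (hypothetical) model that produced it
    solve: Solver  # the candidate solution itself


def heuristic_score(candidate: Candidate, tests: List[Tuple[int, int]]) -> int:
    """Assumed heuristic: count how many local test cases the candidate passes."""
    return sum(1 for x, expected in tests if candidate.solve(x) == expected)


def select_submissions(
    candidates: List[Candidate], tests: List[Tuple[int, int]], limit: int
) -> List[Candidate]:
    """Rank candidates by heuristic score (best first) and keep at most `limit`."""
    ranked = sorted(
        candidates, key=lambda c: heuristic_score(c, tests), reverse=True
    )
    return ranked[:limit]


# Toy task (illustrative only): return twice the input.
pool = [
    Candidate("model_a", lambda x: x + x),  # correct on every test
    Candidate("model_b", lambda x: x * x),  # correct only for x == 2
    Candidate("model_c", lambda x: x + 2),  # correct only for x == 2
]
local_tests = [(1, 2), (2, 4), (3, 6)]

chosen = select_submissions(pool, local_tests, limit=2)
print([c.source for c in chosen])
```

The `limit` parameter stands in for a cap on submissions per problem; the actual cap and selection criteria used in the competition are not given in the source.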
Just In: OpenAI Wins IOI Gold, Trailing Only the Top Five Human Contestants! Its Reasoning Model Had Just Won IMO Gold
机器之心· 2025-08-12 00:15
Core Insights
- OpenAI's reasoning model achieved a gold medal score at the 2025 International Olympiad in Informatics (IOI), ranking first among AI participants [1][5][9].
- The model's performance marked a significant improvement from the previous year, rising from the 49th percentile to the 98th percentile [9].
- OpenAI utilized a general reasoning model without specific training for the IOI, demonstrating the strength of its general reasoning capabilities [15][14].

Group 1
- The 2025 IOI took place in Sucre, Bolivia, from July 27 to August 3, with every member of the Chinese team winning a gold medal [1].
- OpenAI's model scored just behind five human competitors among 330 participants, adhering to the same constraints as human contestants [5][6].
- The model did not use the internet or retrieval-augmented generation (RAG), relying solely on a basic terminal tool [6].

Group 2
- OpenAI's performance in recent competitions, including AtCoder and IMO, showcases the advancements made through new research methods [9].
- The model used for IOI was the same as the one that won gold at the IMO, indicating its versatility across different competitive domains [14].
- The strategy involved sampling answers from various models and using heuristic methods to select submissions, leading to a sixth-place finish overall [14].

Group 3
- OpenAI's co-founder Greg Brockman praised the model's "gold medal-level performance" at the IOI [13].
- The success of the model without specialized training has sparked discussions about its capabilities and potential future applications [15][17].
- There is anticipation for a public version of the model that could leverage the techniques used in the recent competitions [17].
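As a quick consistency check on the figures above, placing just behind five human competitors among 330 participants implies sixth place, which lines up with the reported 98th percentile. The sketch below assumes one common percentile convention (the share of the field that placed strictly lower) and assumes the AI entry is ranked within the same field of 330; neither detail is stated in the source, and `percentile_from_rank` is a helper introduced here for illustration.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of participants who placed strictly below `rank` (1 = best)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Sixth place in a field of 330 (assumed reading of the summary).
ioi_2025 = percentile_from_rank(rank=6, total=330)
print(f"{ioi_2025:.1f}")  # prints 98.2
```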
OpenAI's IMO Gold-Medal Team Reveals: The AI Declined to Answer Problem 6
机器之心· 2025-08-03 04:21
Core Insights
- The OpenAI team reached a significant milestone by achieving a gold-medal-level score at the International Mathematical Olympiad (IMO) with a model developed by a core team of just three members [2][3][6].
- Discussions about the project date back to 2021, but focused development took place only in the last two to three months before the competition [8][9].
- The model's distinctive style of mathematical proof was described as both "atrocious" and "creative," reflecting how hard its output is for humans to read [11].

Project Timeline and Team Structure
- Winning an IMO gold medal has been a long-term goal for OpenAI, with serious discussions starting in 2021 [8].
- The core team consists of Alexander Wei, Sheryl Hsu, and Noam Brown, with Wei leading the technical development [10].

Model Performance and Challenges
- The model struggled with the hardest problems: on Problem 6 of the IMO it chose not to answer, indicating an awareness of its own limitations [12].
- The team said that, while they are excited about their progress, significant challenges remain in solving far harder mathematical problems, such as the Millennium Prize Problems [13][14].

Technical Aspects and Future Directions
- The project used a scalable parallel computing approach, emphasizing generality over specialized systems [16].
- The team opted not to use formal proof tools like Lean, focusing instead on developing general reasoning capabilities applicable to real-world problems [17].
- The infrastructure for the project was similar to that of other recent OpenAI products, reinforcing the general applicability of the techniques [18].

Future Applications and Challenges
- The team hopes to make the model available to mathematicians and is researching how this can be achieved [21].
- The team identified generating interesting questions as a difficult open challenge for AI [19].