Pokémon Red

GPT-5 Sets a Record Clearing Pokémon Crystal: Defeats Red in 9,517 Steps, Three Times as Efficient as o3
36Kr · 2025-08-27 06:19
Core Insights
- GPT-5 has demonstrated superior efficiency in completing the game "Pokémon Crystal," defeating the final boss, Red, in just 9517 steps, significantly fewer than the 27040 steps taken by the previous model, o3 [3][5][11]
- The performance of GPT-5 has garnered attention and praise, including acknowledgment from OpenAI's president, Greg Brockman [11]

Performance Comparison
- GPT-5 completed the main storyline of "Pokémon Crystal" with a total of 9205 steps to collect all 16 badges, while o3 required 22334 steps [5]
- In the Elite Four and Champion segment, GPT-5 used only 7329 steps compared to o3's 18115 steps, showcasing a more than twofold efficiency [8]
- Overall, GPT-5's total steps to defeat Red were about one-third of o3's, highlighting a significant improvement in gameplay efficiency [3][11]

Gameplay Mechanics
- GPT-5's success is attributed to fewer "hallucinations," better spatial reasoning, and superior goal planning compared to o3, allowing it to navigate the game world more effectively [14][15]
- The model's ability to plan long action sequences with minimal errors has contributed to its rapid progress through the game [15]

Benchmarking and Cost
- The use of Pokémon games as benchmarks for AI models is noted, with GPT-5's completion of "Pokémon Red" costing approximately $3500 in API credits, indicating the high expense associated with such testing [23]
- The integration of various tools and strategies, such as creating a mini-map and self-critique mechanisms, enhances the model's decision-making capabilities in the game [21][25]
GPT-5 Sets a Record Clearing Pokémon Crystal! Defeats Red in 9,517 Steps, Three Times as Efficient as o3!
QbitAI (量子位) · 2025-08-26 08:11
Core Viewpoint
- GPT-5 has demonstrated exceptional performance in completing the game "Pokémon Crystal," defeating the final boss, Red, in significantly fewer steps than its predecessor, o3, showcasing advances in AI capability and gameplay efficiency [1][3][21].

Summary by Sections

Performance Comparison
- GPT-5 completed "Pokémon Crystal" in just 9,517 steps, while o3 took 27,040 steps, making GPT-5 nearly three times as efficient [3][4].
- The average human player typically takes around 5 days (approximately 40 hours) to complete the game [5].
- In the main storyline, GPT-5 used only 9,205 steps to collect all 16 badges, compared to o3's 22,334 steps [10].

Efficiency in Gameplay
- From collecting the final badge to defeating Red, GPT-5 required only 312 steps, while o3 needed nearly 5,000 steps, a severalfold difference [11].
- During the Elite Four and Champion battles, GPT-5 used 7,329 steps, while o3 used 18,115 steps, again highlighting GPT-5's superior efficiency [14].

AI Model Capabilities
- The success of GPT-5 is attributed to its reduced "hallucination" rate, better spatial reasoning, and improved goal planning compared to o3 [21].
- GPT-5's ability to plan longer action sequences with minimal errors saved significant time during gameplay [21].

Benchmarking AI Models
- The article discusses the trend of AI models, including Google's Gemini and Anthropic's Claude, attempting to play Pokémon games, with varying degrees of success [23][24].
- Pokémon games serve as a benchmark for evaluating AI models' contextual understanding, decision-making, and interface control capabilities [29].

Cost of AI Gaming
- The cost of using GPT-5 for gaming is substantial, with estimates suggesting that completing "Pokémon Red" (which is half the length of "Pokémon Crystal") could cost around $3,500 [30].
- The article notes that unless one works at OpenAI, the financial barrier to using Pokémon as a benchmark for AI testing is significant [31].
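
The efficiency figures quoted in the two summaries can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the step counts reported above and assumes they are cumulative counters, which matches the quoted 312-step gap between GPT-5's 16th badge and its win over Red. The names `MILESTONES`, `ratio`, and `segment` are hypothetical helpers, not part of any tool mentioned in the articles.

```python
# Cumulative step counts quoted in the summaries, as (GPT-5, o3) pairs for
# Pokémon Crystal. Treating them as cumulative counters is an assumption,
# supported by the quoted 312-step gap between the 16th badge and Red.
MILESTONES = {
    "all_16_badges": (9205, 22334),
    "red_defeated": (9517, 27040),
}


def ratio(gpt5_steps: int, o3_steps: int) -> float:
    """How many times more steps o3 needed than GPT-5."""
    return o3_steps / gpt5_steps


def segment(start: str, end: str) -> tuple[int, int]:
    """Steps each model spent between two cumulative milestones."""
    g0, o0 = MILESTONES[start]
    g1, o1 = MILESTONES[end]
    return g1 - g0, o1 - o0


if __name__ == "__main__":
    for name, (gpt5, o3) in MILESTONES.items():
        print(f"{name}: o3 needed {ratio(gpt5, o3):.2f}x GPT-5's steps")
    gpt5_final, o3_final = segment("all_16_badges", "red_defeated")
    print(f"badges -> Red: GPT-5 {gpt5_final} steps, o3 {o3_final} steps")
```

Under these assumptions, o3 needed about 2.43 times GPT-5's steps to collect all 16 badges and about 2.84 times to defeat Red, so the headline's "three times" is a rounding up. The final stretch comes out to 312 steps for GPT-5 against 4,706 for o3, consistent with the "nearly 5,000" quoted above.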