Pokémon Blue - filings, earnings calls, financial reports, news

Pokémon Blue

36Kr · 2025-08-27 06:19

Core Insights
- GPT-5 completed "Pokémon Crystal" far more efficiently than the previous model, o3, defeating the final boss, Red, in just 9,517 steps versus o3's 27,040 [3][5][11]
- The result drew attention and praise, including an acknowledgment from OpenAI's president, Greg Brockman [11]

Performance Comparison
- GPT-5 completed the main storyline of "Pokémon Crystal", collecting all 16 badges, in 9,205 steps, while o3 required 22,334 steps [5]
- In the Elite Four and Champion segment, GPT-5 used only 7,329 steps compared with o3's 18,115, making it more than twice as efficient [8]
- Overall, GPT-5 needed about one-third as many steps as o3 to defeat Red [3][11]

Gameplay Mechanics
- GPT-5's success is attributed to fewer "hallucinations," better spatial reasoning, and superior goal planning compared with o3, which let it navigate the game world more effectively [14][15]
- The model's ability to plan long action sequences with minimal errors contributed to its rapid progress through the game [15]

Benchmarking and Cost
- Pokémon games are noted as a benchmark for AI models, with GPT-5's completion of "Pokémon Red" costing approximately $3,500 in API credits, which indicates how expensive such testing is [23]
- Tools and strategies integrated into the setup, such as a mini-map and self-critique mechanisms, enhance the model's decision-making in the game [21][25]

Large AI models

Pokémon games as a benchmark

Artificial Intelligence

GPT-5

Pokémon Crystal

Pokémon Red

GPT-5 beats "Pokémon Crystal" in record time! Defeats Red in 9,517 steps, three times as efficient as o3!

QbitAI (量子位) · 2025-08-26 08:11

Core Viewpoint
- GPT-5 completed "Pokémon Crystal" and defeated the final boss, Red, in far fewer steps than its predecessor, o3, showing gains in both capability and efficiency [1][3][21].

Summary by Sections

Performance Comparison
- GPT-5 completed "Pokémon Crystal" in just 9,517 steps, while o3 took 27,040 steps, making GPT-5 nearly three times as efficient [3][4].
- The average human player typically takes around 5 days (approximately 40 hours) to complete the game [5].
- In the main storyline, GPT-5 used only 9,205 steps to collect all 16 badges, compared with o3's 22,334 steps [10].

Efficiency in Gameplay
- From collecting the last badge to defeating Red, GPT-5 required only 312 steps, while o3 needed nearly 5,000 steps, a severalfold difference in speed [11].
- During the Elite Four and Champion battles, GPT-5 used 7,329 steps, while o3 used 18,115 steps, again showing GPT-5's higher efficiency [14].

AI Model Capabilities
- GPT-5's success is attributed to a lower "hallucination" rate, better spatial reasoning, and improved goal planning compared with o3 [21].
- GPT-5's ability to plan longer action sequences with minimal errors saved significant time during gameplay [21].

Benchmarking AI Models
- The article discusses the trend of AI models, including Google's Gemini and Anthropic's Claude, attempting to play Pokémon games, with varying degrees of success [23][24].
- Pokémon games serve as a benchmark for evaluating AI models' contextual understanding, decision-making, and interface control capabilities [29].

Cost of AI Gaming
- Using GPT-5 for gaming is expensive, with estimates suggesting that completing "Pokémon Red" (which is half the length of "Pokémon Crystal") could cost around $3,500 [30].
- The article notes that unless one works at OpenAI, the financial barrier to using Pokémon as a benchmark for AI testing is significant [31].
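
The step counts quoted in this summary can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the figures reported above; the dictionary layout and function names are illustrative assumptions, not part of any benchmark harness described in the article.

```python
# Step counts as reported in the summary (cumulative at each milestone).
STEPS = {
    "GPT-5": {"all_badges": 9205, "defeat_red": 9517},
    "o3": {"all_badges": 22334, "defeat_red": 27040},
}


def post_badge_steps(model: str) -> int:
    """Steps taken between collecting the 16th badge and defeating Red."""
    counts = STEPS[model]
    return counts["defeat_red"] - counts["all_badges"]


def step_ratio(milestone: str) -> float:
    """How many times more steps o3 needed than GPT-5 at a milestone."""
    return STEPS["o3"][milestone] / STEPS["GPT-5"][milestone]


if __name__ == "__main__":
    for model in STEPS:
        print(model, "post-badge steps:", post_badge_steps(model))
    print("o3 / GPT-5 at 16 badges:", round(step_ratio("all_badges"), 2))
    print("o3 / GPT-5 at Red:", round(step_ratio("defeat_red"), 2))
```

The differences come out to 312 steps for GPT-5 and 4,706 for o3, matching the "only 312" and "nearly 5,000" figures, and the ratio at the final battle is about 2.84, consistent with "nearly three times."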

Artificial Intelligence

GPT-5

A large model finally beats "Pokémon Blue"! Netizens: Gemini 2.5 Pro is awesome

QbitAI (量子位) · 2025-05-03 04:05

Core Viewpoint
- Gemini 2.5 Pro has completed "Pokémon Blue," a notable milestone for AI models in gaming contexts [1][3][18].

Group 1: Achievement and Comparison
- Gemini 2.5 Pro is the first large model to become Pokémon League Champion and enter the Hall of Fame in "Pokémon Blue" [3].
- By comparison, Claude 3.5 struggled to progress in the game and only reached the forest area, while Claude 3.7 defeated gym leaders but did not complete the game [3][9].

Group 2: Gameplay Process
- After finishing the game, Gemini kept exploring the game world with the aim of capturing Mewtwo in the Cerulean Cave, which required extensive thought and planning; a single action consumed 76,011 tokens [8][9].
- The model's decision-making process was displayed in real time, showing the reasoning behind each action [7][8].

Group 3: Challenges Faced
- Despite its success, Gemini often got lost while navigating, which indicates that AI still struggles with spatial reasoning in low-resolution environments [9][10][12].
- The model also showed limits in visual interpretation and context understanding, with difficulty recognizing in-game structures and how to interact with them [11][13][16].

Group 4: Future Implications
- Gemini's achievement suggests a potential shift in benchmarks for large models, with future assessments possibly focusing on their ability to complete games like Pokémon [19].
- Google plans to continue exploring this area [18].