AI Playing Pokémon Uncovers a 30-Year-Old Code Bug! Google Paper Details the AI's Full Playthrough, Showing It Can Handle Complex Tasks

Core Insights
- Google has released a technical report on the Gemini 2.5 series that highlights the model's ability to play Pokémon. The report pays particular attention to the model's behavior during gameplay, including episodes of "panic" when its Pokémon are close to fainting [1][2][3]

Group 1: AI Performance in Gaming
- Gemini 2.5 Pro completed the Pokémon game and became champion, although its first run took 831 hours, significantly longer than human players need [5]
- The AI showed strong problem-solving ability: it escaped a soft lock caused by a game bug by using a move not normally associated with that situation, which points to creative reasoning [9]
- Its long-term planning was evident when it spent more than 24 hours leveling up its Pokémon to defeat a difficult opponent [10]

Group 2: Complex Task Management
- Gemini 2.5 Pro handled complex tasks such as acquiring Hidden Machine (HM) moves, each of which requires multiple steps, showing that it can manage many sub-tasks [12][13]
- Some areas were hard for the AI. The Safari Zone took 17 attempts in the first run but only 5 in a later run, which suggests that its strategy improved between runs [14]
- In mazes and dungeons the AI had to remember locations and manage resources, which exercised its memory and spatial reasoning [16][18]

Group 3: AI Limitations and Challenges
- The AI confused different versions of the game and searched for items that do not exist in the version it was playing, a limitation in contextual understanding [27]
- The report notes instances of "context poisoning," in which the AI acted on incorrect information already in its context without questioning it [29]
- The AI often fell into "fixed mindset traps," pursuing seemingly straightforward paths that led to dead ends, which indicates a need for better adaptability in problem-solving [30]

Group 4: Ongoing Developments
- The live-streamed project of AI playing Pokémon continues, with Claude 4 now competing against Gemini 2.5 Pro [31]
- Gemini 2.5 Pro has already moved on to the next game, Pokémon Yellow, in its challenging mode [34]