A Large Model Finally Beats Pokémon Blue! Netizens: Gemini 2.5 Pro Is Insanely Cool

Core Viewpoint
- Gemini 2.5 Pro has successfully completed the Pokémon Blue game, marking a significant achievement in AI capabilities, particularly in gaming contexts [1][3][18].

Group 1: Achievement and Comparison
- Gemini 2.5 Pro is the first large model to become a Pokémon League Champion and enter the Hall of Fame in Pokémon Blue [3].
- In comparison, the previous model, Claude 3.5, struggled to progress in the game, only reaching the forest area, while Claude 3.7 managed to defeat gym leaders but did not complete the game [3][9].

Group 2: Gameplay Process
- The gameplay process involved Gemini exploring the game world, specifically aiming to capture Mewtwo in the Cerulean Cave, which required extensive thought and planning, consuming 76,011 tokens for a single action [8][9].
- The model's decision-making process was displayed in real-time, showcasing its reasoning behind each action taken [7][8].

Group 3: Challenges Faced
- Despite its success, Gemini's performance highlighted challenges in navigating the game, often getting lost, indicating that AI still struggles with spatial reasoning in low-resolution environments [9][10][12].
- The model's limitations in visual interpretation and context understanding were noted, as it had difficulty recognizing in-game structures and their interactions [11][13][16].

Group 4: Future Implications
- The achievement by Gemini suggests a potential shift in benchmarks for evaluating large models, with future assessments possibly focusing on their ability to complete games like Pokémon [19].
- Google plans to continue exploring this area, indicating ongoing developments in AI gaming capabilities [18].