Core Insights
- The article discusses the challenges faced by AI models, particularly Anthropic's Claude, in playing the children's game Pokémon, highlighting a significant gap in AI capabilities compared to human players [2][3][8]
- The performance of Google's Gemini model in successfully completing a Pokémon game is attributed to its superior toolset rather than inherent intelligence [5][8]
- The article emphasizes the importance of long-term memory and continuous reasoning in AI, which are currently lacking in existing models [6][8]

Group 1: AI Performance in Pokémon
- Claude's attempts to play Pokémon resulted in numerous failures, including getting stuck for hours and making basic mistakes that a child would not [2][3]
- In contrast, Google's Gemini 2.5 Pro successfully completed a Pokémon game, showcasing the impact of a more advanced toolset that enhances AI capabilities [5]
- The differences in toolsets between Claude and Gemini highlight how essential external capabilities are for AI performance in complex tasks [5][8]

Group 2: Limitations of AI Models
- The article points out that AI struggles with tasks requiring sustained reasoning and memory over time, which are essential for success in games like Pokémon [6][8]
- Despite advancements, AI models like Claude and Gemini still face significant challenges in executing long-term goals and maintaining context over extended periods [8][11]
- The article notes that while AI can excel in specific tasks, such as exams and coding competitions, it still falls short in dynamic and open-ended environments like gaming [8][11]

Group 3: Broader Implications for AI Development
- The challenges faced in Pokémon are indicative of broader issues in the pursuit of Artificial General Intelligence (AGI), where AI models struggle with complex, multi-faceted tasks [11][24]
- The article suggests that Pokémon has become an informal benchmark for evaluating AI capabilities, as it allows for long-term tracking of reasoning and decision-making processes [24]
- The ongoing difficulties encountered by AI in games like Pokémon illustrate the limitations of current models and the need for further advancements in AI technology [24]
The World's Top Large Models Can't Beat "Pokémon": These Games Are a Nightmare for AI