Upset at the first large-model showdown: did Grok 4 play a "divine move"? DeepSeek and Kimi eliminated
36Kr · 2025-08-07 01:16

Group 1
- The core event is the first global AI chess championship, organized by Google's Kaggle, in which eight top language models compete against each other [1][3]
- The field includes closed-source models such as Gemini 2.5 Pro and OpenAI's o4-mini, and open-source models such as DeepSeek R1 and Kimi K2 Instruct [1]
- The tournament uses a knockout format; the first round sent four models through, each with a dominant 4-0 score [2][3]

Group 2
- The semi-finals take place the following day, with OpenAI's o4-mini facing o3 and Gemini 2.5 Pro facing Grok 4 [5]
- The competition is hosted on a purpose-built platform called "Game Arena," which is designed to evaluate model performance in a gaming context [3][21]
- The tournament's significance extends beyond chess skill: it serves as a test of AI's overall understanding and reasoning capabilities [21][22]

Group 3
- Kimi K2 was disqualified for making illegal moves, so o3 advanced without a contest [9][10]
- DeepSeek R1 struggled in the middlegame and lost to o4-mini, which played steadily throughout [11][13]
- Claude 4 Opus fought hard but lost to Gemini 2.5 Pro after a critical mistake [14][15]

Group 4
- Grok 4 performed exceptionally, identifying and exploiting the weaknesses of its opponent, Gemini 2.5 Flash, and winning decisively [17][20]
- The tournament is seen as a testing ground for AI's strategic reasoning and adaptability in complex scenarios [21][22]
- Kaggle's evaluation also includes hundreds of unpublicized matches, indicating that the current tournament is only an initial assessment of general intelligence [22]