Major Upgrade to Google's Model! Tsinghua-Affiliated Yao Shunyu Took Part

Core Viewpoint
- Google has released a significant upgrade to its Gemini 3 Deep Think model, which is designed to tackle complex tasks in science and engineering [3][5].

Group 1: Model Performance
- Deep Think has set new records on multiple benchmarks, outperforming competitors such as Claude Opus 4.6 and GPT-5.2. On the "Humanity's Last Exam" test it scored 48.4%, ahead of Claude Opus 4.6 at 40% and GPT-5.2 at 34.5% [6].
- On the ARC-AGI-2 test, Deep Think scored an unprecedented 84.6%. Previous top models scored between 60% and 70%, with Claude Opus 4.6 at 68.8% [6].
- Deep Think's Elo rating on the Codeforces competitive programming platform is 3455, placing it among the top 8 globally [6].

Group 2: Practical Applications
- The model has performed strongly in chemistry and physics, achieving gold medal-level results on the written portions of the 2025 International Physics Olympiad and International Chemistry Olympiad [6].
- Deep Think has been used to identify subtle logical flaws in highly specialized mathematical papers that human peer reviewers had overlooked [8].
- It has also been applied to optimizing complex crystal growth methods and exploring new semiconductor materials, with successful outcomes reported in several academic settings [8].

Group 3: User Engagement and Accessibility
- Google has made Deep Think available in the Gemini app to Google AI Ultra subscribers, and to select researchers, engineers, and enterprises via the Gemini API [10].
- The model's capabilities have drawn significant interest and astonishment from professionals, particularly its performance on abstract reasoning tasks [9].