谷歌突发Gemini 3.1 Pro!首次采用「.1」版本号,推理性能×2的那种
量子位·2026-02-20 01:28

Core Viewpoint - The article discusses the significant upgrades of Google's Gemini 3.1 Pro model compared to its predecessor, Gemini 3 Pro, highlighting improvements in multimodal generation, semantic understanding, and reasoning capabilities [1][9][10]. Group 1: Model Upgrades - Gemini 3.1 Pro shows a noticeable enhancement in multimodal generation and semantic understanding, achieving a higher level of performance [1]. - The model can convert everyday data into interactive visual content, such as aerospace dashboards and city simulations [3][5]. - In the ARC-AGI-2 benchmark test, Gemini 3.1 Pro achieved a verification score of 77.1%, which is double that of Gemini 3 Pro [10]. Group 2: Performance Metrics - The performance comparison table indicates that Gemini 3.1 Pro outperforms other models in various benchmarks, including academic reasoning and abstract reasoning puzzles [11]. - The overall ranking score of Gemini 3.1 Pro in Arena's evaluation is 13 points higher than that of Gemini 3 Pro, with significant improvements in text and code dimensions [12]. - The model supports a context length of 1 million tokens and has a knowledge cutoff date of January 2025, enhancing its multimodal understanding and long-context performance [11]. Group 3: User Experience and Applications - Users have reported positive experiences with Gemini 3.1 Pro, generating complex visualizations and interactive applications, such as a 3D simulation of a flock of birds [17][20]. - The model has been utilized to create personal websites and educational applications, showcasing its versatility and advanced capabilities [24][25]. - The model is now available in Gemini applications and APIs, with specific access for Google AI Pro and Ultra users [29]. Group 4: Cost and Market Implications - The release of Gemini 3.1 Pro marks Google's first use of a ".1" version number, indicating a rapid pace of development in large models [30]. - The pricing for Gemini 3.1 Pro remains competitive, with input costs at $2 for less than 200k tokens and $4 for more, while output costs are $4 for less than 200k tokens and $18 for more [36]. - The cost per ARC-AGI-2 task is approximately $0.96, significantly lower than the previous model, suggesting a shift in the cost-performance curve in AI development [37][41].

谷歌突发Gemini 3.1 Pro!首次采用「.1」版本号,推理性能×2的那种 - Reportify