Shunyu Yao's Google Debut: New Gemini Model Smashes SOTA Records, Leaving Only 7 Humans to Defend Carbon-Based Programming
36Kr · 2026-02-13 07:32
Core Insights

The article highlights the impressive performance of Gemini 3 Deep Think, which has achieved significant milestones in various benchmark tests, showcasing its advanced reasoning capabilities and potential applications in scientific and engineering fields [3][15][19].

Group 1: Performance Metrics
- Gemini 3 Deep Think scored an unprecedented 84.6% on the ARC-AGI-2 benchmark, surpassing previous models that scored between 60% and 70% [3][19].
- In Humanity's Last Exam (HLE), it achieved a new state-of-the-art (SOTA) score of 48.4% [3][15].
- The model also reached an Elo score of 3455 on Codeforces, ranking it 8th globally, with only 7 individuals scoring higher [1][15].

Group 2: Cost Efficiency
- The upgrade of Gemini 3 Deep Think has led to an 82% reduction in reasoning costs, from $77.16 to $13.62 per task [21][15].

Group 3: Applications and Innovations
- Gemini 3 Deep Think has demonstrated capabilities in analyzing sketches, modeling complex shapes, and generating files for 3D printing, indicating its utility in engineering tasks [7][15].
- The model identified a subtle logical flaw in a complex mathematical paper that had been overlooked in prior peer reviews, showcasing its potential in academic research [9][15].
- It optimized a method for growing complex crystals, achieving a precision previously unattainable, which could lead to new semiconductor materials [10][15].

Group 4: Team and Development
- The development team of Gemini 3 Deep Think includes notable figures such as Yi Tay and Shunyu Yao, both of whom have significant backgrounds in AI and physics [27][28].
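The two numeric claims in the summary above (the 82% cost reduction and the 8th-place ranking) can be checked with simple arithmetic. The following is a minimal sketch, not part of the original article: the function names `percent_reduction` and `rank_among` are hypothetical helpers introduced for illustration, and the human ratings are the eleven complete rows of the Codeforces leaderboard excerpt quoted in the QbitAI article.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def rank_among(rating: int, others: list[int]) -> int:
    """Return the 1-based rank `rating` would hold among `others` (higher is better)."""
    return 1 + sum(1 for r in others if r > rating)


# Per-task reasoning costs quoted in the summary (USD).
COST_BEFORE_USD = 77.16
COST_AFTER_USD = 13.62

# Ratings from the leaderboard excerpt in the article (ranks 1 to 11).
HUMAN_RATINGS = [3792, 3715, 3664, 3646, 3604, 3592, 3486, 3436, 3423, 3413, 3408]
MODEL_ELO = 3455

reduction = percent_reduction(COST_BEFORE_USD, COST_AFTER_USD)
model_rank = rank_among(MODEL_ELO, HUMAN_RATINGS)

print(f"Cost reduction: {reduction:.1f}%")
print(f"Model rank among listed competitors: {model_rank}")
```

Under these figures the reduction comes to roughly 82%, and exactly seven listed ratings exceed 3455, which is consistent with the claims as reported.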
QbitAI (量子位) · 2026-02-13 05:42
Tingyu, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Facing a fierce onslaught from Claude Opus 4.6 and GPT Codex 5.3, Google hit back with a major upgrade to Gemini 3 Deep Think.

| Rank | Who | Contests (#) | Rating (=) |
| --- | --- | --- | --- |
| 1 | Benq | [illegible] | 3792 |
| 2 | ecnerwala | 214 | 3715 |
| 3 | jiangly | 192 | 3664 |
| 4 | VivaciousAubergine | 65 | 3646 |
| 5 | Kevin114514 | 104 | 3604 |
| 6 | tourist | 297 | 3592 |
| 7 | strapple | [illegible] | 3486 |
| 8 | dXqwq | 83 | 3436 |
| 9 | maroonrk | 208 | 3423 |
| 10 | Otomachi_Una | 62 | 3413 |
| 11 | ksun48 | 313 | 3408 |
| 12 | heuristica | [illegible] | ... |