Just In: Google's "IMO Gold Medal" Model Arrives in Gemini, and a Mathematician Immediately Uses It to Prove a Conjecture

Core Viewpoint
- Google has launched the Deep Think feature for Google AI Ultra subscribers. It runs on the Gemini 2.5 Deep Think model, which improves significantly on earlier versions and is designed to help researchers and mathematicians solve complex problems [1][3][4].

Summary by Sections

Model Improvements
- Gemini 2.5 Deep Think has been refined using feedback from early testers and recent research breakthroughs, and it shows notable improvements over the version first presented at the I/O conference [3].
- The released model is a variant of the one that won a gold medal at the International Mathematical Olympiad (IMO), optimized for faster reasoning and a better user experience [4].

User Experience
- Google AI Ultra subscribers can access Deep Think in the Gemini app by selecting the 2.5 Pro model and switching on "Deep Think" in the prompt bar [6].
- The model works with tools such as code execution and Google Search, and it can produce longer, more detailed responses [6].

Performance Metrics
- Deep Think posts strong benchmark results: 34.8% on HLE (without external tools), 87.6% on LiveCodeBench v6, 60.7% on IMO 2025, and 99.2% on AIME 2025. These scores reflect its reasoning ability in complex problem-solving and programming [18][20].

Problem-Solving Capabilities
- The model uses parallel thinking to generate multiple ideas at once, which lets it explore different hypotheses and reach creative solutions over extended reasoning periods [12].
- Deep Think excels at tasks that require creativity and strategic planning, such as iterative development and design, where it can improve both aesthetics and functionality from a single prompt [14].

Future Developments
- Google plans to release Deep Think, with and without tools, to trusted testers via the Gemini API in the coming weeks, in order to better understand its usability in developer and enterprise contexts [11].
- The company is also focused on enhancing the safety and security of the Gemini model during its training and deployment phases, with improvements in content safety and objectivity compared to previous versions [20].