Breaking: Claude Opus 4.5 Is the World's Best at Coding, Knocking Google and OpenAI Off the Throne
36Kr · 2025-11-25 03:33
Core Insights
- The release of Claude Opus 4.5 marks a significant advancement in AI capabilities, particularly in programming and computer use, surpassing competitors like Gemini 3 Pro and GPT-5.1 [1][3][22]
- Opus 4.5 has achieved state-of-the-art (SOTA) results in various benchmarks, indicating its superiority in coding, tool use, and reasoning abilities [3][21][22]

Performance Metrics
- In the SWE-bench Verified test, Opus 4.5 scored 80.9%, outperforming Sonnet 4.5 (77.2%) and Opus 4.1 (74.5%), while also exceeding Gemini 3 Pro (76.2%) and GPT-5.1 (77.9%) [2][23]
- Opus 4.5 achieved a 66.3% score in computer use, significantly higher than Opus 4.1 (44.4%) [2][23]
- The model scored 37.6% in the ARC-AGI-2 evaluation, showcasing its advanced reasoning capabilities [4][22]

Productivity Enhancements
- Internal evaluations indicated that using Opus 4.5 in conjunction with Claude Code resulted in an average productivity increase of 220%, with 50% of users reporting at least a 100% improvement [9][10]
- Opus 4.5 is described as a "near-complete entry-level researcher replacement" by some users, highlighting its potential to transform research workflows [9][10]

Cost and Accessibility
- The pricing for Opus 4.5 has significantly decreased, with input costs at $5 per million tokens and output costs at $25 per million tokens, making it more accessible for widespread use [11][13][71]

Tool and Feature Enhancements
- Opus 4.5 introduces new features such as the "Plan Mode" for better task planning and execution, and improved capabilities for handling complex tasks in Excel and other applications [47][75]
- The model's ability to manage multiple concurrent tasks has been enhanced, allowing for more efficient workflows [48][56]

Safety and Alignment
- Opus 4.5 is noted for being the most robust and aligned model released by Anthropic, with significant improvements in resisting prompt injection attacks compared to previous models [40][43]
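
The per-token prices quoted under Cost and Accessibility ($5 per million input tokens, $25 per million output tokens) can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost_usd` and the example token counts are hypothetical, and it assumes simple linear pricing with no discounts, caching, or batch rates, none of which the article discusses.

```python
from decimal import Decimal

# Prices as quoted in the article, in USD per million tokens.
INPUT_USD_PER_MTOK = Decimal("5")
OUTPUT_USD_PER_MTOK = Decimal("25")
TOKENS_PER_MILLION = Decimal("1000000")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate one request's cost at the quoted prices (hypothetical helper).

    Assumes plain linear pricing: tokens / 1,000,000 * price per million.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 20,000 input tokens and 4,000 output tokens.
    print(estimate_cost_usd(20_000, 4_000))
```

Under these assumptions, the hypothetical request above splits evenly: $0.10 for input and $0.10 for output, or $0.20 in total. `Decimal` is used instead of floats so that currency amounts stay exact.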