Advanced Tool-Calling Capabilities

Claude Opus 4.5 Reclaims the Coding Crown, Surpassing Gemini 3 Pro and GPT-5.1

AI前线 · 2025-11-25 05:03

Core Insights
- Anthropic's Claude Opus 4.5 has surpassed competitors in coding, agent capabilities, and computer operation, posting top scores on several benchmarks and outpacing GPT-5.1 and Gemini 3 Pro [2][14][21]

Performance Metrics
- Claude Opus 4.5 scored 80.9% on SWE-bench Verified, 59.3% on agentic terminal coding, and 88.9% on agentic tool use, outperforming previous versions and competitors [5][14]
- In a two-hour, high-pressure exam, Claude Opus 4.5 achieved the highest score ever recorded, surpassing all human candidates and demonstrating that it can understand complex codebases and identify bugs under ambiguous instructions [6][16][17]

Pricing Structure
- The latest pricing for Claude Opus 4.5 is $2.50 per million tokens for batch input and $12.50 per million tokens for batch output, significantly lower than previous versions [9][10]

Advanced Tool Use
- Claude Opus 4.5 features enhanced advanced tool use: it can select tools, write automation scripts, and learn how tools are meant to be used. These capabilities are integrated into the Claude developer platform [23][31]
- The newly introduced Claude for Excel processes data efficiently without overwhelming the model with raw data [26][28]

User Feedback
- Users report that Claude Opus 4.5 genuinely understands what they need and completes tasks that were challenging for earlier models such as Sonnet 4.5 [15][16]
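
To make the batch prices quoted under Pricing Structure concrete, here is a minimal sketch of the arithmetic. The helper name `batch_cost_usd` and the example workload are assumptions for illustration only. The two rates are the figures reported above, not values read from any API.

```python
# Batch prices as quoted in the summary above (USD per million tokens).
BATCH_INPUT_USD_PER_MTOK = 2.50
BATCH_OUTPUT_USD_PER_MTOK = 12.50


def batch_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a batch job from token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * BATCH_INPUT_USD_PER_MTOK
             + output_tokens * BATCH_OUTPUT_USD_PER_MTOK)
    # Prices are per million tokens, so scale back down at the end.
    return total / 1_000_000


if __name__ == "__main__":
    # Made-up workload: 4M input tokens and 1M output tokens.
    print(f"${batch_cost_usd(4_000_000, 1_000_000):.2f}")  # prints "$22.50"
```

At these rates an output token costs five times as much as an input token, so for output-heavy jobs the output side dominates the bill.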