Anthropic Releases Its Most Powerful Coding Model, Claude Sonnet 4.5, Capable of 30 Hours of Autonomous Coding
36Kr · 2025-09-30 09:17

Core Insights
- Anthropic has launched its next-generation AI model, Claude Sonnet 4.5, which it claims is the world's most advanced and secure model for coding and complex software agents [3][4]
- The model can run autonomously for 30 hours, up from its predecessor's 7 hours, and ships alongside upgraded developer tools [4][8]

Performance Metrics
- Claude Sonnet 4.5 scored 82.0% on the SWE-bench Verified benchmark, outperforming its predecessors and competitors such as OpenAI's GPT-5 and Google's Gemini [6][7]
- On the OSWorld test it scored 61.4%, a notable increase over Sonnet 4's 42.2% [7][8]

Developer Ecosystem Enhancements
- Anthropic has expanded its developer ecosystem: Claude Code now includes a checkpoints feature that automatically saves code state, along with a native VS Code extension for seamless integration [10]
- New management tools such as "context editing" and "memory tools" are reported to have improved task performance by 39% and reduced token consumption by 84% [10]

Security and Alignment Improvements
- Claude Sonnet 4.5 is described as Anthropic's most aligned model to date, with extensive safety training that reduces the occurrence of harmful behaviors [11]
- The model has been released under the AI Safety Level 3 (ASL-3) framework, which includes filters that detect and block potentially dangerous outputs, particularly in sensitive areas [11]