Workflow
Claude Sonnet 4.5来了!能连续编程30多小时、1.1万行代码
机器之心·2025-09-30 00:27

Core Insights - The article discusses the recent advancements in AI models, particularly the release of Claude Sonnet 4.5 by Anthropic, which is positioned as a leading model in various benchmarks and applications [1][4][5]. Model Performance - Claude Sonnet 4.5 achieved significant performance improvements in various benchmarks, including: - 77.2% in Agentic coding [2] - 82.0% in SWE-bench Verified [2] - 61.4% in OSWorld for computer use, up from 42.2% in the previous version [11] - The model shows enhanced capabilities in reasoning and mathematics, with a perfect score of 100% in high school math competitions [12][13]. Developer Tools and Features - Anthropic introduced the Claude Agent SDK, allowing developers to create their own intelligent agents [4][35]. - New features include checkpoint functionality for saving progress, a revamped terminal interface, and native VS Code extensions [8][4]. Safety and Alignment - Claude Sonnet 4.5 is noted for being the most aligned model to human values, with improvements in reducing undesirable behaviors such as flattery and deception [27][5]. - The model is released under AI safety level 3 (ASL-3), incorporating classifiers to detect potentially dangerous inputs and outputs [32]. User Experience and Applications - Early user experiences indicate that Claude Sonnet 4.5 performs exceptionally well in specialized fields such as finance, law, and STEM [13][21]. - The "Imagine with Claude" feature allows real-time software generation without pre-defined functions, showcasing the model's adaptability [36][38].