Claude Agent SDK
More for the same price: one article that explains what makes Claude Sonnet 4.5 strong
Founder Park · 2025-09-30 03:46
Core Viewpoint
- Anthropic has launched the Claude Sonnet 4.5 model, which it calls the best coding model in the world. The model can stay focused on complex multi-step tasks for more than 30 hours and is said to surpass OpenAI's GPT-5 Codex [2][9].
Pricing and Cost Efficiency
- Pricing for Claude Sonnet 4.5 is unchanged from its predecessor: $3 per million input tokens and $15 per million output tokens. Prompt caching can cut costs by up to 90%, and batch processing can save 50% [2].
Developer Tools and Integration
- Anthropic has introduced the Claude Agent SDK for developers, along with an experimental feature called "Imagine with Claude". The model is also available through platforms such as Amazon Bedrock and Google Cloud's Vertex AI [3][26].
Performance Metrics
- Claude Sonnet 4.5 achieved industry-leading scores on SWE-bench Verified and scored 61.4% on the OSWorld benchmark, a significant improvement over the previous model's 42.2% [10][12].
Enhanced Features
- New features include checkpoints in Claude Code, context editing, and memory tools, which let the model handle longer tasks and more complex operations [4][24].
Application and Usability
- Users can interact with Claude Sonnet 4.5 through the Claude.ai website and mobile applications, where code execution and file creation are integrated directly into conversations [5][6].
Safety and Alignment
- Claude Sonnet 4.5 is noted for improved alignment and safety: it shows fewer undesirable behaviors such as deception and sycophancy, and makes significant progress in defending against prompt injection attacks [24][25].
Experimental Features
- The "Imagine with Claude" feature generates software in real time, showing how the model adapts to user requests without pre-written code [31][33].
Recommendations
- Anthropic recommends that all users upgrade to Claude Sonnet 4.5 for better performance across all applications. Updates are available for both Claude Code and the developer platform [34].
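The pricing figures quoted in this summary can be turned into a rough cost estimate. The sketch below is only an illustration: the function name `estimate_cost` and its parameters are hypothetical, and it assumes that the "up to 90%" caching saving applies to the cached share of input tokens and that the 50% batch saving applies to the whole request. Actual billing rules may differ.

```python
# Prices quoted in the summary (USD per million tokens).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00

# Assumptions for illustration only: the best-case caching saving is applied
# to cached input tokens, and the batch saving is applied to the total.
MAX_CACHE_SAVING = 0.90
BATCH_SAVING = 0.50


def estimate_cost(input_tokens, output_tokens, cached_fraction=0.0, batch=False):
    """Return a rough cost estimate in USD (hypothetical helper)."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Cached tokens are billed at the discounted rate, the rest at full price.
    billable_input = uncached + cached * (1 - MAX_CACHE_SAVING)
    input_cost = billable_input / 1e6 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1e6 * OUTPUT_PRICE_PER_MTOK
    total = input_cost + output_cost
    if batch:
        total *= 1 - BATCH_SAVING
    return round(total, 6)


if __name__ == "__main__":
    # One million tokens in and one million tokens out, no discounts.
    print(estimate_cost(1_000_000, 1_000_000))
    # The same workload submitted as a batch job.
    print(estimate_cost(1_000_000, 1_000_000, batch=True))
```

Under these assumptions, the undiscounted example comes to $18 and the batch variant to $9, which makes it easy to see that output tokens dominate the bill at these rates.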
Anthropic unveils Claude Sonnet 4.5 late at night, able to work autonomously for 30 hours straight; CEO: it is more like your colleague
36Kr · 2025-09-30 03:20
Core Insights
- Anthropic has launched its new AI model, Claude Sonnet 4.5, calling it the best coding model and a powerful tool for building complex agents, capable of independently completing production-level development tasks [1][10]
- The model shows significant improvements in software coding, reaching 77.2% accuracy on the SWE-bench Verified benchmark, nearly 20 percentage points higher than its predecessor [2][5]
- Claude Sonnet 4.5 can run autonomously for 30 hours, generating 11,000 lines of code and completing a full development cycle for an enterprise chat application [2]
Performance Metrics
- The model's OSWorld benchmark score improved from 42.2% to 61.4% over four months, outperforming similar products in the industry [4][5]
- In specialized fields such as finance and law, the model's reasoning capabilities improved by over 30% compared to the earlier Opus 4.1 [4][5]
- Claude Sonnet 4.5 achieved a perfect score of 100% on high school math competition problems and 89.1% on multilingual Q&A tasks [5]
Product Ecosystem Upgrades
- Anthropic has introduced several product updates, including Claude Code 2.0, which adds a "checkpoint" function for saving code progress and rolling back instantly, improving developer efficiency [8]
- API capabilities have been strengthened, extending the AI agent's runtime from 7 hours to 30 hours for more complex tasks [8]
- The Claude for Chrome browser extension has been made available to Max subscribers, and code execution and document creation are now integrated directly into the Claude apps [8]
Developer Empowerment
- The Claude Agent SDK lets developers build customized AI assistants and addresses key challenges in agent development, such as long-term task memory management and multi-agent coordination [9]
- The SDK has already been validated by engineering teams at companies like Canva, improving codebase management and product research efficiency [9]
Safety and Compliance
- Claude Sonnet 4.5 is released under AI Safety Level 3 (ASL-3) protections, and the false positive rate of its safety classifiers has been reduced by 90% compared to earlier models [10]
- The model includes advanced detection of hazardous content and has made notable progress in defending against prompt injection attacks, a significant risk for users [10]
Commercial Strategy
- Anthropic keeps API pricing competitive and consistent with the previous model: $3 per million input tokens and $15 per million output tokens [13]
- The company positions Claude Sonnet 4.5 as the default choice for users, while still allowing access to older models for specific workflows [13]
- Analysts suggest that the launch marks a shift from AI as an "assistive tool" to AI as "independent productivity", with the open SDK potentially accelerating the adoption of agent technology across industries [13][14]
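The benchmark and runtime figures in this summary mix two kinds of comparison: percentage-point gains and relative gains. As a small worked example using only the numbers quoted above (the helper names are hypothetical), the difference can be made explicit:

```python
def percentage_point_gain(old_pct, new_pct):
    """Absolute difference between two percentage scores."""
    return round(new_pct - old_pct, 1)


def relative_gain(old_pct, new_pct):
    """Improvement expressed as a percentage of the old score."""
    return round((new_pct - old_pct) / old_pct * 100, 1)


if __name__ == "__main__":
    # OSWorld scores quoted in the summary: 42.2% -> 61.4%.
    print(percentage_point_gain(42.2, 61.4))  # 19.2
    print(relative_gain(42.2, 61.4))          # 45.5

    # Autonomous runtime quoted in the summary: 7 hours -> 30 hours.
    print(round(30 / 7, 1))                   # 4.3
```

So the OSWorld jump is 19.2 percentage points, or roughly a 45% relative improvement, and the quoted runtime is a little over four times longer.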
Just now: Claude Sonnet 4.5 makes its major debut, and a new king of coding arrives
36Kr · 2025-09-30 01:32
Core Insights
- Anthropic has officially released Claude Sonnet 4.5, described as the world's strongest coding model, with significant breakthroughs in agent construction, computer use, reasoning, and mathematics [2][3].
Performance and Benchmarking
- Sonnet 4.5 achieved top performance in several authoritative tests, including 77.2% on SWE-bench Verified for real-world software coding and 61.4% on OSWorld for simulated real computer tasks, up from 42.2% for the previous version [4][10][13].
- The model reached a 100% success rate on high school math competition problems and improved in graduate-level reasoning and multilingual Q&A [4][10].
New Features and Product Upgrades
- The release includes significant updates across the Claude product line, such as "Checkpoints" in Claude Code, which let users save progress and revert to earlier states [6].
- The Claude API has added context editing and memory tools, enabling agents to run longer and handle more complex tasks [6][34].
Developer Resources
- A new core resource, the Claude Agent SDK, provides foundational capabilities for building intelligent agents [8][9].
- The SDK is designed to support a wide range of applications beyond coding, facilitating the development of autonomous agents for complex tasks [32].
Safety and Alignment
- Sonnet 4.5 is noted for improved alignment and safety, significantly reducing harmful behaviors and strengthening defenses against prompt injection attacks [28][31].
- The model is released under the AI Safety Level 3 framework, which includes protective measures such as classifiers for sensitive content [31].
Pricing and Access
- Pricing for Sonnet 4.5 is the same as for Sonnet 4: $3 per million input tokens and $15 per million output tokens [35].
- The model is accessible through multiple channels, including the Claude API, Amazon Bedrock, and Google Cloud Vertex AI [37].
Industry Impact
- Claude Sonnet 4.5 is positioned as a powerful tool for developers and professionals in fields such as finance, medicine, and research, marking a significant advance in AI capabilities and safety [40].
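The "Checkpoints" idea mentioned in this summary (save progress, then revert to an earlier state) can be illustrated with a small in-memory sketch. This is not how Claude Code implements the feature; the class `CheckpointStore` and its methods are hypothetical names chosen for illustration, and the sketch assumes the tracked state is a plain dictionary of file contents.

```python
import copy


class CheckpointStore:
    """Toy save/rollback store (hypothetical, not Claude Code's design)."""

    def __init__(self, initial_state):
        self._state = copy.deepcopy(initial_state)
        self._checkpoints = []  # list of (label, snapshot) pairs

    @property
    def state(self):
        """The current, mutable working state."""
        return self._state

    def save(self, label):
        """Snapshot the current state and return the checkpoint index."""
        self._checkpoints.append((label, copy.deepcopy(self._state)))
        return len(self._checkpoints) - 1

    def rollback(self, index):
        """Restore the state saved at `index` and drop later checkpoints."""
        label, snapshot = self._checkpoints[index]
        self._state = copy.deepcopy(snapshot)
        del self._checkpoints[index + 1:]
        return label


if __name__ == "__main__":
    store = CheckpointStore({"app.py": "print('v1')"})
    first = store.save("before refactor")
    store.state["app.py"] = "print('v2, broken')"
    store.rollback(first)
    print(store.state["app.py"])
```

Deep copies keep each snapshot independent of later edits, at the cost of memory; a real tool would more likely store diffs or rely on version control.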
Anthropic unveils Claude Sonnet 4.5 late at night, able to work autonomously for 30 hours straight! CEO: it is more like your colleague
AI前线 (AI Frontline) · 2025-09-30 01:18
Core Insights
- Anthropic has launched its new AI model, Claude Sonnet 4.5, calling it the best coding model and a powerful tool for building complex agents, capable of independently completing production-level development tasks [2][21]
- The model shows significant improvements in software coding, reaching 77.2% accuracy on the SWE-bench Verified benchmark, nearly 20 percentage points higher than its predecessor [4][9]
- The release includes the Claude Agent SDK, which lets developers create customized AI assistants and addresses key pain points in AI agent development [12][14]
Performance Improvements
- Claude Sonnet 4.5 can run autonomously for 30 hours, generating 11,000 lines of code and completing the full development process for an enterprise chat application [4]
- On the OSWorld benchmark, the model's score improved from 42.2% to 61.4% over four months, outperforming similar products in the industry [7][9]
- The model shows over 30% improvement in reasoning in specialized fields such as finance and law compared to the earlier Opus 4.1 [7][9]
Product Ecosystem Upgrades
- The Claude Agent SDK enables developers to build tailored AI assistants for applications such as project management and customer service [12][14]
- Claude Code 2.0 introduces a highly requested "checkpoint" feature for saving code progress and rolling back instantly, improving development efficiency [13]
- API capabilities have been strengthened, extending the AI agent's runtime from 7 hours to 30 hours for more complex tasks [13]
Safety and Security Enhancements
- Claude Sonnet 4.5 is released under AI Safety Level 3 (ASL-3) protections, and the false positive rate of its safety classifiers has been reduced by 90% compared to earlier models [16]
- The model includes advanced detection of hazardous content and has made substantial progress in defending against prompt injection attacks, a major risk for users [16]
Commercial Strategy
- Anthropic keeps API pricing competitive and consistent with Claude Sonnet 4: $3 per million input tokens and $15 per million output tokens [19]
- The company positions Claude Sonnet 4.5 as the default choice, recommending it for nearly all use cases while still allowing access to older models for specific workflows [19][20]
- Industry analysts note that the release marks a shift from AI as an "assistive tool" to AI as "independent productivity" [21]
Claude Sonnet 4.5 has been flushed out: still the strongest at coding, writing code autonomously for 30 hours straight
QbitAI (量子位) · 2025-09-30 00:57
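克雷西, reporting from Aofeisi | QbitAI (official account: QbitAI)
The strongest coding model has given up its throne. But the crown has not changed hands: it is still Claude.
The newly released Claude Sonnet 4.5 scores 1.8 percentage points higher than Sonnet 4 on SWE-bench, with better quality at no extra cost.
A third party also reports that Claude Sonnet 4.5 can work for 30 hours in one stretch, writing code fully autonomously.
Over those 30 hours, Claude Sonnet 4.5 wrote more than 11,000 lines of code and built a Slack-like chat application.
Opus 4 previously drew attention for working 7 hours continuously; that figure has now more than quadrupled.
In computer use, Claude Sonnet 4.5 achieved a state-of-the-art score of 60.2 on the OSWorld test, nearly half again as high as Sonnet 4.
In short, Claude Sonnet 4.5 has surpassed its own predecessors in multiple areas, becoming the best model in those fields.
First DeepSeek-V3.2 last night, then Claude Sonnet 4.5 right after it: with new models arriving back to back before the holiday, it seems nobody is going to get a break. (doge)
Surpassing itself on multiple metrics
Here is the Claude Sonnet 4.5 report card that Anthropic posted. Besides what has already been covered ...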
Claude Sonnet 4.5 is here! It can code continuously for more than 30 hours and produce 11,000 lines of code
机器之心 (Synced) · 2025-09-30 00:27
Core Insights
- The article discusses recent advances in AI models, in particular Anthropic's release of Claude Sonnet 4.5, which is positioned as a leading model across benchmarks and applications [1][4][5].
Model Performance
- Claude Sonnet 4.5 achieved significant improvements across benchmarks, including:
  - 77.2% on SWE-bench Verified (agentic coding), rising to 82.0% with parallel test-time compute [2]
  - 61.4% on OSWorld for computer use, up from 42.2% for the previous version [11]
- The model shows enhanced reasoning and mathematics capabilities, with a perfect score of 100% on high school math competition problems [12][13].
Developer Tools and Features
- Anthropic introduced the Claude Agent SDK, allowing developers to create their own intelligent agents [4][35].
- New features include checkpoints for saving progress, a revamped terminal interface, and a native VS Code extension [8][4].
Safety and Alignment
- Claude Sonnet 4.5 is described as the model most aligned with human values, with reductions in undesirable behaviors such as sycophancy and deception [27][5].
- The model is released under AI Safety Level 3 (ASL-3), which includes classifiers that detect potentially dangerous inputs and outputs [32].
User Experience and Applications
- Early user experiences indicate that Claude Sonnet 4.5 performs exceptionally well in specialized fields such as finance, law, and STEM [13][21].
- The "Imagine with Claude" feature generates software in real time without pre-defined functions, showcasing the model's adaptability [36][38].
Anthropic launches Claude Sonnet 4.5, billed as the "world's best coding model"
Hua Er Jie Jian Wen· 2025-09-29 20:57
Core Insights
- Anthropic has launched its latest AI model, Claude Sonnet 4.5, which it claims is the "best coding model in the world" based on industry benchmarks such as SWE-bench Verified [1][4]
- Compared to previous models, the new model shows significant improvements in code generation quality, in identifying code improvements, and in following instructions [1][4]
- Experts in finance, law, and medicine have noted stronger knowledge and reasoning in Sonnet 4.5 than in older models such as Opus 4.1 [1]
Performance and Features
- Claude Sonnet 4.5 scored 61.4% on the OSWorld benchmark, up from 42.2% four months ago, a substantial leap in performance [4]
- The model is designed to run autonomously for up to 30 hours, far longer than its predecessor's 7 hours [6]
- Initial user feedback suggests that the model's output is generally better, though it may occasionally miss key details [6]
Safety and Alignment
- The model is described as the most aligned version to date, with improved behavior and fewer concerning actions such as deception and power-seeking [7]
- It is more resistant to prompt injection attacks, which can lead to malicious operations [7]
- Released under AI Safety Level 3 (ASL-3), it includes classifiers for detecting threats related to chemical, biological, radiological, and nuclear (CBRN) weapons [7]
Product Updates
- Alongside the new model, Anthropic introduced the Claude Agent SDK, aimed at helping developers build AI agents with improved memory management and autonomy [10]
- Additional updates include a "checkpoint" feature for Claude Code, a new native extension for VS Code, and code execution and file creation integrated directly into the paid apps [12]
- Pricing for Sonnet 4.5 is the same as for the previous generation, Sonnet 4, and paid subscribers can still opt for the older Opus model [3]