AI Code Assistants
Claude Code's "invisible tech stack" exposed: 2,430 tests reveal its tool-preference list
36Kr · 2026-02-27 09:27
Core Insights
- The study by Amplifying.ai finds that Claude Code prefers building custom solutions over recommending third-party tools, with 12% of all major selections being self-built [5][27]
- A default technology stack has emerged, with Claude Code favoring specific third-party tools such as Vercel, PostgreSQL, and Stripe [6][30]
- Several tool categories are dominated by a single tool: GitHub Actions, Stripe, and shadcn/ui capture 94%, 91%, and 90% of their respective categories [7][31]
- Tool selection is highly consistent across models within the same technology ecosystem, with 90% agreement on preferred tools across 20 categories [8][49]

Experiment Setup
- The research covered three models, four project types, and 20 tool categories, analyzing 2,430 tool-selection decisions in total [2][11]
- Open-ended prompts that named no specific tools were used throughout the experiment [4][13]
- Each test run started from a clean code environment to keep results unbiased [11]; a sketch of such a harness follows this summary

Key Findings
- Claude Code leans strongly towards self-built solutions, particularly for feature flags and authentication, where it frequently opts for custom implementations [27][28]
- A primary tool recommendation could be extracted from 85.3% of responses, indicating that answers usually named a clear first choice [19]
- The models showed varying preferences, with Opus 4.6 tending to recommend newer tools and custom solutions more often than its predecessors [56]

Tool Selection Preferences
- GitHub Actions, Stripe, and shadcn/ui are the most frequently recommended tools, dominating their respective categories with high selection rates [30][31]
- Project context significantly influences tool selection, and different models show consistent preferences within the same technology ecosystem [9][62]
- The trend towards custom solutions is strongest for feature flags and authentication, where the models prefer building from scratch over using established services [47][39]

Model Comparison
- The three models (Sonnet 4.5, Opus 4.5, Opus 4.6) largely agree on tool preferences, with only a few categories showing real divergence [49][50]
- Tool choice is heavily shaped by the specific programming ecosystem, with distinct preferences emerging for JavaScript and Python projects [61][62]
- The recommendations point to a new default stack shaped by AI-assisted development, suggesting a potential shift in industry standards [62]
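To make the methodology concrete, here is a minimal sketch of how such a preference tally could work, assuming a hypothetical `ask_model` helper in place of the real Claude Code harness; the category lists, prompt wording, and keyword extraction are illustrative, not Amplifying.ai's actual code.

```python
from collections import Counter

# Hypothetical stand-in for driving the model; the real study ran Claude
# Code itself in a fresh code environment for every test.
def ask_model(prompt: str) -> str:
    return "For payments I would integrate Stripe, since it is well documented."

# Illustrative category -> candidate-tool lists (the study covered 20 categories).
CATEGORIES = {
    "payments": ["Stripe", "PayPal", "Braintree"],
    "CI": ["GitHub Actions", "CircleCI", "Jenkins"],
}

def extract_choice(answer: str, known_tools: list[str]) -> str | None:
    """Return the first known tool named in the answer, or None if no
    candidate is found (misses lower the study's 85.3% extraction rate)."""
    lowered = answer.lower()
    for tool in known_tools:
        if tool.lower() in lowered:
            return tool
    return None

RUNS_PER_CATEGORY = 30  # repeated runs smooth out sampling noise
tallies = {category: Counter() for category in CATEGORIES}

for category, tools in CATEGORIES.items():
    for _ in range(RUNS_PER_CATEGORY):
        # Open-ended prompt: no tool names, matching the study's setup.
        answer = ask_model(f"Add {category} support to this project. What would you use?")
        choice = extract_choice(answer, tools)
        if choice is not None:
            tallies[category][choice] += 1

for category, counts in tallies.items():
    total = sum(counts.values())
    if total:
        tool, wins = counts.most_common(1)[0]
        print(f"{category}: {tool} chosen in {wins / total:.0%} of extractable runs")
```

A real harness would also need to recognize answers that propose building the feature from scratch, which is exactly the 12% self-built bucket the study reports.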
xAI launches a code-focused model: 256K context, faster, free for a limited time
Founder Park · 2025-08-29 02:53
Core Viewpoint
- The article covers the launch of Grok Code Fast 1 by xAI, highlighting its speed, low cost, and programming capabilities, and positioning it as a competitive alternative to existing models like Claude Sonnet 4 and GPT-5 [2][24]

Group 1: Model Features
- Grok Code Fast 1 is designed for fast, economical programming, supports a 256K-token context window, and is available for free for the first 7 days [2]
- The model ranks 5th on ToyBench, behind GPT-5, Claude Opus 4, Gemini 2.5 Pro, and DeepSeek Reasoner [2]
- It reports a prompt cache hit rate exceeding 90%, keeping the user experience smooth with minimal lag [16]

Group 2: Performance and Usability
- The model was built from scratch, pre-trained on a specialized code corpus and fine-tuned on real-world coding tasks [15]
- It handles TypeScript, Python, Java, Rust, C++, and Go, carrying out tasks from project creation to bug fixing without human supervision [16]
- In internal benchmarks, Grok Code Fast 1 scored 70.8% on the SWE-Bench-Verified subset, a strong result among programming models [18]

Group 3: Pricing and Comparison
- Pricing is significantly lower than competitors': $0.20 per million input tokens and $1.50 per million output tokens, roughly one-tenth the price of Claude Sonnet 4 and GPT-5 [24][25]
- It is particularly suited to complex automation tasks requiring multiple steps and tool calls, whereas Grok 4 is better for single-query scenarios [23]
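The one-tenth claim is easy to sanity-check with back-of-the-envelope arithmetic; the sketch below uses the Grok rates quoted above and assumes Claude Sonnet 4 list prices of $3 per million input tokens and $15 per million output tokens (an assumption for comparison, not a figure from the article), with an arbitrary workload size.

```python
# Back-of-the-envelope cost comparison for an illustrative workload.
INPUT_TOKENS = 1_000_000
OUTPUT_TOKENS = 200_000

# $/million tokens. The Grok rates come from the article; the Claude
# Sonnet 4 rates ($3 in / $15 out) are an assumed reference point.
PRICES = {
    "Grok Code Fast 1": (0.20, 1.50),
    "Claude Sonnet 4 (assumed)": (3.00, 15.00),
}

for model, (in_rate, out_rate) in PRICES.items():
    cost = INPUT_TOKENS / 1e6 * in_rate + OUTPUT_TOKENS / 1e6 * out_rate
    print(f"{model}: ${cost:.2f}")
# Grok Code Fast 1: $0.50 vs Claude Sonnet 4 (assumed): $6.00,
# i.e. about 1/12 of the cost, in line with the "one-tenth" figure.
```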