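Just In: Alibaba Open-Sources Its Most Powerful Coding Model, with 480 Billion Parameters; Agent Scores Crush Kimi K2, Training Details Made Public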
36Kr·2025-07-22 23:53

Core Insights
- Alibaba's Qwen team has released its latest flagship programming model, Qwen3-Coder-480B-A35B-Instruct, which is claimed to be the most powerful open-source programming model to date, featuring 480 billion parameters and supporting up to 1 million tokens in context [1][2][16]
- The model has achieved state-of-the-art performance in various programming and agent tasks, surpassing other open-source models and even competing with proprietary models like GPT-4.1 [1][3][20]
- Qwen3-Coder is designed to significantly enhance productivity, allowing novice programmers to accomplish tasks in a fraction of the time it would take experienced developers [2][24]

Model Specifications
- Qwen3-Coder offers multiple sizes, with the current release being the most powerful variant at 480 billion parameters, which is greater than Alibaba's previous flagship model Qwen3 at 235 billion parameters but less than Kimi K2 at 1 trillion parameters [2][3]
- The model supports a native context of 256K tokens and can be extended to 1 million tokens, optimized for programming tasks [16][20]

Performance Metrics
- In benchmark tests, Qwen3-Coder has outperformed other models in categories such as Agentic Coding, Agentic Browser Use, and Agentic Tool Use, achieving the best performance among open-source models [1][3][20]
- Specific performance metrics include scores in various benchmarks, such as 69.6 in SWE-bench Verified and 77.5 in TAU-Bench Retail, showcasing its capabilities in real-world programming tasks [3][20]

Pricing Structure
- The API for Qwen3-Coder is available on Alibaba Cloud's platform with a tiered pricing model based on input token volume, ranging from $1 to $6 per million tokens for input and $5 to $60 for output, depending on the token range [4][5][24]
- The pricing is competitive compared to other models like Claude Sonnet 4, which has lower input and output costs [4][5]

User Experience and Applications
- Qwen3-Coder has been made available for free on the Qwen Chat web platform, allowing users to experience its capabilities firsthand [6][24]
- Users have reported impressive results in various tasks, including game development and UI design, with the model demonstrating high completion rates and aesthetic quality [9][11][12]

Future Developments
- The Qwen team is actively working on enhancing the model's performance and exploring self-improvement capabilities for coding agents [24]
- More model sizes are expected to be released, aiming to balance deployment costs and performance [24]
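
To make the tiered pricing under Pricing Structure concrete, here is a minimal Python sketch of how a per-request cost might be estimated. Only the end-point prices ($1 and $6 per million input tokens, $5 and $60 per million output tokens) come from the summary above. The `Tier` type, the `request_cost` helper, the 256,000-token cutoff, and the rule that the whole request is billed at the rate of the tier its input size falls into are all illustrative assumptions, not Alibaba Cloud's published schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int    # inclusive upper bound on input tokens for this tier
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens


# Only the end-point prices ($1/$5 and $6/$60) come from the article.
# The 256_000 cutoff is a HYPOTHETICAL placeholder, not a published boundary.
EXAMPLE_TIERS = [
    Tier(max_input_tokens=256_000, input_usd_per_m=1.0, output_usd_per_m=5.0),
    Tier(max_input_tokens=1_000_000, input_usd_per_m=6.0, output_usd_per_m=60.0),
]


def request_cost(input_tokens: int, output_tokens: int, tiers=EXAMPLE_TIERS) -> float:
    """Estimate the USD cost of one request.

    Assumption: the tier is selected by input size, and both input and
    output tokens are billed at that tier's rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Walk tiers from smallest to largest and use the first one that fits.
    for tier in sorted(tiers, key=lambda t: t.max_input_tokens):
        if input_tokens <= tier.max_input_tokens:
            return (input_tokens * tier.input_usd_per_m
                    + output_tokens * tier.output_usd_per_m) / 1_000_000
    raise ValueError("input exceeds the largest tier")


if __name__ == "__main__":
    print(f"{request_cost(100_000, 10_000):.2f}")  # short-context request
    print(f"{request_cost(500_000, 10_000):.2f}")  # long-context request
```

Under these placeholder tiers, output tokens in a long-context request cost twelve times as much as in a short one, so output pricing can dominate the bill at the top tier. Replace `EXAMPLE_TIERS` with the provider's actual schedule before relying on the numbers.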