Alibaba Open-Sources Qwen3 at One-Third the Cost of DeepSeek-R1
Core Insights
- Alibaba has open-sourced more than 200 models, with cumulative global downloads exceeding 300 million and over 100,000 Qwen-derived models [6]
- The newly released Qwen3 has 235 billion parameters and significantly reduces costs while outperforming leading models such as DeepSeek-R1 and OpenAI-o1 [1][2]

Performance Enhancements
- Qwen3 shows substantial gains in reasoning, instruction following, tool invocation, and multilingual capability, setting new performance records among domestic and global open-source models [2]
- In the AIME25 evaluation, Qwen3 scored 81.5, surpassing the previous open-source record, and topped 70 points on LiveCodeBench, outperforming Grok3 [2][3]

Model Architecture
- Qwen3 uses a mixture-of-experts (MoE) architecture that activates only 22 billion of its 235 billion parameters per inference, delivering strong performance at reduced computational cost [1][2]
- The family includes 30B and 235B MoE models as well as dense models ranging from 0.6B to 32B, each achieving state-of-the-art performance for its size [4]

Application and Accessibility
- Qwen3 is positioned for the anticipated surge in intelligent agents and large-model applications, scoring 70.8 on the BFCL evaluation and surpassing top models such as Gemini2.5-Pro and OpenAI-o1 [5]
- The model is open-sourced under the Apache 2.0 license, supports over 119 languages, and is available for free download on platforms such as HuggingFace and Alibaba Cloud [5][6]
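The cost advantage of the MoE design can be illustrated with back-of-the-envelope arithmetic: only the activated experts run per token, so compute scales with the active parameter count rather than the full model size. The sketch below uses the figures reported in the article (235B total, 22B active); treating inference compute as roughly proportional to active parameters is a simplifying assumption, not a benchmark result.

```python
# Why an MoE model like Qwen3-235B is cheap to serve:
# each token only passes through the activated experts,
# so per-token compute tracks active parameters, not total size.

def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of the model's parameters used on each forward pass."""
    return active_params_b / total_params_b

# Figures from the article: 235B total parameters, 22B activated.
frac = active_fraction(235.0, 22.0)
print(f"Active parameters per token: {frac:.1%}")  # about 9.4% of the model
```

Under this rough model, a forward pass touches under a tenth of the weights a dense 235B model would, which is consistent with the article's claim of markedly lower serving costs.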