Open-Source Large Models
After Lin Junyang Tenders His Resignation, Alibaba Executives Hold an Urgent Q&A | 36Kr Exclusive
36Kr · 2026-03-04 14:40
Core Insights
- The article discusses the recent upheaval within Alibaba's AI division, particularly the departure of key figure Lin Junyang, which has caused significant concern among the Qwen team and the broader AI community [6][12][22]
- Alibaba's leadership emphasized that the restructuring of the Qwen team is not a contraction but an expansion, aimed at enhancing resources and talent allocation [8][9]
- The article highlights the competitive landscape of AI, where Alibaba aims to maintain its lead in open-source models while also striving to catch up in proprietary flagship models [21][23]
Group 1: Team Restructuring and Leadership Changes
- Lin Junyang, a pivotal leader in Alibaba's AI efforts, announced his departure, leading to uncertainty within the Qwen team [6][12]
- The All Hands meeting revealed that the leadership is committed to expanding the Qwen team and addressing resource allocation issues, despite internal challenges [8][9]
- New leadership roles are being discussed, with Hao Zhou from Google DeepMind expected to take over Lin's responsibilities [16][18]
Group 2: Strategic Direction and Resource Allocation
- Alibaba's AI strategy has recently undergone significant changes, with a focus on integrating various model modalities to enhance training efficiency [11][22]
- The Qwen family has released over 400 models since 2023, showcasing a wide range of parameter scales, but the team remains small compared to competitors like ByteDance [23]
- The article notes that while Alibaba has gained a strong reputation in open-source, it faces challenges in scaling its proprietary models to keep pace with rivals [21][23]
Group 3: Community and Market Impact
- Lin's departure has sparked reactions from the AI community, with many expressing gratitude for his contributions to the Qwen project [15][22]
- The article suggests that the loss of key personnel could delay the development of Qwen models by six months to a year, impacting Alibaba's competitive position [15][22]
- The ongoing adjustments within Alibaba's AI division reflect broader trends in the industry, where rapid changes in strategy and resource allocation are critical for success [22][23]
StepFun Fully Open-Sources Step 3.5 Flash: OpenClaw Call Volume Surges to Top 2
IPO早知道· 2026-03-04 05:19
Core Insights
- The article discusses StepFun (阶跃星辰) open-sourcing its Agent base model and pre-training weights, following the release of the Step 3.5 Flash model, which has generated significant interest in the open-source community [2].
Group 1
- StepFun has released the pre-training weights (Base), mid-training weights (Midtrain), and the accompanying Steptron training framework for its Agent base model, a comprehensive release in an otherwise conservative open-source climate [2].
- The Step 3.5 Flash model uses a sparse MoE architecture with 196 billion total parameters, activating only about 11 billion during inference, and reaches a maximum inference speed of 350 TPS on single-request code tasks [2].
- The model is designed specifically for agent scenarios and excels in complex reasoning and long-chain tasks, with reasoning depth comparable to some top-tier closed-source models [2].
Group 2
- The Step 3.5 Flash model has gained market validation in developer communities and practical applications, with over 300,000 downloads on Hugging Face and a first-place ranking on OpenRouter Trending, indicating high community recognition [3].
- On the well-known open-source project OpenClaw, the model has risen into the top two by call volume, reflecting its competitive edge in speed, stability, and agent adaptability [3].
- With Step 3.5 Flash now open-sourced, the continued popularity of platforms like OpenClaw may further accelerate the adoption of Chinese models in the global agent ecosystem [5].
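The sparse MoE figures quoted above (196 billion total parameters, about 11 billion active per inference step) can be turned into an activation ratio with a few lines of arithmetic. The sketch below is illustrative only: the function name `active_fraction` and the constants are hypothetical helpers, not part of any StepFun tooling, and the ratio is a rough proxy for per-token compute, not a measured speed figure.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of a sparse MoE model's parameters used per token.

    Both arguments are parameter counts in billions.
    """
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


# Figures quoted in the summary for Step 3.5 Flash (the active count is approximate).
STEP_35_FLASH_TOTAL_B = 196
STEP_35_FLASH_ACTIVE_B = 11

share = active_fraction(STEP_35_FLASH_TOTAL_B, STEP_35_FLASH_ACTIVE_B)
print(f"Active share per token: {share:.1%}")
```

With the quoted figures the active share comes out to roughly 5.6% of the total parameter count.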
Alibaba Open-Sources Three Mid-Sized Qwen 3.5 Models: Performance No Longer Depends on Stacking Parameters
Feng Huang Wang· 2026-02-25 07:49
Core Viewpoint
- Qwen 3.5 models have been officially open-sourced, showcasing advancements in performance that surpass previous larger models, indicating a shift in AI development focus from sheer parameter size to architecture optimization and data quality enhancement [1]
Group 1: Model Performance
- Qwen3.5-35B-A3B outperforms its predecessor Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B [1]
- Qwen3.5-122B-A10B and Qwen3.5-27B further narrow the performance gap between medium-scale models and cutting-edge models, particularly excelling in complex agent scenarios [1]
Group 2: Development Approach
- The advancements in Qwen 3.5 models highlight that performance is no longer solely dependent on increasing parameters but is significantly driven by architectural optimization, improved data quality, and reinforcement learning [1]
Qwen3.5 Open-Source Family Expands
Cai Jing Wang· 2026-02-25 07:04
Core Insights
- Following the initial open-source release of the flagship model Qwen3.5-397B-A17B, the company has open-sourced three more models: Qwen3.5-122B-A10B, Qwen3.5-35B-A3B, and Qwen3.5-27B (Dense) [1]
- The Qwen3.5-Flash API has officially launched on Alibaba Cloud [1]
Alibaba Releases Three New Mid-Sized Qwen 3.5 Models
Mei Ri Jing Ji Xin Wen· 2026-02-25 06:50
Core Insights
- Alibaba continues to open-source its Qwen 3.5 series models, enhancing its AI capabilities and offerings in the market [1]
Group 1: Model Details
- The latest open-sourced models are Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B, all of medium scale [1]
- Qwen3.5-Flash is now available on Alibaba Cloud's Bailian platform, indicating a strategic move to integrate these models into cloud services [1]
Group 2: Pricing Information
- Input pricing is as low as 0.2 yuan per million tokens, a competitive price in the AI model market [1]
Qwen: Qwen3.5-Flash Arrives, Three Mid-Sized Models Fully Open-Sourced
Xin Lang Cai Jing· 2026-02-25 06:44
Core Insights
- Qwen 3.5 models have been officially open-sourced, including Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B, showcasing significant performance improvements over previous larger models [1][2][12]
Model Performance
- Qwen3.5-35B-A3B outperforms the previous larger models Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, indicating that performance is now driven by architectural optimization, data quality enhancement, and reinforcement learning rather than just parameter scaling [1][3][13]
- Qwen3.5-122B-A10B and Qwen3.5-27B further narrow the performance gap between medium-scale models and cutting-edge models, particularly excelling in complex agent scenarios [1][3][13]
Architectural Innovations
- Qwen3.5 employs a hybrid attention mechanism combined with a highly sparse MoE architecture, trained on a larger scale of mixed text and visual tokens, achieving greater performance with fewer total and active parameters [3][10][15]
- The new models have surpassed larger models in various authoritative benchmarks, including IFBench, GPQA, HMMT 25, MMMLU, BFCL v4, and SWE-bench Verified [3][10][15]
Developer Accessibility
- The Qwen3.5-27B model is designed for local deployment, featuring enhanced agent capabilities and native multimodal abilities, and outperforms GPT-5 mini in multiple agent evaluations [4][16]
- The Qwen3.5-Flash API service is available on Alibaba Cloud, priced at 0.2 yuan per million tokens, offering high performance and cost-effectiveness for developers and enterprises [5][17]
Community Support
- All three models are available on platforms such as ModelScope (魔搭) and Hugging Face, along with the open-sourced Qwen3.5-35B-A3B-Base model to support community research, fine-tuning, and secondary development [7][19]
Alibaba's Qwen Announces Expansion of the Qwen3.5 Open-Source Family
Di Yi Cai Jing· 2026-02-25 02:15
Core Viewpoint
- The company has announced the further open-sourcing of its Qwen3.5 models, expanding its offerings in the AI space following the initial release of the flagship model Qwen3.5-397B-A17B [1]
Group 1
- The newly open-sourced models include Qwen3.5-122B-A10B, Qwen3.5-35B-A3B, and Qwen3.5-27B (Dense) [1]
- The Qwen3.5-Flash API has officially launched on Alibaba Cloud [1]
Small Beats Big at High Cost-Performance: Qwen's Real Spring Festival Trump Card Arrives
Sina Finance · 2026-02-17 05:14
Core Viewpoint
- Alibaba has launched the Qwen3.5-Plus model, positioned as the strongest open-source model globally and reported to outperform top closed-source models such as Gemini 3 Pro and GPT-5.2. Its API price is 0.8 yuan per million tokens, only 1/18 of Gemini 3 Pro's [2][8].
Group 1: Technical Innovations
- The Qwen3.5-Plus model has 397 billion total parameters, with only 17 billion activated during inference, so each inference step uses less than 5% of the parameters (about 4.3%) while still drawing on the model's full knowledge base [5].
- A mixed attention mechanism lets the model allocate attention resources dynamically based on the importance of information, optimizing computational efficiency [4].
- The model has moved from a pure text model to a native multimodal model, significantly enhancing its capabilities in reasoning, programming, and various assessments, and surpassing Gemini 3 Pro and GPT-5.2 on some benchmarks [5].
Group 2: Business Logic Behind Open Source
- The efficiency achieved through architectural innovation allows Qwen3.5 to be both powerful and cost-effective, making advanced AI capabilities accessible to individual developers, startups, and small enterprises [7].
- Since first open-sourcing Qwen in 2023, Alibaba has released over 400 Qwen models with over 1 billion downloads, and Qwen has become the leading open-source model family, recognized for its developer-friendly approach [7].
- The Qwen model continues to evolve, now supporting 201 languages and expanding its vocabulary size from 150,000 to 250,000, which can improve encoding efficiency for less common languages by up to 60% [7].
Group 3: Market Position and Growth
- According to Omdia, Alibaba Cloud's share of China's cloud market increased from 34% to 36% in Q3 2025, maintaining its position as the market leader [9].
- AI has become a major driver of new demand for cloud infrastructure services, with Alibaba Cloud's AI-related product revenue posting triple-digit year-on-year growth for nine consecutive quarters [9].
Group 4: Agent Capabilities
- The Qwen3.5 model has achieved breakthroughs in agent applications, enabling it to operate phones and computers autonomously to complete tasks, significantly improving operational efficiency [11].
- The Qwen App has launched the world's first consumer-grade AI shopping agent, which processed 120 million orders in just six days during the Spring Festival, demonstrating its commercial viability [11].
- Developers can access the new Qwen3.5-Plus model through platforms such as Hugging Face and Alibaba Cloud, with further releases of different model sizes and functionalities planned [11].
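The pricing claim in this summary (0.8 yuan per million tokens, described as 1/18 of Gemini 3 Pro's price) can be checked with simple arithmetic. The sketch below is a rough illustration under stated assumptions: it treats the quoted price as a single flat per-token rate (the summary does not say whether it applies to input or output tokens), the helper `api_cost_yuan` is hypothetical, and the 50-million-token workload is an invented example.

```python
from decimal import Decimal

# Figures quoted in the summary.
QWEN_PLUS_YUAN_PER_MILLION = Decimal("0.8")
PRICE_RATIO = 18  # the quoted price is said to be 1/18 of Gemini 3 Pro's


def api_cost_yuan(tokens: int, yuan_per_million: Decimal) -> Decimal:
    """Estimate API cost in yuan, assuming one flat rate per million tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return Decimal(tokens) / Decimal(1_000_000) * yuan_per_million


# Comparison price implied by the 1/18 ratio (derived, not quoted in the article).
implied_comparison_price = QWEN_PLUS_YUAN_PER_MILLION * PRICE_RATIO

# Hypothetical workload: 50 million tokens.
workload_tokens = 50_000_000
qwen_cost = api_cost_yuan(workload_tokens, QWEN_PLUS_YUAN_PER_MILLION)
comparison_cost = api_cost_yuan(workload_tokens, implied_comparison_price)

print(f"Implied comparison price: {implied_comparison_price} yuan per million tokens")
print(f"Qwen3.5-Plus cost: {qwen_cost} yuan; comparison cost: {comparison_cost} yuan")
```

`Decimal` is used instead of floats so that prices like 0.8 are represented exactly. Under these assumptions the 1/18 ratio implies a comparison price of 14.4 yuan per million tokens.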
Alibaba to Open-Source Next-Generation Qwen3.5 Model on Lunar New Year's Eve
Di Yi Cai Jing· 2026-02-16 02:13
Core Insights
- Alibaba is set to open-source its next-generation Qwen 3.5 model on Lunar New Year's Eve, marking a significant innovation in model architecture [1]
Group 1
- The new Qwen 3.5 model represents a comprehensive innovation in its architecture [1]
The Battle of 2025: Large Models Revolve Around Open Source
36Kr · 2025-12-25 10:29
Core Viewpoint
- By 2025, open-source will dominate the landscape of large models, with a significant increase in the number of users adopting open-source models globally, marking a shift in the competitive dynamics between open and closed-source approaches [1][20].
Group 1: Open-Source vs Closed-Source Dynamics
- The debate between open-source and closed-source large models has been ongoing, with both sides presenting strong arguments, but the trend is shifting towards open-source as more major internet companies adopt this approach [1][5].
- Closed-source models, initially seen as the only viable path due to advantages in data security and commercial monetization, are now facing challenges in areas like AI accessibility and ecosystem development [3][10].
- The emergence of open-source models has created a new competitive landscape, with companies like Meta and Alibaba leading the charge in open-source initiatives [5][10].
Group 2: Impact of DeepSeek
- The introduction of DeepSeek has significantly altered the competitive balance, demonstrating that open-source models can achieve high performance at lower costs, thus attracting more companies to switch to open-source strategies [7][20].
- DeepSeek's training cost was approximately $294,000, with a training duration of about 80 hours, showcasing a more efficient approach compared to traditional methods [7].
- Open-source models like DeepSeek and Qwen have reportedly matched or even surpassed the performance of leading international products, shifting the focus of competition from pure performance to cost, efficiency, and commercialization capabilities [8][20].
Group 3: Market Trends and User Engagement
- The AI application market is rapidly evolving, with mobile and PC active user numbers reaching 729 million and 200 million respectively by September 2025, indicating a shift towards more specialized and efficient applications [11][13].
- Open-source models are seen as the quickest path to market, fostering a collaborative ecosystem that enhances user engagement and accelerates innovation [13][14].
- Companies are increasingly recognizing the long-term commercial value of high user engagement within open-source ecosystems, leading to a competitive race among internet giants to provide comprehensive open-source solutions [15][19].
Group 4: Commercialization of Open-Source
- Open-source does not equate to free; companies are exploring various monetization strategies, including enterprise versions, commercial APIs, and cloud services, to sustain their open-source initiatives [18][19].
- Alibaba has open-sourced over 300 models, generating more than 170,000 derivative models, positioning itself as a leader in the global open-source model landscape [16].
- Baidu is integrating its self-developed Kunlun chips with open-source models, adopting a full-stack autonomous approach to enhance its competitive edge [17].