Large Model Open Source
After Lin Junyang Tenders His Resignation, Alibaba Executives Hold an Urgent Q&A | 36Kr Exclusive
36Kr· 2026-03-04 14:40
Core Insights
- The article discusses the recent upheaval within Alibaba's AI division, particularly the departure of key figure Lin Junyang, which has caused significant concern among the Qwen team and the broader AI community [6][12][22]
- Alibaba's leadership emphasized that the restructuring of the Qwen team is not a contraction but an expansion, aimed at enhancing resources and talent allocation [8][9]
- The article highlights the competitive landscape of AI, where Alibaba aims to maintain its lead in open-source models while also striving to catch up in proprietary flagship models [21][23]

Group 1: Team Restructuring and Leadership Changes
- Lin Junyang, a pivotal leader in Alibaba's AI efforts, announced his departure, leading to uncertainty within the Qwen team [6][12]
- The All Hands meeting revealed that the leadership is committed to expanding the Qwen team and addressing resource allocation issues, despite internal challenges [8][9]
- New leadership roles are being discussed, with Hao Zhou from Google DeepMind expected to take over Lin's responsibilities [16][18]

Group 2: Strategic Direction and Resource Allocation
- Alibaba's AI strategy has recently undergone significant changes, with a focus on integrating various model modalities to enhance training efficiency [11][22]
- The Qwen family has released over 400 models since 2023, showcasing a wide range of parameter scales, but the team remains small compared to competitors like ByteDance [23]
- The article notes that while Alibaba has gained a strong reputation in open-source, it faces challenges in scaling its proprietary models to keep pace with rivals [21][23]

Group 3: Community and Market Impact
- Lin's departure has sparked reactions from the AI community, with many expressing gratitude for his contributions to the Qwen project [15][22]
- The article suggests that the loss of key personnel could delay the development of Qwen models by six months to a year, impacting Alibaba's competitive position [15][22]
- The ongoing adjustments within Alibaba's AI division reflect broader trends in the industry, where rapid changes in strategy and resource allocation are critical for success [22][23]
StepFun Fully Open-Sources Step 3.5 Flash: OpenClaw Call Volume Surges to Top 2
IPO Zaozhidao (IPO早知道)· 2026-03-04 05:19
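This is an original article from IPO早知道. Author | Stone Jin. WeChat official account | ipozaozhidao

According to IPO早知道, after open-sourcing the Step 3.5 Flash model, StepFun has now also open-sourced the pre-training weights (Base) and mid-training weights (Midtrain) of this agent foundation model, together with the accompanying Steptron training framework. At a time when large-model open-sourcing is turning more conservative, the move is notably thorough and has drawn an enthusiastic response from the open-source community.

Step 3.5 Flash reportedly uses a sparse MoE architecture with 196 billion total parameters, of which only about 11 billion are activated during inference. On single-request coding tasks, its inference speed can reach up to 350 TPS. The model is designed for agent scenarios and performs well on complex reasoning and long-chain tasks; the company says its reasoning depth is comparable to that of some top closed-source models.

Everything from pre-training to the framework is open-sourced.

Among developers and in real-world applications, Step 3.5 Flash has quickly gained market validation. To date, the model has been downloaded more than 300,000 times on Hugging Face and has reached first place on OpenRouter Trending, earning strong community recognition. On the well-known open-source project OpenClaw (nicknamed "小龙虾", or "little crayfish", by Chinese internet users), the model has risen into the top two. ...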
Alibaba Open-Sources Three Mid-Size Qwen3.5 Models: Performance No Longer Depends on Parameter Stacking
Feng Huang Wang· 2026-02-25 07:49
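According to the official introduction, Qwen3.5-35B-A3B already outperforms the larger previous-generation models Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, while Qwen3.5-122B-A10B and the 27B version further narrow the gap between mid-size models and frontier models, performing especially well in complex agent scenarios. This indicates that performance now outpaces scale: it no longer relies solely on parameter stacking but is driven by architectural optimization, improved data quality, and reinforcement learning.

Ifeng Tech, February 25: The Qwen team officially announced the open-source release of the latest mid-size Qwen3.5 models: Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. ...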
Qwen3.5 Open-Source Family Expands
Cai Jing Wang· 2026-02-25 07:04
Core Insights
- Following the initial open-source release of the flagship model Qwen3.5-397B-A17B, the company has open-sourced three more models: Qwen3.5-122B-A10B, Qwen3.5-35B-A3B, and Qwen3.5-27B (Dense) [1]
- The Qwen3.5-Flash API has officially launched on Alibaba Cloud [1]
Alibaba Releases Three New Mid-Size Qwen3.5 Models
Mei Ri Jing Ji Xin Wen· 2026-02-25 06:50
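Mei Ri Jing Ji Xin Wen AI news flash, February 25: Alibaba continues to open-source the Qwen3.5 model series. This release covers three new mid-size models: Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. Qwen3.5-Flash is now available on Alibaba Cloud Bailian (百炼), with input priced as low as 0.2 yuan per million tokens. ...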
Qwen: Qwen3.5-Flash Arrives, Three Mid-Size Models Fully Open-Sourced
Xin Lang Cai Jing· 2026-02-25 06:44
Core Insights
- Three Qwen3.5 models have been officially open-sourced: Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B, showing significant performance improvements over previous, larger models [1][2][12]

Model Performance
- Qwen3.5-35B-A3B outperforms the larger previous-generation models Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, indicating that performance is now driven by architectural optimization, data quality enhancement, and reinforcement learning rather than parameter scaling alone [1][3][13]
- Qwen3.5-122B-A10B and Qwen3.5-27B further narrow the performance gap between medium-scale models and cutting-edge models, particularly excelling in complex agent scenarios [1][3][13]

Architectural Innovations
- Qwen3.5 employs a hybrid attention mechanism combined with a highly sparse MoE architecture, trained on a larger scale of mixed text and visual tokens, achieving greater performance with fewer total and active parameters [3][10][15]
- The new models have surpassed larger models in various authoritative benchmarks, including IFBench, GPQA, HMMT 25, MMMLU, BFCL v4, and SWE-bench Verified [3][10][15]

Developer Accessibility
- The Qwen3.5-27B model is designed for local deployment, featuring enhanced agent capabilities and native multimodal abilities, outperforming GPT-5 mini in multiple agent evaluations [4][16]
- The Qwen3.5-Flash API service is available on Alibaba Cloud, priced at 0.2 yuan per million tokens, offering high performance and cost-effectiveness for developers and enterprises [5][17]

Community Support
- All three models are available on platforms such as ModelScope (魔搭) and Hugging Face, along with the open-sourced Qwen3.5-35B-A3B-Base model to support community research, fine-tuning, and secondary development [7][19]
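The per-token price quoted above lends itself to a quick back-of-the-envelope estimate. The sketch below is illustrative only: it assumes a single flat rate of 0.2 yuan per million tokens, with no tiering and no separate output-token price, and `estimate_cost_yuan` is a hypothetical helper rather than part of any Alibaba Cloud SDK.

```python
def estimate_cost_yuan(tokens: int, price_per_million: float = 0.2) -> float:
    """Estimate API spend in yuan for a given token count.

    Assumption: one flat price per million tokens (0.2 yuan, the figure
    quoted in the article). Actual billing rules may differ.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 50 million tokens in a month.
    print(f"{estimate_cost_yuan(50_000_000):.2f} yuan")
```

Under these assumptions, a workload of 50 million tokens would come to roughly 10 yuan.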
Alibaba's Qwen Announces Expansion of the Qwen3.5 Open-Source Family
Di Yi Cai Jing· 2026-02-25 02:15
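According to Tongyi Lab's official social media account, following the first open-source release of the flagship model Qwen3.5-397B-A17B, the lab has now further open-sourced Qwen3.5-122B-A10B, Qwen3.5-35B-A3B, and Qwen3.5-27B (Dense). In addition, the Qwen3.5-Flash API has officially launched on Alibaba Cloud Bailian (百炼). (Source: Di Yi Cai Jing) ...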
Small Beats Big at High Cost-Performance: Qwen's Real Trump Card for the Spring Festival Season Arrives
Xin Lang Cai Jing· 2026-02-17 05:14
Core Viewpoint
- Alibaba has launched the Qwen3.5-Plus model, which is positioned as the strongest open-source model globally and is said to outperform top closed-source models such as Gemini 3 Pro and GPT-5.2. Its API price of 0.8 yuan per million tokens is only 1/18 of Gemini 3 Pro's cost [2][8].

Group 1: Technical Innovations
- The Qwen3.5-Plus model has 397 billion total parameters, of which only 17 billion are activated during inference, allowing it to use less than 5% of the computational power while accessing its full knowledge base [5].
- The introduction of a mixed attention mechanism enables the model to allocate attention resources dynamically based on the importance of information, optimizing computational efficiency [4].
- The model has transitioned from a pure text model to a native multimodal model, significantly enhancing its capabilities in reasoning, programming, and various assessments, and surpassing Gemini 3 Pro and GPT-5.2 on some benchmarks [5].

Group 2: Business Logic Behind Open Source
- The efficiency achieved through architectural innovation allows Qwen3.5 to be both powerful and cost-effective, making advanced AI capabilities accessible to individual developers, startups, and small enterprises [7].
- Since its open-source launch in 2023, Alibaba has released over 400 Qwen models, achieving over 1 billion downloads, and has become the leading open-source model family, recognized for its developer-friendly approach [7].
- The Qwen model continues to evolve, now supporting 201 languages and expanding its vocabulary size from 150,000 to 250,000, which can enhance encoding efficiency for less common languages by up to 60% [7].

Group 3: Market Position and Growth
- According to Omdia, Alibaba Cloud's market share in China's cloud market increased from 34% to 36% in Q3 2025, maintaining its position as the market leader [9].
- AI has become a major driver of new demand for cloud infrastructure services, with Alibaba Cloud's AI-related product revenue experiencing triple-digit year-on-year growth for nine consecutive quarters [9].

Group 4: Agent Capabilities
- The Qwen3.5 model has achieved breakthroughs in agent applications, enabling it to autonomously carry out tasks on mobile devices and computers, significantly improving operational efficiency [11].
- The Qwen App has launched the world's first consumer-grade AI shopping agent, which successfully processed 120 million orders in just six days during the Spring Festival, demonstrating its commercial viability [11].
- Developers can access the new Qwen3.5-Plus model through platforms such as Hugging Face and Alibaba Cloud, with plans for further releases of different model sizes and functionalities [11].
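The "less than 5%" figure in Group 1 follows directly from the parameter counts given there. As a rough check, the sketch below assumes that per-token compute scales with the share of activated parameters, which is a simplification; `active_fraction` is a hypothetical helper introduced only for illustration.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters activated per token in a sparse MoE model.

    Both arguments are in billions of parameters. Treating this share as a
    proxy for compute is an assumption made for illustration.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    if not 0 <= active_params_b <= total_params_b:
        raise ValueError("active_params_b must be between 0 and total_params_b")
    return active_params_b / total_params_b


if __name__ == "__main__":
    # Figures quoted in the article: 397B total, 17B activated.
    share = active_fraction(397, 17)
    print(f"Activated share: {share:.1%}")
```

With the article's figures, 17 billion out of 397 billion works out to about 4.3%, consistent with the stated bound.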
Alibaba to Open-Source Its Next-Generation Qwen3.5 Model on Lunar New Year's Eve
Di Yi Cai Jing· 2026-02-16 02:13
Core Insights
- Alibaba is set to open-source its next-generation Qwen3.5 model on Lunar New Year's Eve, marking a significant innovation in model architecture [1]

Group 1
- The new Qwen3.5 model represents a comprehensive innovation in its architecture [1]
The Hard-Fought Battle of 2025: Large Models Revolve Around Open Source
36Kr· 2025-12-25 10:29
Core Viewpoint
- By 2025, open-source will dominate the landscape of large models, with a significant increase in the number of users adopting open-source models globally, marking a shift in the competitive dynamics between open and closed-source approaches [1][20].

Group 1: Open-Source vs Closed-Source Dynamics
- The debate between open-source and closed-source large models has been ongoing, with both sides presenting strong arguments, but the trend is shifting towards open-source as more major internet companies adopt this approach [1][5].
- Closed-source models, initially seen as the only viable path due to advantages in data security and commercial monetization, are now facing challenges in areas like AI accessibility and ecosystem development [3][10].
- The emergence of open-source models has created a new competitive landscape, with companies like Meta and Alibaba leading the charge in open-source initiatives [5][10].

Group 2: Impact of DeepSeek
- The introduction of DeepSeek has significantly altered the competitive balance, demonstrating that open-source models can achieve high performance at lower costs, thus attracting more companies to switch to open-source strategies [7][20].
- DeepSeek's training cost was approximately $294,000, with a training duration of about 80 hours, showcasing a more efficient approach compared to traditional methods [7].
- Open-source models like DeepSeek and Qwen have reportedly matched or even surpassed the performance of leading international products, shifting the focus of competition from pure performance to cost, efficiency, and commercialization capabilities [8][20].

Group 3: Market Trends and User Engagement
- The AI application market is rapidly evolving, with mobile and PC active user numbers reaching 729 million and 200 million respectively by September 2025, indicating a shift towards more specialized and efficient applications [11][13].
- Open-source models are seen as the quickest path to market, fostering a collaborative ecosystem that enhances user engagement and accelerates innovation [13][14].
- Companies are increasingly recognizing the long-term commercial value of high user engagement within open-source ecosystems, leading to a competitive race among internet giants to provide comprehensive open-source solutions [15][19].

Group 4: Commercialization of Open-Source
- Open-source does not equate to free; companies are exploring various monetization strategies, including enterprise versions, commercial APIs, and cloud services, to sustain their open-source initiatives [18][19].
- Alibaba has open-sourced over 300 models, generating more than 170,000 derivative models, positioning itself as a leader in the global open-source model landscape [16].
- Baidu is integrating its self-developed Kunlun chips with open-source models, adopting a full-stack autonomous approach to enhance its competitive edge [17].