Cracking "ineffective parallel reasoning" in large models: Parallel-Probe arrives, improving parallel reasoning efficiency by 35.8%
机器之心· 2026-03-07 04:20
Core Insights
- The article discusses the emergence of parallel thinking in large-model inference and the inefficiencies of current parallel reasoning methods, particularly the computational resources wasted on unnecessary paths [2][5][12]

Group 1: Research Findings
- A research team from several universities introduced Parallel-Probe, a training-free parallel reasoning control algorithm that significantly reduces unnecessary computation while maintaining core accuracy [2][4]
- The study revealed that global consensus often stabilizes before all reasoning branches complete, so long-tail paths consume excessive computational resources [2][12]
- The research identified three key dynamic features of parallel reasoning: non-monotonic scaling of accuracy with compute, large differences in path lengths, and early stabilization of consensus [8][12]

Group 2: Mechanisms of Parallel-Probe
- Parallel-Probe employs two core mechanisms: Consensus-based Early Stopping, which terminates reasoning once a stable majority answer is detected, and Deviation-based Branch Pruning, which eliminates paths that deviate significantly from the global trend [13]
- The algorithm achieved a 35.8% reduction in inference latency and a 25.8% decrease in total token cost without sacrificing accuracy [2]

Group 3: Experimental Results
- Experiments demonstrate that Parallel-Probe strikes a better balance among performance, cost efficiency, and latency efficiency than existing methods such as ESC and SC [14]
- Detailed performance metrics across various models show Parallel-Probe consistently outperforming traditional methods in accuracy and resource utilization [14]

Group 4: Infrastructure Contribution
- The team also introduced SCOUT, a testing platform that decouples inference generation from control strategies, letting developers simulate various scaling strategies with minimal overhead and improving testing efficiency [15][16]
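The consensus-based early-stopping idea can be sketched as a majority-vote check over recent snapshots of each branch's running answer: stop as soon as the same answer has held a clear majority for several consecutive snapshots. This is a minimal illustration assuming majority voting as the consensus signal; the function name and the `window` and `threshold` parameters are illustrative, not taken from the paper.

```python
from collections import Counter

def consensus_early_stop(snapshots, window=3, threshold=0.6):
    """Return the stable majority answer across parallel branches, or None.

    snapshots: list of per-step snapshots; each snapshot is a list holding
    the current answer of every branch (None if that branch has not yet
    produced an answer).  The run may stop early when one answer has held
    at least a `threshold` share of the voting branches for the last
    `window` snapshots.
    """
    if len(snapshots) < window:
        return None
    winners = []
    for snapshot in snapshots[-window:]:
        votes = Counter(a for a in snapshot if a is not None)
        if not votes:
            return None  # no branch has answered yet
        answer, count = votes.most_common(1)[0]
        if count / sum(votes.values()) < threshold:
            return None  # majority not strong enough
        winners.append(answer)
    # Consensus is stable only if the same answer wins every recent snapshot.
    return winners[0] if len(set(winners)) == 1 else None
```

In this sketch, once `consensus_early_stop` returns a non-None answer, all still-running branches could be terminated, which is the source of the latency and token savings the article describes.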
Computer Industry AI 2026 Compute Series (II): From Cloud Business to the Qianwen App, Alibaba's Compute Demand Remains Robust
GF SECURITIES· 2026-01-30 07:10
Investment Rating
- The industry investment rating is "Buy" [3]

Core Insights
- Alibaba is expected to increase its investment in AI infrastructure, with capital expenditure on AI and cloud infrastructure reaching approximately 120 billion RMB over the four quarters ending November 2025; this investment is anticipated to rise from 380 billion to 480 billion RMB over the next three years [7][10]
- Alibaba Cloud's revenue for Q3 2025 was 39.8 billion RMB, up 34.5% year on year, driven by strong demand for AI products, which have posted triple-digit growth for nine consecutive quarters [12][10]
- The Qwen3 series has undergone multiple iterations; the latest Qwen3-Max-Thinking model, launched in January 2026, enhances adaptive tool-use capabilities and performs comparably to leading models such as GPT-5.2-Thinking [21][22]

Summary by Sections

Section 1: AI Demand and Infrastructure Investment
- Alibaba's AI and cloud infrastructure investment is projected to increase significantly, from roughly 120 billion RMB today toward a planned 480 billion RMB [7][10]
- The synergy between AI investment and cloud revenue growth is evident: Alibaba Cloud held a 35.8% share of China's AI cloud market in the first half of 2025 [12][10]

Section 2: Product Development and Market Position
- The Qwen3 series models have performed strongly across benchmarks, indicating sustained demand for training compute as the models continue to evolve [21][22]
- The launch of Qwen3-Max-Thinking has further solidified Alibaba's position in the AI market, with capabilities rivaling top competitors [21][22]

Section 3: Investment Recommendations
- The report suggests focusing on companies that may benefit from Alibaba's increased investment in AI infrastructure, including NetEase Technology, Cambricon, Inspur Information, and Unisplendour [28]
- The rapid integration of the Qianwen app into Alibaba's ecosystem is expected to drive user growth and increase demand for AI compute [16][28]
Financial Watch: One Year of DeepSeek, Revisiting the China-US AI Paths
Huan Qiu Shi Bao· 2026-01-14 22:51
Core Insights
- DeepSeek, a Chinese AI startup, is set to launch its next-generation AI model V4 in mid-February, which is expected to outperform competitors such as Anthropic's Claude and OpenAI's GPT series [1]
- The rapid development of AI in China has narrowed the gap with the US; experts note that the progress made in just one year is significant [1][2]

Group 1: Company Developments
- DeepSeek's R1 model, launched last year, completed training in just two months at a fraction of the cost incurred by US companies while achieving performance comparable to ChatGPT and Meta's Llama [2]
- Chinese open-source AI models account for nearly 30% of global AI technology usage, with companies such as Airbnb and Meta using models developed by Alibaba [3]
- Alibaba has released nearly 400 open-source models, with over 18 million derivatives and 700 million downloads, underscoring its significant role in the global AI landscape [3]

Group 2: Competitive Landscape
- The US AI strategy focuses on high-performance closed-source models and platform products, while China emphasizes open-source models and rapid industrial application [4]
- While the US leads in cutting-edge model capabilities, China excels in engineering efficiency and speed of deployment, with no significant time lag in these areas [5]

Group 3: Future Trends
- The next significant advances in AI are expected in humanoid robots integrated with large models, industrial applications, and breakthroughs in low-cost inference and edge computing [10]
- The AI toy industry is projected to pass 1 million units sold, generating substantial interaction data that will enhance model capabilities and establish AI toys as everyday items [11]
The End of Text-Based AI? Agent Collaboration Can "Copy Thoughts" Directly, with Token Efficiency Soaring
机器之心· 2025-12-05 04:08
Core Insights
- The article discusses the rise of multi-agent systems (MAS) in the Agentic AI era, emphasizing the shift from individual models to collaborative problem-solving among AI agents [2][5]
- A new framework, LatentMAS, lets agents collaborate in latent space rather than through traditional text communication, improving both efficiency and performance [5][14]

Group 1: LatentMAS Framework
- LatentMAS enables agents to exchange internal hidden-layer representations and KV-cache working memory, yielding higher performance with reduced token usage [5][10]
- The framework supports richer latent reasoning and lossless communication between agents, with significantly lower computational complexity than text-based MAS [15][16]

Group 2: Experimental Results
- Comprehensive experiments on nine benchmark tasks show LatentMAS outperforming both single models and text-based MAS, with accuracy gains of up to 14.6% and token-usage reductions of 70.8% to 83.7% [6][20][22]
- LatentMAS achieves end-to-end reasoning speedups of 4× to 4.3× over traditional methods, demonstrating its efficiency [21][25]

Group 3: Efficiency and Performance
- The framework supports complex reasoning while significantly reducing token counts, achieving higher accuracy with fewer output tokens [28][29]
- LatentMAS delivers additional speedups of 2.6× to 7× over text-based MAS, even when the latter is served with optimized vLLM [25][28]

Group 4: Semantic Richness
- The latent representations produced by LatentMAS are semantically rich and diverse, surpassing the expressiveness of the discrete tokens used in text-based systems [30][31]
- The study indicates that the latent reasoning captured by LatentMAS is not only effective but also carries more nuanced internal representations than traditional methods [31][32]
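The data-flow difference between text-based and latent-space collaboration can be illustrated with a toy handoff: the second agent consumes the first agent's hidden states directly instead of re-encoding decoded text. This is a schematic sketch in which a random tanh map stands in for a transformer; the real framework reuses the model's own hidden layers and KV-cache, and all names and dimensions here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 16  # toy hidden-state width

def toy_agent(prefix_states, prompt_states, steps=4):
    """Toy stand-in for a transformer agent: appends a few 'latent
    thoughts' to whatever hidden states it is handed.  In LatentMAS the
    handed-over states would be real hidden-layer / KV-cache entries."""
    w = rng.standard_normal((HIDDEN, HIDDEN)) / np.sqrt(HIDDEN)
    state = np.concatenate([prefix_states, prompt_states], axis=0)
    for _ in range(steps):
        nxt = np.tanh(state[-1] @ w)   # next latent thought
        state = np.vstack([state, nxt])
    return state

prompt = rng.standard_normal((3, HIDDEN))

# Text-based MAS: agent A decodes to tokens and agent B re-encodes, so B
# starts from a lossy text summary.  Latent handoff: B receives A's hidden
# states unchanged, with no decode/re-encode round trip.
a_out = toy_agent(np.empty((0, HIDDEN)), prompt)
b_out = toy_agent(a_out, prompt)  # A's latents passed over losslessly

# B's context begins with A's states verbatim -- nothing was lost in transit.
assert np.allclose(b_out[: a_out.shape[0]], a_out)
```

The point of the sketch is only the interface: because no tokens are emitted between the agents, the per-handoff cost is a memory copy rather than a full decode-and-re-encode pass, which is where the reported token and latency savings come from.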
Just In: Thinking Machines Lab Blog Proposes On-Policy Distillation, with Qwen Mentioned 38 Times
36Ke· 2025-10-28 02:00
Core Insights
- Thinking Machines Lab (TML) has introduced a new training method called on-policy distillation, which combines the on-policy relevance of reinforcement learning (RL) with the dense reward signal of supervised fine-tuning (SFT), achieving superior performance at a lower cost [1][17]

Group 1: Methodology and Applications
- On-policy distillation is effective for small models, enhancing their domain performance and continuous-learning capabilities [1][17]
- The method is inspired by the Qwen team's research and makes heavy use of Qwen3-series models in its experiments [3][34]
- The training process consists of three stages: pre-training, mid-training, and post-training, focusing respectively on general capabilities, domain knowledge, and target behavior [6][7]

Group 2: Advantages of On-Policy Distillation
- Small models trained with on-policy distillation often outperform larger general models in specialized fields, with benefits such as local deployment, easier continuous training, and reduced inference cost [7][17]
- The method provides dense reward signals, enabling more efficient learning than traditional RL, which offers only sparse feedback [9][18]

Group 3: Performance and Cost Efficiency
- TML's experiments show on-policy distillation matching RL performance at a fraction of the cost, with reported costs of only one-tenth of traditional RL methods [34][41]
- The method has demonstrated significant computational efficiency, requiring 7-10 times fewer gradient steps to reach performance comparable to RL [58]

Group 4: Continuous Learning and Personalization
- On-policy distillation is positioned as a promising tool for continuous learning, allowing models to update without degrading previously learned behaviors [66][70]
- The approach can effectively personalize models, adapting them to specific tasks while retaining core capabilities [42][53]
Just In: Thinking Machines Lab Blog Proposes On-Policy Distillation, with Qwen Mentioned 38 Times
机器之心· 2025-10-28 00:41
Core Viewpoint
- Thinking Machines Lab (TML) has introduced a new training method called on-policy distillation, which combines the on-policy relevance of reinforcement learning (RL) with the dense reward signal of supervised fine-tuning (SFT), achieving superior performance at a lower cost than other methods [1][2][27]

Group 1: Methodology and Advantages
- On-policy distillation allows small models to exhibit strong domain performance and continuous-learning capabilities [1][2]
- The training process is divided into three stages: pre-training for general capabilities, mid-training for domain knowledge, and post-training for guiding target behaviors [6][7]
- On-policy training samples trajectories from the student model itself, providing direct feedback on its own errors, while off-policy training relies on external sources [8][9][12]

Group 2: Comparison with Other Methods
- On-policy distillation combines the reliability of on-policy training with the dense reward signals of SFT, making it a cost-effective alternative to traditional RL methods [28][92]
- In experiments, on-policy distillation achieved a score of 74.4% on the AIME'24 benchmark at significantly lower computational cost than RL, which required 17,920 GPU hours for a score of 67.6% [47][46]

Group 3: Applications and Future Directions
- The method has been successfully applied to train models for mathematical reasoning and to develop assistant models with domain knowledge and instruction-following capabilities [26][27]
- TML aims to continue exploring new applications of on-policy distillation, improving teacher supervision methods, and enhancing data efficiency and continuous learning [92][93]
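The dense per-token signal at the heart of the method can be sketched as a reverse KL divergence between the student's and teacher's next-token distributions at every position the student sampled, in contrast with RL's single end-of-trajectory reward. This is a minimal numerical sketch with toy logits; the function names and shapes are illustrative, not TML's implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def per_token_reverse_kl(student_logits, teacher_logits):
    """Dense per-token training signal for on-policy distillation:
    reverse KL(student || teacher) at every sampled position.
    Both inputs have shape (T, vocab); returns shape (T,)."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return (p_s * (np.log(p_s) - np.log(p_t))).sum(axis=-1)

# Toy check: where the student already matches the teacher the penalty is
# ~0; a disagreement at one step yields a positive signal at that step
# only -- the "dense reward" that sparse RL feedback lacks.
T, V = 5, 8
rng = np.random.default_rng(1)
teacher = rng.standard_normal((T, V))
student = teacher.copy()
student[2] += rng.standard_normal(V) * 3.0  # student drifts at step 2

kl = per_token_reverse_kl(student, teacher)
assert np.allclose(kl[[0, 1, 3, 4]], 0.0) and kl[2] > 0
```

Because the penalty localizes exactly where the student diverges from the teacher, each sampled trajectory carries many learning signals instead of one, which is consistent with the report of 7-10 times fewer gradient steps than RL.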
Nvidia's $100 Billion Bet on OpenAI; Hundun's HDDI Business Agent Debuts at the Yunqi Conference; Sequoia Reveals Why 95% of Enterprise AI Applications Fail | Hundun AI Weekly Focus
混沌学园· 2025-09-28 11:58
Core Insights
- The article discusses HDDI, an AI-driven consulting tool by Hundun aimed at transforming business-strategy decision-making and making professional consulting services accessible to small and medium enterprises [2][3]

Group 1: HDDI Features and Functionality
- HDDI integrates Hundun's unique innovation-theory framework and a decade's worth of case studies, functioning like a real consulting advisor [3]
- It shifts the business-service model from one-time projects to a subscription-based partnership that provides continuous strategic support [3]
- The tool helps decision-makers identify core issues through guided conversations and generates comprehensive analysis reports within minutes, including feasibility assessments and implementation paths [6]

Group 2: AI Trends and Market Dynamics
- Sequoia Capital's research indicates a "productivity paradox": only 5% of companies derive significant value from generative AI, while 95% see minimal benefit because static tools fail to integrate deeply into business processes [8]
- The AI landscape is shifting as AI replaces entry-level jobs, making the tacit knowledge of experienced employees a competitive advantage [8]
- The article highlights the need for entrepreneurs to develop AI agents that can learn and integrate into backend processes, moving toward a business-outcome-based pricing model [8]

Group 3: Major Industry Developments
- Nvidia's strategic partnership with OpenAI involves an investment of up to $100 billion to build AI data centers, marking a significant advance in AI infrastructure [17][23]
- The launch of MediaTek's Dimensity 9500 chip represents a breakthrough in edge AI, with a 111% performance increase and a 56% reduction in power consumption [19][24]
- The article emphasizes a competitive landscape in which large companies are integrating AI into their core products, creating new opportunities for startups to provide specialized AI solutions [20]
Digital Economy Biweekly Report (2025-07, Issue 2) - 20250801
Yin He Zheng Quan· 2025-08-01 10:37
Group 1: US AI Action Plan
- The US AI Action Plan aims to establish global leadership in AI, focusing on "innovation-driven" and "deregulation" strategies to enhance market vitality and reduce development barriers[5]
- Key policies include accelerating AI innovation, building AI infrastructure, and leading the global AI order, with over 90 specific administrative orders outlined[6]
- The plan emphasizes ensuring American workers benefit from AI advancements, creating high-paying jobs through infrastructure development[5]

Group 2: Risks and Challenges for China
- The US views China as its strongest competitor in AI, bringing risks such as deepening technology blockades and increased supply-chain vulnerabilities, particularly in AI chips, where Nvidia holds a 66% share of the China market[9]
- China's AI development may face fragmented industry standards and open-source barriers as the US promotes a "full-stack AI package" to expand its technological influence globally[13]
- The US focus on AI infrastructure and energy competition may widen the technological gap between the US and China, affecting China's AI capabilities[16]

Group 3: Global AI Governance and Cooperation
- China has proposed the "Global AI Governance Action Plan," advocating an inclusive and sustainable global AI governance system and emphasizing cooperation among developing countries[19]
- The plan includes 13 key tasks, such as technology innovation and data governance, aiming to establish a unified international rule-making framework[20]
- Local policies in China are accelerating the development of regional data-industry systems, with Jiangxi province targeting 20% annual growth in data markets by 2027[21]

Group 4: AI Infrastructure Investments
- Major US companies, including Google and Meta, are investing significantly in AI infrastructure; Google plans to invest $25 billion in data centers and AI facilities across 13 states[33]
- Trump's administration announced a $90 billion investment plan for AI and energy infrastructure, focusing on new data centers and power-generation facilities[31]
- The National Science Foundation (NSF) is collaborating with Voltage Park to provide 1 million hours of high-end GPU cloud computing resources for AI research[35]
Digital Economy Biweekly Report (2025-07, Issue 2) - 20250731
Yin He Zheng Quan· 2025-07-31 10:00
Group 1: US AI Action Plan
- The US AI Action Plan aims to establish global leadership in AI, focusing on "innovation-driven" and "deregulation" strategies to enhance market vitality and reduce development barriers[5]
- Key policies include accelerating AI innovation, building AI infrastructure, and leading the global AI order, with over 90 specific administrative orders outlined[6]
- The plan emphasizes ensuring American workers benefit from AI advancements, creating high-paying jobs through infrastructure development[5]

Group 2: Risks and Challenges for China
- China faces risks of deepening technology blockades; Nvidia's 66% share of China's AI chip market indicates reliance on US technology[9]
- The US aims to export a "full-stack AI package" to allies, potentially sidelining Chinese technologies and creating a fragmented global AI ecosystem[13]
- Gaps in AI infrastructure capability may widen as the US accelerates data-center and energy infrastructure development to meet AI demand[16]

Group 3: Global AI Governance and Cooperation
- China released the "Global AI Governance Action Plan," advocating an inclusive and sustainable global AI governance framework and emphasizing cooperation among developing countries[19]
- The plan includes 13 key tasks, such as technology innovation and data governance, aiming to unify international rules and enhance participation from the Global South[20]
- Local policies in China are rapidly emerging to build regional data-industry systems, with Jiangxi aiming for 20% annual growth in data industries by 2027[21]

Group 4: AI Infrastructure Investments
- Major US companies, including Google and Meta, are investing significantly in AI infrastructure; Google plans to invest $25 billion in data centers and AI facilities[33]
- Trump's administration announced a $90 billion investment plan for AI and energy infrastructure, focusing on new data centers and power-generation facilities[31]
- The establishment of the National AI Research Resource (NAIRR) aims to provide open AI research resources, enhancing collaboration in scientific fields[35]
The Entire Hugging Face Leaderboard Is Now Dominated by Chinese AI Models
数字生命卡兹克· 2025-07-31 01:06
Core Viewpoint
- The article highlights a significant shift in the AI landscape: Chinese models are being rapidly open-sourced while overseas models are rising in price and becoming less accessible [3][4][54]

Group 1: Open-source Models
- Numerous Chinese companies have been actively open-sourcing their AI models, including MiniMax, Kimi, Qwen, and others [1]
- The top ten models on Hugging Face are all Chinese open-source models, with Zhipu's GLM-4.5 at the top and Qwen holding five positions [8][9]
- The article emphasizes the rapid development and release of many models over a short period, showcasing the strength of domestic open-source efforts [11][12]

Group 2: Recent Model Releases
- Tencent released the Hunyuan A13B model on June 27, featuring 80 billion total parameters and 13 billion active parameters [17][18]
- Baidu's ERNIE 4.5 was officially open-sourced on June 30, offering both pure-LLM and multimodal variants [20]
- Alibaba's Tongyi launched ThinkSound, the first chain-of-thought (CoT) audio model, on July 1, aimed at video dubbing [21]
- Zhipu introduced the GLM-4.1V-Thinking model on July 2, which received positive evaluations of its performance [23]
- Kunlun Wanwei released the Skywork-Reward-V2 series on July 4, comprising eight reward models with 600 million to 8 billion parameters [25][26]
- The MOSS-TTSD model was open-sourced by Qiu Xipeng's team on July 5, trained on a million hours of audio [27]
- Ant Group's KAG-Thinker model, focused on interactive reasoning, was released on July 8 [32]
- The Intern-S1 multimodal model was launched by the Shanghai AI Lab on July 26 [41]
- Qwen's releases throughout July, including Qwen3-235B and Qwen3-Coder, achieved high rankings on the Hugging Face leaderboard [37][38][39]

Group 3: Industry Impact
- The article reflects on the transformation of the AI landscape over the past two years, noting that China has moved from follower to leader in open-source AI models [11][56]
- The ongoing open-sourcing trend in China contrasts sharply with the increasing restrictions and prices of models from overseas companies [54][55]
- The author concludes that this period marks the beginning of a new era for domestic AI models and the Chinese open-source community [56]