Hybrid Reasoning Models

DeepSeek-V3.1 Released; the Official Highlights: Agent, Agent, Agent!
Founder Park· 2025-08-21 08:16
Editor's note: Two days after DeepSeek V3.1 went live, the official team finally published a detailed introduction. Key points in brief: a hybrid reasoning model; context extended to 128K; stronger Agent capabilities, the focus of this release; unified pricing for the thinking and non-thinking models, with apparent room for further cuts. (Source: the 赛博禅心 account, author 金色传说大聪明.)

DeepSeek V3.1 officially released, August 21, 2025, Beijing: a hybrid reasoning architecture that merges thinking and non-thinking modes into one model; higher thinking efficiency, meaning fewer tokens and faster responses; stronger Agent capability, with improved tool use and agentic tasks.

Model updates, core architecture and usage. Hybrid reasoning architecture: a single model supports both thinking and non-thinking modes (a minimal call sketch follows this entry).

- deepseek-chat corresponds to non-thinking mode.
- deepseek-reasoner corresponds to thinking mode.

| Benchmark | DeepSeek-V3.1 | R1-0528 |
| --- | --- | --- |
| Browsecomp | 30.0 | 8.9 |
| Browsecomp_zh | 49.2 | 35.7 |
| HLE | 29.8 | 24.8 |
| xbench-DeepSearch | 71.2 | 55.0 |
| Fra ... | | |
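The two endpoint names above are the whole switch between modes. As a minimal sketch only, assuming an OpenAI-compatible API surface and base URL (details not quoted from this article), selecting non-thinking versus thinking mode might look like this:

```python
# Minimal sketch, assuming DeepSeek exposes an OpenAI-compatible chat API.
# The base_url and the environment-variable name are illustrative assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],   # hypothetical env var name
    base_url="https://api.deepseek.com",      # assumed OpenAI-compatible endpoint
)

question = [{"role": "user", "content": "Summarize the V3.1 release in one sentence."}]

# Non-thinking mode: the article maps deepseek-chat to it.
fast = client.chat.completions.create(model="deepseek-chat", messages=question)

# Thinking mode: the article maps deepseek-reasoner to it.
deep = client.chat.completions.create(model="deepseek-reasoner", messages=question)

print(fast.choices[0].message.content)
print(deep.choices[0].message.content)
```

The only change between the two calls is the model name; the article's point is that both names resolve to the same underlying V3.1 model.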
MiniMax Raising Another 2.2 Billion Yuan? New Agent Can Build a Concert Seat-Selection System
Nan Fang Du Shi Bao· 2025-07-17 04:58
Group 1: Company Developments
- MiniMax is reportedly nearing completion of a new financing round of nearly $300 million, which will elevate its valuation to over $4 billion [1]
- MiniMax has launched the MiniMax Agent, a full-stack development tool that allows users to create complex web applications from natural-language input without programming skills [1]
- The MiniMax Agent can deliver functionality such as API integration, real-time data handling, payment processing, and user authentication [1]

Group 2: Industry Trends
- Agent technology has emerged as a significant trend in the tech industry following the success of products like Manus and Devin, with a focus on coding capability and information retrieval [3]
- Major companies like OpenAI and Google are competing to develop advanced agents with strong programming capabilities [3]
- The industry is shifting towards hybrid reasoning models, exemplified by Anthropic's release of Claude 3.7 Sonnet, which combines fast and slow thinking processes [3]

Group 3: Technological Innovations
- MiniMax introduced MiniMax-M1, the first open-source large-scale hybrid-architecture reasoning model, which is efficient at processing long-context inputs and deep reasoning [4]
- Hybrid architectures are expected to become mainstream in model design as demands for deployment efficiency and low latency grow [4]
- Future research on hybrid attention architectures is encouraged to explore diverse configurations beyond simple stacking of attention layers [4]
Hangzhou Zhicheng Electronics Technology Co., Ltd.: A Hybrid Reasoning Model Leads a New Paradigm in Power Metering Diagnostics
Jin Tou Wang· 2025-05-29 00:49
Core Insights
- The article highlights the significant role of precision diagnostics and intelligent operation and maintenance of power metering equipment in the context of China's "dual carbon" strategy and energy digital transformation [1][5]
- The company, Hangzhou Zhicheng Electronics Technology Co., Ltd., has developed a hybrid reasoning model-based fault diagnosis platform for power metering equipment, achieving rapid growth in a niche market [1][2]

Technological Breakthroughs
- The company has innovatively integrated mechanism models with artificial intelligence to create a collaborative algorithm framework, addressing the inefficiencies of traditional diagnostic methods [2]
- The platform offers three core functionalities: comprehensive analysis, precise fault localization, and tiered recommendations, significantly improving operational efficiency [2]
- The application of this platform has led to a 35% reduction in equipment failure rates and a 28% decrease in line loss management costs for power grid companies, saving over 100 million yuan annually [2]

Market Expansion
- As of 2024, the company's diagnostic platform has covered 13 provinces, serving over 200 million users, which accounts for 34.33% of the national smart meter user base [3]
- The company has established a strong presence in key markets like Zhejiang, where it serves millions of users, and is rapidly increasing its market penetration in energy-rich regions such as Sichuan and Gansu [3]

Industry Empowerment
- The company is evolving from a single product supplier to a full lifecycle solution provider, integrating its platform with major systems like the State Grid's "Online Grid" and Southern Grid's "Metering Automation System 3.0" [4]
- The platform has facilitated over 20 innovative applications, including digital twin maps for low-voltage distribution networks, which have been successfully implemented and promoted across the network [4]

Future Outlook
- The company is accelerating its development in cutting-edge areas such as edge computing and digital twins, supported by resources and technology from China National Nuclear Corporation [5]
- A new lightweight diagnostic terminal is set to be launched in 2024, enhancing localized AI reasoning capabilities, while a collaboration with Tsinghua University aims to improve fault diagnosis automation [5]
- The company's rising market share reflects its technological strength and commitment to supporting China's "dual carbon" goals and the intelligent upgrade of the power grid [5]
Alibaba's Qwen3 Tops the Open-Source Rankings; Is an Explosion of Chinese AI Applications Coming?
Sou Hu Cai Jing· 2025-05-01 18:34
Core Insights
- Alibaba has officially launched the Qwen3 model, marking a significant breakthrough in artificial intelligence that has generated considerable excitement in the global tech community [3]
- Qwen3 is noted for its exceptional efficiency and significantly reduced costs, being one-third the size of comparable models while outperforming top global models [3][20]
- The model integrates "fast thinking" and "slow thinking" capabilities, allowing it to respond quickly to simple queries while engaging in deeper reasoning for complex problems, thus optimizing computational resource usage [3][21]

Model Features
- Qwen3 features a hybrid reasoning capability that allows it to switch between thinking and non-thinking modes to meet various scenario demands [20]; a minimal local-usage sketch follows this entry
- The model shows significant improvements in reasoning across mathematics, code generation, and common-sense logic, enhancing user interaction experiences [20]
- Qwen3 supports 119 languages and dialects, greatly expanding its application range and accessibility for global developers and enterprises [20][38]

Performance Metrics
- In the AIME25 assessment, Qwen3 achieved a score of 81.5, setting a new record for open-source models [20]
- The model surpassed 70 points in the LiveCodeBench evaluation, outperforming Grok3, and achieved a score of 95.6 in the ArenaHard assessment, exceeding OpenAI-o1 and DeepSeek-R1 [20][27]
- Qwen3's consistently high scores across these assessments underline its competitive edge in the AI landscape [27]

Deployment and Adaptation
- Following the open-source release of Qwen3, major chip manufacturers like NVIDIA, MediaTek, and AMD have successfully adapted the model for their systems [28][32]
- Huawei announced support for the full series of Qwen3 models, enabling developers to use the model seamlessly in their applications [28][31]
- Deployment cost has been significantly lowered, with only four H20 GPUs required to deploy the flagship version of Qwen3, making it more accessible for businesses [24]

Model Variants
- Qwen3 includes eight open-source models, featuring two MoE models (30B and 235B) and six dense models with varying parameter sizes, optimized for different application scenarios [24][25]
- The 30B MoE model offers over ten times the performance leverage, while the dense models achieve high performance with reduced parameter counts [24][25]
- Each variant is tailored for specific use cases, from mobile applications to enterprise-level deployments, enhancing the versatility of Qwen3 [25]

Open Source and Community Impact
- Qwen3 is released under the Apache 2.0 license, allowing global developers and research institutions to freely download and commercialize the models [33]
- The open-source release is expected to accelerate the adoption of advanced AI technologies across sectors, particularly in mobile, smart devices, and robotics [25][33]
- The extensive language support and the ability to cater to diverse regional needs position Qwen3 as a leading choice for AI applications worldwide [36][38]
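For local use, the thinking/non-thinking switch described above is typically a chat-template flag. The following is a minimal sketch that assumes the Hugging Face checkpoint name "Qwen/Qwen3-30B-A3B" and the `enable_thinking` template argument from the Qwen3 release notes; treat both as assumptions rather than details stated in this article.

```python
# Minimal sketch of toggling Qwen3's hybrid reasoning in a local deployment.
# Assumptions: the checkpoint name and the `enable_thinking` chat-template flag.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-30B-A3B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True renders the template so the model may emit a reasoning
# block before its answer; False requests a direct, low-latency reply.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```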
The World's Strongest Open-Source AI Model Is Born: Developed in China, at Only 30% of DeepSeek's Cost
Xin Lang Cai Jing· 2025-04-30 11:28
Core Insights
- The release of OpenAI's ChatGPT set off a global competition in large AI models, and the launch of DeepSeek triggered a surge in open-source models [1][3]
- Two main approaches dominate the AI model landscape: one pursues high-performance models through massive GPU resources, exemplified by OpenAI, while the other, like DeepSeek, aims for efficiency with limited resources [3][5]
- A new Chinese model, Alibaba's Qwen3, has emerged as a significant player, with lower costs and better performance than OpenAI's models and DeepSeek's offerings, making it the top open-source model globally [5][6]

Performance and Cost Efficiency
- Qwen3 is the world's first "hybrid reasoning model," integrating "fast thinking" and "slow thinking" modes to handle tasks of varying complexity [5]
- Qwen3 needs only one-third of the parameter scale of DeepSeek-R1, cutting cost by two-thirds while outperforming it [6][7]
- Qwen3 can be deployed on just four H20 GPUs, occupies only one-third of the memory of comparable models, and its deployment cost is only 25% to 35% of the full version of DeepSeek-R1 [7]; a serving sketch under stated assumptions follows this entry

Market Implications
- The introduction of Qwen3 is expected to accelerate the domestic GPU replacement trend in China: it demonstrates that powerful models can be deployed without top-tier NVIDIA GPUs, challenging existing market dynamics [9]
- The success of Qwen3 may further expand opportunities for domestic GPU manufacturers, as demand for high-performance AI capability can be met with local alternatives [9]
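The four-GPU deployment figure above implies tensor-parallel serving. As a rough sketch only: the vLLM engine, the Hugging Face model ID, and the parallelism degree below are my assumptions to mirror the article's claim, not details it provides.

```python
# Rough sketch: serving a large Qwen3 MoE checkpoint with tensor parallelism.
# tensor_parallel_size=4 mirrors the "four H20 GPUs" claim; the model ID and
# engine choice (vLLM) are assumptions, not quoted from the article.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3-235B-A22B",   # assumed Hugging Face model ID
    tensor_parallel_size=4,          # shard the weights across 4 accelerators
)

params = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(["Explain what a hybrid reasoning model is."], params)
print(outputs[0].outputs[0].text)
```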
Huawei Ascend Supports Qwen3 Across Its Full Series
news flash· 2025-04-29 10:31
Core Insights
- The article highlights the launch of Alibaba's Qwen3 model, the first "hybrid reasoning model" in China, integrating "fast thinking" and "slow thinking" into a single framework [1]
- Huawei's Ascend supports deployment of the Qwen3 model across its entire series, allowing developers to use it seamlessly in MindSpeed and MindIE [1]
- The Qwen3 model is designed to provide quick responses for simple queries with low computing power while enabling multi-step deep reasoning for complex questions, significantly reducing computational resource consumption [1]
Qwen3 Drops a Late-Night Bombshell: Alibaba Releases 8 Models at Once, Outperforming DeepSeek R1 and Claiming the Open-Source Crown
36Ke· 2025-04-29 09:53
Core Insights
- The release of Qwen3 marks a significant advancement in open-source AI models, featuring eight hybrid reasoning models that rival proprietary models from OpenAI and Google and surpass the open-source DeepSeek R1 model [4][24]
- Qwen3-235B-A22B is the flagship model with 235 billion parameters, demonstrating superior performance across benchmarks, particularly in software engineering and mathematics [2][4]
- The Qwen3 series introduces a dual reasoning mode, allowing the model to switch between deep reasoning for complex problems and quick responses for simpler queries [8][21]

Model Performance
- Qwen3-235B-A22B achieved a score of 95.6 in the ArenaHard test, outperforming OpenAI's o1 (92.1) and DeepSeek's R1 (93.2) [3]
- Qwen3-30B-A3B, with 30 billion parameters, also performs strongly, scoring 91.0 in ArenaHard, showing that smaller models can still deliver competitive results [6][20]
- The models were trained on approximately 36 trillion tokens, nearly double the data used for the previous Qwen2.5 model, strengthening their capabilities across domains [17][18]

Model Architecture and Features
- Qwen3 employs a mixture-of-experts (MoE) architecture, activating only about 10% of its parameters during inference, which significantly reduces computational cost while maintaining high performance [20][24]
- The series includes six dense models ranging from 0.6 billion to 32 billion parameters, catering to different user needs and computational resources [5][6]
- The models support 119 languages and dialects, broadening their applicability in global contexts [12][25]

User Experience and Accessibility
- Qwen3 is open-sourced under the Apache 2.0 license, making it accessible to developers and researchers [7][24]
- Users can switch between reasoning modes via a dedicated button on the Qwen Chat website or through commands in local deployments [10][14]; a sketch of the command-style switch follows this entry
- The model has received positive feedback for its quick response times and deep reasoning capabilities, with notable comparisons to other models like Llama [25][28]

Future Developments
- The Qwen team plans to focus on training models capable of long-term reasoning and executing real-world tasks, indicating a commitment to advancing AI capabilities [32]
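The "commands in local deployments" mentioned above are soft switches embedded in the user turn. A minimal sketch, assuming the "/think" and "/no_think" markers described in Qwen3's own release material (the exact tokens are not spelled out in this article):

```python
# Minimal sketch of per-turn mode switching via soft-switch suffixes.
# Assumption: the "/think" and "/no_think" markers; the article only says
# switching is possible "through commands in local deployments".
def with_mode(user_text: str, think: bool) -> dict:
    """Build a chat message whose suffix requests deep or quick reasoning."""
    suffix = " /think" if think else " /no_think"
    return {"role": "user", "content": user_text + suffix}

messages = [
    with_mode("Prove that the square root of 2 is irrational.", think=True),
    # ... the model's reply would be appended here before the next turn ...
    with_mode("Now restate the conclusion in one line.", think=False),
]

for m in messages:
    print(m["content"])
```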
Outperforming DeepSeek R1, Qwen3 Officially Arrives! Alibaba Releases 8 Models at Once and Tops the Open-Source Rankings!
AI科技大本营· 2025-04-29 09:05
Compiled by | Tu Min (屠敏); produced by | CSDN (ID: CSDNnews)

In the early hours of today, the most closely watched news in the large-model space came from Alibaba's Qwen team: the official release of the long-awaited new Qwen3 series.

Eight models at once! Among the eight hybrid reasoning models are two MoE models: Qwen3-235B-A22B and Qwen3-30B-A3B. Qwen3-235B-A22B is the largest flagship model in this release, with 235 billion parameters and over 22 billion activated. On multiple benchmarks covering code, mathematics, and general capability, it not only beats DeepSeek's open-source R1 model but also outperforms OpenAI's closed-source o1. In particular, on the 500-question ArenaHard test for software engineering and mathematics, its score even approaches Google's newly released Gemini 2.5-Pro, so its strength should not be underestimated.

Unlike previous releases, this time the team open-sourced as many as eight hybrid reasoning models in one go, closing in on closed-source models from OpenAI and Google in performance and surpassing the open-source DeepSeek R1, making it one of the strongest open-source model families available; no wonder the Qwen team was working late last night.

| | Qwen3- ...
Tongyi Qianwen Qwen3 Released: A Conversation with Alibaba's Zhou Jingren
晚点LatePost· 2025-04-29 08:43
This article is from the 晚点对话 (LatePost Dialogues) account, by 程曼祺. Zhou Jingren, CTO of Alibaba Cloud and head of the Tongyi Lab: "Large models have moved from the beginning of the early stage into the middle of the early stage; it is no longer possible to improve on single-point capabilities alone."

The Qwen3 flagship, the MoE (mixture-of-experts) model Qwen3-235B-A22B, has 235 billion total parameters and 22 billion activated parameters, yet surpasses the full-strength DeepSeek-R1 (671 billion total parameters, 37 billion activated) on multiple major benchmarks. The smaller MoE model Qwen3-30B-A3B activates only 3 billion parameters at inference time, less than one-tenth of QwQ-32B, the previous pure-reasoning dense model in the Qwen series, yet performs better. Fewer parameters with better performance means developers can get better results at lower deployment and usage costs. (Chart from the official Tongyi Qianwen blog. Note: an MoE model activates only a subset of its parameters on each use, which makes it more efficient to run; that is why two figures are reported, total parameters and activated parameters. A quick back-of-the-envelope calculation follows this entry.)

Before the Qwen3 release, we interviewed Zhou Jingren, the person in charge of Alibaba's large-model R&D, CTO of Alibaba Cloud, and head of the Tongyi Lab. He is also the main decision-maker behind Alibaba's open-source models. To date, the Qwen series of models has been downloaded a cumulative 3 ...
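The total-versus-activated distinction is easy to make concrete. The check below uses only the figures quoted above; the simplifying assumption is that per-token compute scales with activated parameters, while total parameters still determine how much memory the weights occupy.

```python
# Back-of-the-envelope activation ratios for the MoE figures quoted above.
# Simplifying assumption: per-token compute ~ activated parameters;
# memory footprint ~ total parameters.
models = {
    "Qwen3-235B-A22B": (235e9, 22e9),
    "Qwen3-30B-A3B":   (30e9,  3e9),
    "DeepSeek-R1":     (671e9, 37e9),
}

for name, (total, active) in models.items():
    print(f"{name}: {active / total:.1%} of parameters active per token")
# Qwen3-235B-A22B: 9.4%  -> consistent with the ~10% activation figure cited earlier
# Qwen3-30B-A3B:  10.0%
# DeepSeek-R1:     5.5%
```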
Alibaba Open-Sources Tongyi Qianwen Qwen3: Topping the Global Open-Source Rankings at Just One-Third the Cost of DeepSeek-R1
IPO早知道· 2025-04-29 03:01
Performance surpassing DeepSeek-R1 and OpenAI-o1. Original content from IPO早知道; author | Stone Jin; WeChat public account | ipozaozhidao.

According to IPO早知道, in the early hours of April 29 Alibaba open-sourced the new-generation Tongyi Qianwen model Qwen3, with only one-third the parameter count of DeepSeek-R1, sharply lower cost, and performance that comprehensively surpasses R1, OpenAI-o1, and other top global models, topping the global open-source rankings. Qwen3 is China's first "hybrid reasoning model": "fast thinking" and "slow thinking" are integrated into a single model, so simple requests can be answered instantly at low compute while complex questions receive multi-step "deep thinking," greatly reducing compute consumption. Qwen3 uses a mixture-of-experts (MoE) architecture with 235B total parameters and only 22B activated. Its pretraining data reaches 36T tokens, and multiple rounds of reinforcement learning in the post-training stage seamlessly integrate the non-thinking mode into the thinking model. Qwen3 is substantially stronger in reasoning, instruction following, tool calling, and multilingual ability, setting new performance highs among all Chinese models and global open-source models: on the olympiad-level AIME25 benchmark, Qwen3 scored 81.5, a new open-source record; on LiveCodeBench, which tests coding ability, it broke the 70-point mark, even outperforming Grok3; and on ArenaHard, which evaluates alignment with human preferences ...