Huawei Releases the Industry's First Diffusion Language Model Agent, Up to 8x Faster in Some Scenarios!
QbitAI (量子位) · 2026-02-10 05:33

Core Insights
- The article argues that the measure of an Agent's strength has shifted from simply answering questions to handling multi-turn reasoning, tool invocation, and complex collaboration efficiently within a minimal interaction budget [2][3].

Group 1: Research Findings
- A recent study by teams from Huawei and several universities shows that switching to a Diffusion Large Language Model (DLLM) improves Agent performance: execution is more than 30% faster overall, and up to 8 times faster on some complex tasks, compared with traditional Autoregressive (AR) models [3][4].
- The study used a strictly controlled experimental design so that performance differences could be attributed solely to the generation paradigm. It found that DLLM Agents require fewer interaction rounds and tool calls while maintaining similar accuracy [4][5].

Group 2: Performance Metrics
- On the BrowseComp-zh benchmark, the DLLM Agent achieved an accuracy of 15.5% with an average of 6.7 tool calls and 13.0 turns used, while the AR Agent achieved an accuracy of 7.5% with 14.8 tool calls and 1.9 turns used [8].
- The DLLM Agent also had a lower invalid action rate, at 6.4%, indicating a more efficient execution process [8].

Group 3: Case Study
- In one case study, the DLLM Agent completed a complex multi-constraint retrieval task in 140.95 seconds, while the AR Agent took 1152.68 seconds, a speed difference of 8.18 times [13][14].

Group 4: Planning and Execution
- The DLLM Agent showed stronger planning: it identified the key constraints quickly and then refined the task structure in a two-phase approach, in contrast to the AR Agent's sequential and often error-prone process [16][19].
- The study found that the DLLM's attention mechanism supports a more coordinated global-to-local decision-making process, leading to faster task completion and fewer detours [28][30].

Group 5: Implications for Agent Design
- The findings suggest that the generation paradigm fundamentally shapes Agent behavior. A DLLM should not simply be dropped in as a replacement for an AR model; interfaces and training objectives need to be re-aligned to fully leverage its potential [24][30].
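
As a quick check on the case-study figure in Group 3, the speed difference can be recomputed from the two reported wall-clock times. The snippet below is only an illustrative sketch: the function name `speedup` and the constant names are assumptions made here, not anything from the study, and the only inputs are the two timings quoted above.

```python
# Wall-clock times quoted in the case study, in seconds.
AR_SECONDS = 1152.68    # AR Agent
DLLM_SECONDS = 140.95   # DLLM Agent


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline.

    Hypothetical helper for illustration; not part of the study's code.
    """
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


ratio = speedup(AR_SECONDS, DLLM_SECONDS)
print(f"DLLM Agent speedup over AR Agent: {ratio:.2f}x")
```

Rounded to two decimals, the ratio matches the 8.18 times reported for this single task. It describes one case, so it should not be read as an average over the benchmark.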
