Automated Failure Attribution

Agents track down their own problems! The "automated failure attribution" task is proposed for the first time | ICML 2025 Spotlight
量子位 (QbitAI) · 2025-06-11 02:27
Core Insights
- The article discusses the emerging field of "automated failure attribution" in LLM-driven Multi-Agent systems, highlighting the challenges of diagnosing failures in complex systems [2][5][18]
- A new dataset called Who&When has been created to facilitate research in this area, containing failure logs from 127 Multi-Agent systems [8][9]
- The research introduces three distinct automated attribution methods, each with its strengths and weaknesses, contributing to the initial "solution library" for failure attribution tasks [9][12]

Group 1: Introduction to Automated Failure Attribution
- LLM Multi-Agent systems have shown great potential but are vulnerable to failures due to individual agent errors and miscommunication [5][8]
- The traditional debugging process is inefficient, often requiring manual examination of lengthy interaction logs [7][11]
- There is a pressing need for an automated, systematic approach to identify failure causes and connect evaluation results with system improvements [7][18]

Group 2: Contributions of the Research
- The research formalizes the problem of "automated failure attribution," defining it as identifying the failure-responsible agent and the decisive error step [8][9]
- The Who&When dataset includes diverse failure logs, ensuring authenticity and variety in scenarios [8][9]
- Initial exploration of automated attribution methods has been conducted, with three methods designed and evaluated: All-at-Once, Step-by-Step, and Binary Search [9][10]

Group 3: Method Evaluation and Findings
- Experimental results indicate that current methods are far from perfect, with the best method achieving only about 53.5% accuracy in identifying responsible agents and 14.2% in pinpointing error steps [11][12]
- Different methods excel in different sub-tasks, with All-at-Once being better for identifying agents, Step-by-Step for locating error steps, and Binary Search providing a balanced approach [12][13]
- A hybrid method combining different strategies shows improved performance but at a higher computational cost [14][15]

Group 4: Implications and Future Directions
- The task of automated failure attribution is crucial for enhancing the reliability of Multi-Agent systems, transforming failure analysis from a complex puzzle into a quantifiable problem [18]
- The research opens new avenues for improving the understanding of failure patterns in Multi-Agent systems, ultimately leading to more reliable and intelligent collaborative systems [18]
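
To make the Binary Search idea from the summary above concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Step` record, the `Judge` signature, and the `make_oracle_judge` stand-in are all hypothetical names introduced for illustration. The sketch assumes a judge that, given a segment of the failure log, says whether the decisive error lies in the first or the second half; in practice that judge would be an LLM call, which can be wrong.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    agent: str    # name of the agent that produced this step
    content: str  # what the agent said or did at this step


# Hypothetical judge: for the segment [lo, hi) with midpoint mid, return True
# if the decisive error lies in [lo, mid) and False if it lies in [mid, hi).
Judge = Callable[[List[Step], int, int, int], bool]


def binary_search_attribution(log: List[Step], judge: Judge) -> Tuple[str, int]:
    """Narrow the failure log down to one (agent, step index) by halving."""
    if not log:
        raise ValueError("failure log must contain at least one step")
    lo, hi = 0, len(log)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if judge(log, lo, mid, hi):
            hi = mid  # keep the first half
        else:
            lo = mid  # keep the second half
    return log[lo].agent, lo


def make_oracle_judge(true_step: int) -> Judge:
    """Stand-in for an LLM judge: it already knows the true error step."""
    def judge(log: List[Step], lo: int, mid: int, hi: int) -> bool:
        return true_step < mid
    return judge


if __name__ == "__main__":
    # Toy log: seven steps produced by three agents in rotation.
    log = [Step(agent=f"agent_{i % 3}", content=f"step {i}") for i in range(7)]
    print(binary_search_attribution(log, make_oracle_judge(4)))  # ('agent_1', 4)
```

With a log of n steps the loop makes at most about log2(n) judge calls. A single wrong answer from the judge sends the search into the wrong half with no way to recover, which is one plausible reason a halving strategy trades some accuracy for lower cost.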
ICML 2025 Spotlight | Who caused the multi-agent system to fail? The first study of "automated failure attribution" is out
机器之心 (Synced) · 2025-05-30 03:28
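The question is: which Agent made the mistake, and at which point in the conversation? Debugging such a multi-agent system is like looking for a needle in a haystack: it means combing through large volumes of complex logs, which is extremely time-consuming.

This is not fiction. In multi-agent LLM systems, failures are common but hard to diagnose. As these systems become more widespread, new methods for locating errors quickly are urgently needed. For this reason, an ICML 2025 Spotlight paper proposes a new research direction, "Automated Failure Attribution", whose goal is to have AI automatically answer: who caused the failure, and at which step.

The work was carried out jointly by researchers from Penn State, Duke, UW, Google DeepMind, and other institutions.

Paper title: Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Background and challenges

LLM-driven multi-agent systems have shown great potential in many domains, from automated assistants collaborating on office work to multiple Agents cooperating on complex Web operations. However, the fragility of these systems is also becoming apparent: misunderstandings between Agents, errors in passing information, or poor decisions can all lead to ...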