Agents Track Down Their Own Failures! "Automated Failure Attribution" Proposed as a Research Topic for the First Time

Core Insights
- The article discusses the emerging field of "automated failure attribution" in LLM-driven Multi-Agent systems, highlighting the challenges of diagnosing failures in complex systems [2][5][18]
- A new dataset called Who&When has been created to facilitate research in this area, containing failure logs from 127 Multi-Agent systems [8][9]
- The research introduces three distinct automated attribution methods, each with its strengths and weaknesses, contributing to the initial "solution library" for failure attribution tasks [9][12]

Group 1: Introduction to Automated Failure Attribution
- LLM Multi-Agent systems have shown great potential but are vulnerable to failures due to individual agent errors and miscommunication [5][8]
- The traditional debugging process is inefficient, often requiring manual examination of lengthy interaction logs [7][11]
- There is a pressing need for an automated, systematic approach to identify failure causes and connect evaluation results with system improvements [7][18]

Group 2: Contributions of the Research
- The research formalizes the problem of "automated failure attribution," defining it as identifying the failure-responsible agent and the decisive error step [8][9]
- The Who&When dataset includes diverse failure logs, ensuring authenticity and variety in scenarios [8][9]
- Initial exploration of automated attribution methods has been conducted, with three methods designed and evaluated: All-at-Once, Step-by-Step, and Binary Search [9][10]

Group 3: Method Evaluation and Findings
- Experimental results indicate that current methods are far from perfect, with the best method achieving only about 53.5% accuracy in identifying responsible agents and 14.2% in pinpointing error steps [11][12]
- Different methods excel in different sub-tasks, with All-at-Once being better for identifying agents, Step-by-Step for locating error steps, and Binary Search providing a balanced approach [12][13]
- A hybrid method combining different strategies shows improved performance but at a higher computational cost [14][15]

Group 4: Implications and Future Directions
- The task of automated failure attribution is crucial for enhancing the reliability of Multi-Agent systems, transforming failure analysis from a complex puzzle into a quantifiable problem [18]
- The research opens new avenues for improving the understanding of failure patterns in Multi-Agent systems, ultimately leading to more reliable and intelligent collaborative systems [18]
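
To make the task definition from Group 2 concrete (find the responsible agent and the decisive error step), the Step-by-Step and Binary Search strategies can be sketched in Python. This is only one plausible reading of the method names, not the authors' implementation: the `Step` record and the two judge callables are hypothetical stand-ins for LLM calls, and the summary does not describe the actual prompts or stopping rules.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One entry of a failure log (hypothetical structure)."""
    agent: str
    content: str


# Hypothetical judge interfaces; in practice each would wrap an LLM call.
# StepJudge: given the log prefix ending at step i, is step i the decisive error?
StepJudge = Callable[[List[Step], int], bool]
# RangeJudge: does the decisive error lie in log[lo:mid] rather than log[mid:hi]?
RangeJudge = Callable[[List[Step], int, int, int], bool]

Attribution = Tuple[str, int]  # (responsible agent, decisive error step index)


def step_by_step(log: List[Step], is_error_step: StepJudge) -> Optional[Attribution]:
    """Walk the log in order and stop at the first step the judge flags."""
    for i in range(len(log)):
        if is_error_step(log[: i + 1], i):
            return log[i].agent, i
    return None  # the judge flagged no step


def binary_search(log: List[Step], error_in_first_half: RangeJudge) -> Attribution:
    """Halve the candidate range until a single step remains."""
    if not log:
        raise ValueError("cannot attribute a failure in an empty log")
    lo, hi = 0, len(log)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if error_in_first_half(log, lo, mid, hi):
            hi = mid
        else:
            lo = mid
    return log[lo].agent, lo
```

Under this reading, Step-by-Step issues up to one judge call per log step, while Binary Search needs only about log2(n) calls, each over a larger span of the log. That trade-off is consistent with the summary's description of Binary Search as a balanced approach. An All-at-Once variant would be a single judge call over the whole log that returns both the agent and the step.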