Multi-Agent Systems Are "Burning" Tokens! Everything Anthropic Has Publicly Revealed

Core Insights
- Anthropic's new research on multi-agent systems highlights the advantages of using multiple AI agents for complex research tasks, emphasizing their ability to adapt and explore dynamically [2][3][6][7].

Multi-Agent System Advantages
- Multi-agent systems excel at research tasks that require flexibility and the ability to adjust methods as discoveries are made, because sub-agents operate independently and explore different aspects of a problem simultaneously [7][8].
- In Anthropic's internal evaluations, the multi-agent system outperformed the single-agent system by 90.2% on breadth-first query tasks [8].
- The performance gain comes at the cost of token consumption: multi-agent systems spend far more tokens than single-agent models in exchange for a significant performance boost [9][10].

System Architecture
- The multi-agent architecture follows a "coordinator-worker" model, in which a lead agent coordinates tasks among several specialized sub-agents [14][18].
- The lead agent analyzes the user query, creates sub-agents, and oversees their independent exploration of different aspects of the query [19][21].

Performance Evaluation
- Traditional evaluation methods are inadequate for multi-agent systems because agents can take non-linear and varied paths to a correct result; flexible evaluation methods are necessary [44][45].
- Anthropic uses an "LLM-as-judge" approach to evaluate outputs, which makes assessing multi-agent systems more scalable and practical [49][53].

Engineering Challenges
- Agent systems are stateful, and maintaining that state poses significant engineering challenges: minor changes can lead to substantial behavioral shifts [56][61].
- Anthropic has implemented robust debugging and tracing mechanisms to diagnose and address failures in real time [57].
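
To make the coordinator-worker pattern described under System Architecture concrete, here is a minimal Python sketch. It is not Anthropic's implementation: `plan_subtasks`, `run_subagent`, and `lead_agent` are hypothetical names, and both the planner and the workers are stubs standing in for LLM calls. It only illustrates the control flow, in which the lead agent splits the query, sub-agents run in parallel, and the lead agent merges what they return.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    subtask: str
    summary: str


def plan_subtasks(query: str) -> list[str]:
    # Hypothetical planner: a real lead agent would ask an LLM to decompose
    # the query. Here the aspects are hard-coded for illustration.
    aspects = ["background", "recent developments", "open problems"]
    return [f"{query}: {aspect}" for aspect in aspects]


def run_subagent(subtask: str) -> Finding:
    # Hypothetical worker: stands in for a sub-agent that would search and
    # reason in its own context before reporting back.
    return Finding(subtask=subtask, summary=f"notes on {subtask}")


def lead_agent(query: str) -> str:
    subtasks = plan_subtasks(query)
    # Each sub-agent explores its own aspect independently and in parallel;
    # pool.map returns results in the same order as the subtasks.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        findings = list(pool.map(run_subagent, subtasks))
    # The lead agent merges the workers' findings into a single answer.
    return "\n".join(f"- {finding.summary}" for finding in findings)
```

Calling `lead_agent("multi-agent systems")` returns three bullet lines, one per stubbed sub-agent. Threads are used here on the assumption that real sub-agents spend most of their time waiting on model and tool calls; this is also where the extra token spend comes from, since every sub-agent runs its own model calls.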

Conclusion
- Despite these challenges, multi-agent systems have shown immense potential for open-ended research tasks, provided they are built with careful engineering, thorough testing, and a deep understanding of current AI capabilities [61].
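
As an illustration of the "LLM-as-judge" evaluation mentioned under Performance Evaluation, the sketch below scores an answer against a small rubric. Everything in it is an assumption made for illustration: the rubric criteria, the 0.0 to 1.0 scale, the 0.7 pass threshold, and the `call_llm` callable, which stands in for whatever model client is actually used. The `fake_llm` function returns a canned reply so the example runs without any API access.

```python
import json
from typing import Callable

# Illustrative rubric; the criteria are assumptions, not taken from the article.
RUBRIC = ("factual_accuracy", "citation_accuracy", "completeness")


def build_judge_prompt(question: str, answer: str) -> str:
    criteria = ", ".join(RUBRIC)
    return (
        f"Score the answer on each criterion ({criteria}) from 0.0 to 1.0. "
        "Reply with a JSON object mapping each criterion to its score.\n"
        f"Question: {question}\nAnswer: {answer}"
    )


def judge(question: str, answer: str,
          call_llm: Callable[[str], str], threshold: float = 0.7) -> dict:
    # call_llm is a hypothetical hook: it takes a prompt and returns the
    # judge model's raw text reply.
    raw = call_llm(build_judge_prompt(question, answer))
    reply = json.loads(raw)
    missing = [c for c in RUBRIC if c not in reply]
    if missing:
        raise ValueError(f"judge reply is missing criteria: {missing}")
    scores = {c: float(reply[c]) for c in RUBRIC}
    mean = sum(scores.values()) / len(scores)
    return {"scores": scores, "mean": mean, "passed": mean >= threshold}


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, with fixed scores for demonstration.
    return json.dumps({"factual_accuracy": 0.9,
                       "citation_accuracy": 0.8,
                       "completeness": 0.7})
```

With the canned reply, `judge("q", "a", fake_llm)` yields a mean score of roughly 0.8 and `passed` set to `True`. Because the judge grades the final output against a rubric instead of checking a fixed sequence of steps, it tolerates the varied paths that different agent runs can take.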