How Does Anthropic Build Multi-Agent Systems?

Core Viewpoint
- Anthropic's multi-agent research system significantly enhances research capabilities by letting multiple Claude agents collaborate, achieving a 90.2% performance improvement over a single Claude Opus 4 agent, at the cost of increased token usage [1][9][10].

Group 1: System Architecture and Performance
- The multi-agent system consists of a main agent that analyzes user needs and creates several sub-agents to explore different dimensions of information simultaneously, reducing research time from hours to minutes [1][15].
- The system's performance depends heavily on token usage: multi-agent systems consume about 15 times as many tokens as standard chat interactions [10][11].
- Internal evaluation indicates that the multi-agent system excels at broad queries that require exploring multiple directions simultaneously [9][28].

Group 2: Engineering Principles and Challenges
- Eight engineering principles were identified during development of the multi-agent system, emphasizing clear resource allocation, new evaluation methods, and the importance of state management in production environments [2][6][20].
- The architecture follows an orchestrator-worker model, in which the main agent coordinates the process and directs specialized sub-agents to work in parallel [12][15].
- Challenges include managing the complexity of coordination among agents, ensuring effective task distribution, and addressing the bottleneck caused by synchronous execution [35][36].

Group 3: User Applications and Insights
- The most common use cases for the research functionality include developing cross-disciplinary software systems (10%), optimizing technical content (8%), and assisting in academic research (7%) [3][39].
- The insights gained from the development process offer lessons for technology teams exploring AI agent applications, highlighting the importance of thoughtful engineering and design [3][6].

Group 4: Evaluation and Reliability
- Evaluating multi-agent systems requires flexible methods that assess both the correctness of outcomes and the reasonableness of the processes used to reach them [28][30].
- Using LLMs as evaluators allows scalable assessment of outputs against criteria such as factual accuracy and tool efficiency [30][31].
- Reliability is improved through careful monitoring of decision patterns and interactions among agents, so that small changes do not lead to significant unintended consequences [33][34].
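
The orchestrator-worker model summarized under Group 2 can be illustrated with a minimal Python sketch. This is not Anthropic's implementation: `plan_subtasks`, `run_subagent`, and `orchestrate` are hypothetical names, the sub-agent is a stub that only simulates work, and no model or tool calls are made. The sketch only shows the control flow: a main agent splits a query into subtasks, runs the workers concurrently, and then merges their findings.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubTaskResult:
    subtask: str
    findings: str


def plan_subtasks(query: str) -> list[str]:
    """Hypothetical planner; a real main agent would ask a model to decompose the query."""
    aspects = ("background", "recent work", "open problems")
    return [f"{query}: {aspect}" for aspect in aspects]


async def run_subagent(subtask: str) -> SubTaskResult:
    """Hypothetical worker; stands in for a sub-agent that would call a model and tools."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work such as a search request
    return SubTaskResult(subtask=subtask, findings=f"notes on {subtask}")


async def orchestrate(query: str) -> str:
    subtasks = plan_subtasks(query)
    # Workers run concurrently, but the orchestrator waits for the whole batch
    # before continuing, so the slowest worker sets the pace.
    results = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    # gather returns results in the order the subtasks were submitted.
    return "\n".join(f"- {r.subtask}: {r.findings}" for r in results)


if __name__ == "__main__":
    print(asyncio.run(orchestrate("multi-agent systems")))
```

Because `asyncio.gather` returns only after every worker has finished, the sketch also shows in miniature the synchronous-execution bottleneck listed among the challenges: one slow sub-agent delays the entire batch.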