Agent Foundation Model (AFM)
Chain-of-Agents: OPPO's New Paradigm for General Agent Models — SOTA on Multiple Benchmarks, with Model, Code, and Data Fully Open-Sourced
机器之心· 2025-08-23 04:42
Core Insights
- The article introduces a novel agent reasoning paradigm called Chain-of-Agents (CoA), which improves multi-agent collaboration and efficiency over traditional multi-agent systems (MAS) [2][6][36]
- CoA dynamically activates multiple roles and tools within a single model, enabling end-to-end multi-agent collaboration without complex prompt and workflow designs [6][36]

Limitations of Traditional MAS
- High computational costs due to frequent redundant communication and complex workflow designs [3]
- Limited generalization: new tasks require extensive prompt design and workflow configuration [3]
- Lack of data-driven learning, making it difficult to improve performance from task data [3]

Advantages of CoA and AFM
- CoA reduces communication overhead and supports end-to-end training, significantly improving system efficiency and generalization [6][36]
- The Agent Foundation Model (AFM) delivers superior performance across nearly 20 complex tasks, achieving a 55.4% success rate on the GAIA benchmark with a 32B model [6][24]
- AFM cuts reasoning costs (token consumption) by up to 85.5% while maintaining leading performance [6]

CoA Architecture
- CoA uses a hierarchical agent architecture with two core components: role-playing agents (Thinking, Planning, Reflection, Verification) and tool agents (Search, Crawl, Code) [10][13]
- The framework supports diverse types of agent reasoning and task execution [10]

Training Framework
- A specialized CoA fine-tuning framework builds AFM through task data collection, multi-agent capability distillation, supervised fine-tuning, and reinforcement learning [11][14]
- Approximately 87,000 structured task-solving trajectories were generated for training [15]

Experimental Validation
- AFM models show robust performance on multi-hop question answering (MHQA) tasks, setting new state-of-the-art results across multiple datasets [19][22]
- On mathematical reasoning tasks, AFM-RL-32B achieved an average accuracy of 78.0%, outperforming existing models [26]

Efficiency Analysis
- AFM shows clear advantages in tool-calling efficiency and reasoning cost, requiring fewer tool calls and lower token consumption per successful task [31][33]
- Test-time scaling performance is validated across multiple benchmarks, demonstrating robust generalization and reasoning capabilities [31]

Future Directions
- Exploring dynamic role generation to improve adaptability to unknown tasks [39]
- Integrating cross-modal tool fusion to expand application scenarios beyond text-based tools [39]
- Developing efficient memory mechanisms for long-horizon tasks to reduce repeated reasoning costs [39]
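The single-model, multi-role design described above can be pictured as a dispatch loop: the model emits tagged segments, role segments (think, plan, reflect, verify) stay internal, and only tool segments trigger external calls. This is a minimal sketch; the tag names, tool set, and dispatch interface are illustrative assumptions, not AFM's released trajectory format.

```python
import re

# Hypothetical tool agents; real Search/Crawl/Code agents would hit external systems.
TOOL_AGENTS = {
    "search": lambda q: f"[search results for: {q}]",
    "crawl": lambda url: f"[page content of: {url}]",
    "code": lambda src: f"[execution output of: {src}]",
}

ROLE_TAGS = {"think", "plan", "reflect", "verify"}  # role-playing agents

def run_coa_step(model_output: str) -> list:
    """Parse one model emission into (agent, payload) events.

    Role segments are kept as internal reasoning; tool segments are
    dispatched to the matching tool agent and their observations recorded.
    """
    events = []
    # Matches <tag>...</tag> pairs; \1 backreference requires matching close tags.
    for tag, payload in re.findall(r"<(\w+)>(.*?)</\1>", model_output, re.S):
        if tag in ROLE_TAGS:
            events.append((tag, payload.strip()))
        elif tag in TOOL_AGENTS:
            events.append((tag, TOOL_AGENTS[tag](payload.strip())))
    return events

trace = "<think>Need the GAIA leaderboard.</think><search>GAIA benchmark SOTA</search>"
for agent, payload in run_coa_step(trace):
    print(agent, "->", payload)
```

Because every role and tool activation lives in one model's output stream, the whole trajectory can be trained end-to-end, which is what removes the inter-agent communication overhead of a conventional MAS.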
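The roughly 87,000 structured task-solving trajectories used for supervised fine-tuning can be imagined as records like the following, serialized into a single tagged sequence per example. The field names and markup are assumptions for illustration; the article does not specify the released data schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Step:
    agent: str                        # e.g. "think", "plan", "search", "code"
    content: str                      # reasoning text or tool input
    observation: Optional[str] = None # tool result, present only for tool agents

@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    answer: str = ""

    def to_training_text(self) -> str:
        """Flatten the trajectory into one tagged string for end-to-end SFT."""
        parts = [f"<task>{self.task}</task>"]
        for s in self.steps:
            parts.append(f"<{s.agent}>{s.content}</{s.agent}>")
            if s.observation is not None:
                parts.append(f"<obs>{s.observation}</obs>")
        parts.append(f"<answer>{self.answer}</answer>")
        return "".join(parts)
```

Distilled multi-agent runs in this form give the student model supervision over the full role-and-tool sequence, after which reinforcement learning can optimize the same trajectories against task success.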