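Co-authored by Zhou Jingren: Tongyi Lab open-sources a self-evolving agent system that teaches models to "self-reflect," letting even a 14B model punch above its weight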

Core Insights
- The article covers the launch of AgentEvolver, a self-evolving agent system developed by Alibaba that significantly improves the performance of AI models on complex tasks [2][4].

Performance Improvement
- AgentEvolver raised the average completion rate of a 14B model from 29.8% to 57.6%, nearly doubling it [4].
- For a smaller 7B model, the average completion rate rose from 15.8% to 45.2%, showing that the framework works across model sizes [5].
- After optimization, these models can outperform larger ones (e.g., 32B models) on specific tasks [5].

Learning Efficiency
- AgentEvolver converges quickly: it needs far fewer training steps to reach 90% of baseline model performance, 55.6% fewer on AppWorld tasks and 66.7% fewer on BFCL tasks [7][8].
- This efficiency reduces training time and computational cost [8].

Cross-Domain Generalization
- Models trained on synthetic data maintain high performance when applied to new, unseen domains, indicating strong cross-domain generalization [9][11].
- For instance, a model trained on AppWorld tasks performed well on BFCL tasks with minimal performance degradation [10].

Self-Evolution Mechanism
- AgentEvolver achieves self-evolution through an automated data-exploration-feedback loop driven by three core mechanisms: self-questioning, self-navigating, and self-attributing [13][20].
- The self-questioning mechanism lets the system generate challenging tasks autonomously, removing its reliance on external data [21][23].
- The self-navigating mechanism improves exploration efficiency by using past experience to guide current decisions [24][28].
- The self-attributing mechanism provides fine-grained feedback on each action taken, improving sample efficiency in policy optimization [30][33].
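
To make the three mechanisms concrete, here is a minimal toy sketch of how self-questioning, self-navigating, and self-attributing could fit together in one loop. It is not AgentEvolver's implementation: the real system relies on a language model for each mechanism, whereas this sketch replaces them with simple stand-ins. The tool names, function names, and the `explore_prob` parameter are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical tool names; stand-ins for the APIs an agent could call.
TOOLS = ["search", "open", "read", "summarize", "send"]


def self_question(rng, length=3):
    """Self-questioning stand-in: synthesize a task from the environment itself.

    A task here is just an ordered sequence of tools that must be called.
    """
    return tuple(rng.sample(TOOLS, length))


def self_navigate(task, experience, rng, explore_prob=0.3):
    """Self-navigating stand-in: reuse actions that earned credit before,
    and fall back to random exploration otherwise."""
    known = experience.get(task, {})
    actions = []
    for pos in range(len(task)):
        if pos in known and rng.random() >= explore_prob:
            actions.append(known[pos])         # exploit past experience
        else:
            actions.append(rng.choice(TOOLS))  # explore
    return actions


def self_attribute(task, actions):
    """Self-attributing stand-in: give each step its own credit instead of
    a single success/failure signal for the whole trajectory."""
    return [1.0 if a == t else 0.0 for a, t in zip(actions, task)]


def evolve(num_tasks=5, rounds=30, seed=0):
    """Run the toy loop and return the completion rate of every round."""
    rng = random.Random(seed)
    tasks = [self_question(rng) for _ in range(num_tasks)]
    experience = {}  # task -> {step position: action that earned credit}
    history = []
    for _ in range(rounds):
        completed = 0
        for task in tasks:
            actions = self_navigate(task, experience, rng)
            credits = self_attribute(task, actions)
            slot = experience.setdefault(task, {})
            for pos, (action, credit) in enumerate(zip(actions, credits)):
                if credit > 0:
                    slot[pos] = action  # keep only the steps that helped
            completed += all(c > 0 for c in credits)
        history.append(completed / len(tasks))
    return history
```

Calling `evolve()` returns one completion rate per round. In this toy setting, the per-step credit is what lets partially correct attempts still add to the experience pool, which is the intuition behind the sample-efficiency claim above.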