Experience-Driven Lifelong Learning
AI "workhorses" now learn on the job! Shanghai AI Lab and collaborators release a new self-evolution framework for agents
量子位 · 2025-10-21 23:50
Core Viewpoint
- The article introduces the MUSE framework, which aims to enhance LLM agents by letting them accumulate experience and evolve continuously, addressing the challenges of long-horizon tasks and memory limitations [1][5].

Group 1: MUSE Framework Overview
- MUSE stands for Memory-Utilizing and Self-Evolving; it is designed as a closed-loop system in which LLM agents learn from experience and evolve over time [5].
- The framework is built around a hierarchical memory module that organizes experience at different levels: strategic, procedural, and tool memory [7][8].

Group 2: Key Mechanisms of MUSE
- The first step is the hierarchical memory module, which lets agents retain and apply historical knowledge, overcoming the "forgetfulness" of conventional LLMs [7].
- The second step is self-reflection: agents evaluate their task execution and convert raw execution trajectories into structured experiences, refining their standard operating procedures (SOPs) [10][11].
- The third step is self-evolution: agents improve continuously through a cycle of planning, execution, reflection, and experience extraction (see the sketch after this summary) [13][15].

Group 3: Experimental Results
- MUSE achieved state-of-the-art (SOTA) performance on the TAC benchmark with a score of 51.78%, surpassing existing methods built on larger models [16].
- Accumulating experience improves performance over time, pointing to the framework's potential for long-horizon productivity tasks [19].

Group 4: Future Prospects
- The MUSE framework marks a shift toward experience-driven lifelong learning for AI agents, moving beyond static evaluation of models [29].
- Future research directions include optimizing memory, enriching experience sources, integrating human feedback, and developing comprehensive evaluation standards for long-term tasks [30][31].
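To make the closed loop described in Group 2 concrete, below is a minimal Python sketch of how a plan-execute-reflect-store cycle over a hierarchical memory could be wired together. This is an illustration under assumptions, not the MUSE implementation: the names HierarchicalMemory, Experience, run_task, and reflect, the keyword-match retrieval, and the three-tier layout as plain Python containers are all hypothetical, and the LLM-backed planner, executor, and reflector are passed in as stand-in callables.

```python
# Minimal sketch of a MUSE-style closed loop: plan -> execute -> reflect -> store.
# All names here (HierarchicalMemory, Experience, run_task, ...) are illustrative
# assumptions, not the framework's actual API.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Experience:
    """A structured experience distilled from a raw execution trajectory."""
    task: str
    sop: list[str]   # refined standard operating procedure, step by step
    outcome: str     # e.g. "success" or a short failure diagnosis

@dataclass
class HierarchicalMemory:
    """Three tiers of experience, mirroring the strategic/procedural/tool split."""
    strategic: list[str] = field(default_factory=list)          # high-level task strategies
    procedural: list[Experience] = field(default_factory=list)  # reusable SOPs
    tool: dict[str, str] = field(default_factory=dict)          # notes on how to call tools

    def retrieve(self, task: str) -> list[Experience]:
        # Naive keyword overlap stands in for whatever retrieval the real system uses.
        return [e for e in self.procedural if any(w in e.task for w in task.split())]

    def store(self, exp: Experience) -> None:
        self.procedural.append(exp)

def run_task(task: str,
             memory: HierarchicalMemory,
             plan: Callable[[str, list[Experience]], list[str]],
             execute: Callable[[list[str]], tuple[str, str]],
             reflect: Callable[[str, str], Experience]) -> str:
    """One iteration of the self-evolving loop.

    plan/execute/reflect are stand-ins for LLM-backed components:
      - plan: draft steps for the task, conditioned on retrieved experiences
      - execute: run the steps, returning (trajectory, result)
      - reflect: convert the raw trajectory into a structured Experience
    """
    prior = memory.retrieve(task)             # reuse relevant past experience
    steps = plan(task, prior)                 # planning informed by memory
    trajectory, result = execute(steps)       # long-horizon execution
    memory.store(reflect(task, trajectory))   # self-reflection -> new experience
    return result
```

Repeated calls to run_task on related tasks grow the procedural tier, which is the sense in which accumulated experience can translate into better performance on later tasks; the sketch leaves out retrieval ranking, memory pruning, and tool-memory updates that a real system would need.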