New Stanford Paper: Fine-Tuning Is Dead, Long Live Autonomous Context
量子位 · 2025-10-10 11:24
Core Insights
- The article covers a new research study that challenges traditional fine-tuning in AI, proposing Agentic Context Engineering (ACE), an approach that lets models improve without retraining [2][3].

Group 1: ACE Framework
- ACE lets the context evolve autonomously: the system generates, reflects on, and edits its own prompts, forming a self-improving loop [5].
- The framework targets two failure modes of conventional context adaptation: simplification bias, which drops critical details, and context collapse, where repeated wholesale rewrites erode useful information [10][11].
- ACE treats context as a dynamic operational manual that continuously accumulates and refines strategies over time [13].

Group 2: Roles in ACE
- The framework splits the work across three distinct roles: Generator, Reflector, and Curator [21].
- The Generator produces reasoning trajectories for new queries, surfacing effective strategies and common errors [16].
- The Reflector evaluates these trajectories to distill lessons, refining them over multiple iterations [17].
- The Curator synthesizes the insights into structured context updates, allowing multiple incremental changes to be merged in parallel [18].

Group 3: Performance Results
- Across agent and financial-analysis scenarios, ACE consistently outperforms baselines including the base LLM, ICL, GEPA, and Dynamic Cheatsheet [22].
- In agent testing on AppWorld, ACE leads ReAct+ICL by 12.3% and ReAct+GEPA by 11.9% [23].
- In financial analysis, when given ground-truth answers from the training set, ACE improves average accuracy by 10.9% over ICL, MIPROv2, and GEPA [26].
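The three-role division of labor can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the LLM calls are stubbed out, and the `Playbook` type and function names are assumptions for the sake of the example.

```python
# Hypothetical sketch of the Generator -> Reflector -> Curator loop.
# In the real system each role is backed by a language model and the
# "context" is a growing playbook of strategy bullets.

from dataclasses import dataclass, field

@dataclass
class Playbook:
    """Evolving context: a list of accumulated strategy bullets."""
    bullets: list[str] = field(default_factory=list)

def generator(query: str, playbook: Playbook) -> str:
    """Produce a reasoning trajectory for the query, guided by the playbook.
    (Stub: a real Generator would call an LLM with the playbook as context.)"""
    return f"trajectory for {query!r} using {len(playbook.bullets)} bullets"

def reflector(trajectory: str) -> list[str]:
    """Distill concrete lessons from a trajectory.
    (Stub: a real Reflector would critique successes and errors.)"""
    return [f"lesson learned from: {trajectory}"]

def curator(playbook: Playbook, lessons: list[str]) -> Playbook:
    """Merge lessons as incremental additions rather than rewriting the
    whole context, which is how ACE avoids context collapse."""
    new = [les for les in lessons if les not in playbook.bullets]  # deduplicate
    return Playbook(playbook.bullets + new)

def ace_step(query: str, playbook: Playbook) -> Playbook:
    trajectory = generator(query, playbook)
    lessons = reflector(trajectory)
    return curator(playbook, lessons)

pb = Playbook()
for q in ["task-1", "task-2"]:
    pb = ace_step(q, pb)
print(len(pb.bullets))  # the playbook grows monotonically: 2
```

The key design choice the sketch mirrors is that the Curator appends and deduplicates instead of regenerating the context wholesale, so earlier lessons are never overwritten.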
Group 4: Efficiency Improvements
- ACE sharply reduces adaptation cost: in offline tasks it cuts adaptation latency by 82.3% and the number of attempts by 75.1% compared with GEPA [29].
- In online adaptation scenarios, ACE reduces latency by 91.5% and saves 83.6% in token input and generation cost relative to Dynamic Cheatsheet [30].
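A plausible mechanism behind these latency and token savings is that ACE applies small itemized edits to the context instead of regenerating it wholesale, so cost scales with the size of the edit rather than the size of the context. The sketch below illustrates that idea; the delta operation names (ADD/REMOVE) and the bullet-indexed layout are illustrative assumptions, not the paper's format.

```python
# Hedged sketch: applying incremental "delta" edits to a bullet-indexed
# context. Each delta touches one bullet, so the work is proportional to
# len(deltas), not to the full context size (unlike a monolithic rewrite).

def apply_deltas(context: dict[int, str],
                 deltas: list[tuple[str, int, str]]) -> dict[int, str]:
    """Return a new context with the given edits applied.

    Each delta is (op, key, text): "ADD" inserts or replaces a bullet,
    "REMOVE" drops one. The input context is left untouched.
    """
    ctx = dict(context)  # copy so the original stays intact
    for op, key, text in deltas:
        if op == "ADD":
            ctx[key] = text
        elif op == "REMOVE":
            ctx.pop(key, None)
        else:
            raise ValueError(f"unknown delta op: {op!r}")
    return ctx

ctx = {0: "check API schema first", 1: "retry on timeout"}
deltas = [("ADD", 2, "cache auth tokens"), ("REMOVE", 1, "")]
new_ctx = apply_deltas(ctx, deltas)
print(sorted(new_ctx))  # [0, 2]
```

Because each delta is independent of the others, batches of updates can be merged in parallel, which lines up with the Curator's parallel integration of incremental changes described above.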