Workflow
SIGIR 2025 | Tackling the Scalability and Transferability Challenges: Huawei Singapore Proposes InstructRAG, with Gains of up to 19%
机器之心·2025-05-23 06:49

Core Viewpoint
- The article presents InstructRAG, a retrieval-augmented generation (RAG) framework that enhances the task-planning capabilities of large language models (LLMs) by addressing scalability and transferability challenges [1][2][30].

Group 1: Challenges in Task Planning
- Scalability is the ability to expand the instruction graph by combining existing instructions into new sequences, so that LLMs can tackle tasks for which no predefined path exists [1][2].
- Transferability is the ability of the model to adapt quickly to new tasks and to learn effectively from only a few examples [2].

Group 2: InstructRAG Framework Components
- The InstructRAG framework consists of three main components:
  1. Instruction Graph, which organizes past instruction paths [4].
  2. RL-Agent, a reinforcement learning agent that expands graph coverage [4].
  3. ML-Agent, a meta-learning agent that strengthens generalization to new tasks [4].

Group 3: Instruction Graph
- The Instruction Graph is a directed graph that organizes past instruction paths, where nodes represent instruction sets and edges represent tasks [6] (a minimal data-structure sketch follows this summary).

Group 4: RL-Agent Functionality
- The RL-Agent formulates node selection on the instruction graph as a Markov Decision Process (MDP), exploring node combinations beyond the stored paths to improve scalability [7] (see the RL-Agent sketch below).
- Its state, action, reward, and policy-learning components are designed to optimize the selection of instruction paths [8].

Group 5: ML-Agent Functionality
- The ML-Agent improves transferability by selecting the most relevant paths from the RL-Agent's candidates and generating prompts for the LLM [9] (see the prompt-construction sketch below).
- Its training consists of a pre-training phase followed by a fine-tuning phase [10][11].

Group 6: Overall Framework and Training
- The overall pipeline comprises training, few-shot learning, and testing phases, with the RL-Agent providing scalability and the ML-Agent providing transferability [13][16].

Group 7: Experimental Results
- InstructRAG outperformed all baselines across multiple datasets, achieving a 19.2% improvement over the best baseline method across various tasks [22][30].
- The framework generalized well to unseen tasks and remained effective when only a few examples were available [23][28].

Group 8: Robustness and Component Importance
- InstructRAG is robust to noise: at a 50% noise level its performance dropped by only 11.1%, versus a 27.2% drop for the baseline [25].
- Ablation studies show that each component contributes significantly to overall performance [26][27].

Group 9: Future Directions
- Future work will focus on further improving the generalization capabilities of InstructRAG [30].
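The summary describes the instruction graph only at a high level: nodes are instruction sets and edges correspond to the tasks that connect them. Below is a minimal Python sketch of such a structure; the class and method names (InstructionGraph, add_path, neighbors) are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict

class InstructionGraph:
    """Illustrative directed graph: nodes are (frozen) instruction sets,
    edges record which tasks link one instruction set to the next."""

    def __init__(self):
        self.nodes = set()                   # each node is a frozenset of instructions
        self.adj = defaultdict(set)          # node -> successor nodes
        self.edge_tasks = defaultdict(set)   # (src, dst) -> ids of tasks using that edge

    def add_path(self, task_id, instruction_sets):
        """Merge a past instruction path that solved `task_id` into the graph."""
        path = [frozenset(s) for s in instruction_sets]
        self.nodes.update(path)
        for src, dst in zip(path, path[1:]):
            self.adj[src].add(dst)
            self.edge_tasks[(src, dst)].add(task_id)

    def neighbors(self, node):
        """Candidate next nodes when extending a path from `node`."""
        return self.adj[node]
```

Because paths share nodes, existing instruction sets can be recombined into sequences that were never stored as a whole, which is what the article calls scalability.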
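The RL-Agent is described as an MDP over graph nodes with state, action, reward, and policy learning, but the summary does not give the exact formulation. The sketch below assumes a simple tabular Q-learning setup with an epsilon-greedy policy, where the state is a (task, current node) pair, an action picks a successor node, and the reward reflects whether the assembled path solves the task; treat it as a toy stand-in rather than the paper's agent.

```python
import random

class RLAgent:
    """Toy tabular agent that walks the instruction graph to build candidate paths."""

    def __init__(self, epsilon=0.2, lr=0.1, gamma=0.9):
        self.eps, self.lr, self.gamma = epsilon, lr, gamma
        self.q = {}  # (state, action) -> estimated value

    def select(self, state, candidates):
        """Epsilon-greedy choice of the next node among the graph successors."""
        if random.random() < self.eps:
            return random.choice(candidates)          # explore
        return max(candidates, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state, next_candidates):
        """One-step Q-learning backup driven by downstream task success or failure."""
        best_next = max((self.q.get((next_state, a), 0.0) for a in next_candidates),
                        default=0.0)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.lr * (reward + self.gamma * best_next - old)
```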
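According to the summary, the ML-Agent selects the most relevant paths from the RL-Agent's candidates and turns them into a prompt for the LLM, and it is trained with a pre-training phase followed by fine-tuning. The sketch below covers only the inference-time step of ranking candidates and assembling an in-context prompt; `build_prompt` is a hypothetical helper name, and `score_fn` is a placeholder for the learned meta-learning scorer.

```python
def build_prompt(task, candidate_paths, score_fn, top_k=3):
    """Rank candidate instruction paths with a (learned) scorer and assemble
    the best ones into an in-context prompt for the LLM."""
    ranked = sorted(candidate_paths, key=lambda p: score_fn(task, p), reverse=True)
    examples = "\n\n".join(
        f"Example {i + 1}:\n" + "\n".join(f"- {step}" for step in path)
        for i, path in enumerate(ranked[:top_k])
    )
    return (f"Task: {task}\n\n"
            f"Relevant instruction paths:\n\n{examples}\n\n"
            f"Using these as references, plan the steps to complete the task.")
```

In the article's pipeline, the scorer would be pre-trained across many tasks and then fine-tuned on a handful of examples of a new task, which is where the few-shot transferability comes from.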