Core Viewpoint
- The article discusses the "seesaw" dilemma faced by deep-thinking large models: frequent calls to search tools improve accuracy but drive up computational cost and slow inference. The proposed LightSearcher framework addresses this with an efficient RL optimization technique based on experiential memory, allowing the model to optimize its tool usage autonomously without relying on additional data [1][9].

Group 1
- The LightSearcher framework maintains accuracy comparable to the SOTA baseline ReSearch while reducing search-tool calls by 39.6%, inference time by 48.6%, and token consumption by 21.2% [2].
- The DeepSeek-R1 model can handle complex reasoning tasks, with DeepSearch serving as its core search capability, enhancing reasoning depth and factual reliability by accessing up-to-date, domain-specific knowledge [4].
- High-frequency calls to external search tools improve the accuracy of real-time information but introduce significant reasoning delays, with wait times reaching several minutes [5][7].

Group 2
- The article identifies significant flaws in existing methods, including reliance on manual labeling, excessive tool calls for simple queries, and a failure to balance accuracy against efficiency [10][11][12].
- The LightSearcher framework introduces three key components: Contrastive Experiential Reasoning, which builds a dynamic memory library; Adaptive Reward Shaping, which balances accuracy and efficiency (a hedged sketch of such a reward follows this summary); and an RL training mechanism that guides the model toward efficient trajectories [15][18].
- Experimental results show that LightSearcher achieves top-tier accuracy, with an F1 score of 54.1, and generalizes well across query difficulties [22][23].

Group 3
- Removing the experiential component led to a 7.2% drop in F1 score, highlighting its critical role in the framework; a sketch of such an experience memory appears at the end of this piece [24].
- The framework addresses key pain points of existing DeepSearch methods, offering a new path toward efficient and reliable deep-reasoning systems [26][27].
- LightSearcher is expected to expand beyond multi-hop QA to areas such as code synthesis and strategic planning in the future [26].
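To make the accuracy-versus-efficiency trade-off concrete, here is a minimal Python sketch of an efficiency-aware reward of the kind Adaptive Reward Shaping describes. LightSearcher's actual formulation is not given in the summary above, so the coefficients, the correctness gate, and all names (`TrajectoryStats`, `shaped_reward`, `alpha`, `beta`, `accuracy_floor`) are illustrative assumptions, not the paper's method.

```python
# Hedged sketch: accuracy term minus efficiency penalties, gated on a
# minimum answer quality so correctness is never traded for fewer searches.
# Coefficients and names are assumptions for illustration only.
from dataclasses import dataclass


@dataclass
class TrajectoryStats:
    answer_f1: float      # overlap between predicted and gold answer, in [0, 1]
    tool_calls: int       # number of search-tool invocations in the rollout
    output_tokens: int    # tokens generated during reasoning


def shaped_reward(stats: TrajectoryStats,
                  alpha: float = 0.05,
                  beta: float = 1e-4,
                  accuracy_floor: float = 0.5) -> float:
    """Reward = answer quality minus penalties for tool calls and tokens."""
    if stats.answer_f1 < accuracy_floor:
        # Below the floor, optimize accuracy alone; no efficiency pressure yet.
        return stats.answer_f1
    cost = alpha * stats.tool_calls + beta * stats.output_tokens
    return stats.answer_f1 - cost


# Example: the same correct answer reached with 2 searches and 600 tokens
# scores higher than one reached with 6 searches and 1800 tokens.
efficient = shaped_reward(TrajectoryStats(0.9, 2, 600))    # 0.74
wasteful = shaped_reward(TrajectoryStats(0.9, 6, 1800))    # 0.42
assert efficient > wasteful
```

Gating the penalties on a minimum answer quality is one plausible way to keep an RL policy from sacrificing correctness for fewer searches, consistent with the reported result that accuracy stays comparable to ReSearch while tool calls drop by 39.6%.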
Experiential memory black tech! LightSearcher cuts AI tool calls by 39.6% and speeds up inference by 48.6%
QbitAI · 2025-12-18 09:26
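For illustration, below is a minimal sketch of a contrastive experience memory of the kind the Contrastive Experiential Reasoning component suggests: it stores paired "efficient versus wasteful" trajectory summaries for solved queries and retrieves the closest ones as a prompt prefix. All class and method names (`Experience`, `ExperienceMemory`, `retrieve`, `as_prompt`) and the lexical retrieval are assumptions made for this sketch, not LightSearcher's actual data structures.

```python
# Hedged sketch of a contrastive experience memory. A real system would
# likely use embedding-based retrieval; token overlap keeps the sketch
# self-contained.
from dataclasses import dataclass
from typing import List


@dataclass
class Experience:
    query: str
    efficient_trace: str   # short summary of a low-cost successful rollout
    wasteful_trace: str    # short summary of a high-cost rollout on the same query


class ExperienceMemory:
    def __init__(self) -> None:
        self._items: List[Experience] = []

    def add(self, exp: Experience) -> None:
        self._items.append(exp)

    def retrieve(self, query: str, k: int = 2) -> List[Experience]:
        # Crude lexical similarity: rank stored experiences by shared words.
        q = set(query.lower().split())
        scored = sorted(
            self._items,
            key=lambda e: len(q & set(e.query.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def as_prompt(self, query: str) -> str:
        """Render retrieved pairs as a contrastive prefix for the policy model."""
        lines = []
        for e in self.retrieve(query):
            lines.append(f"Past query: {e.query}")
            lines.append(f"  Efficient strategy: {e.efficient_trace}")
            lines.append(f"  Avoid: {e.wasteful_trace}")
        return "\n".join(lines)
```

Pairing a successful low-cost trajectory with a wasteful one for the same query gives the policy an explicit contrast to imitate and to avoid, which is one way a memory of this kind could account for the reported 7.2% F1 drop when the experiential component is removed.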