Agentic RAG
After Manus "Sold Itself": Enterprise Agents Will Only Become More Like Software, Not Magic
AI前线· 2025-12-31 04:33
Author | 水中刀

On December 30, Meta officially announced a full acquisition agreement with Manus, with a price tag running into the billions of dollars, making it the third-largest acquisition in Meta's history, after WhatsApp ($19 billion) and Scale AI (exact amount undisclosed). Manus founder Xiao Hong will become a Meta vice president, and his core technical team will join Meta's AI division as a whole; the company will continue to operate independently in Singapore, with existing subscription services unchanged. Xiao Hong wrote on social media: "Today is a moment I will never forget for the rest of my life."

Judged by either product or marketing, Manus is a textbook startup case of the generative AI era. Yet some industry experts have remarked that staying independent appears to be hard, and that general-purpose Agents are ultimately a game for the giants.

That verdict stems from the fact that landing Agents in enterprises raises many engineering-delivery and product-optimization problems, which are generally the "prerogative of big companies". In ToB scenarios, how often enterprise Agents actually get used is somewhat out of step with the "apparent prosperity" on the capital and brand-marketing side: startups can hardly assemble large engineering teams for long-term R&D around niche scenarios, let alone serve customers hands-on and close the loop on value.

These engineering-delivery and product-optimization problems fall roughly into the following categories: hallucination: the hallucination problems introduced by the model, which perhaps can ...
Search-agent RAG underperforming in practice? UIUC open-sources s3: only 2.4k samples, fast training, strong results
机器之心· 2025-06-17 00:10
Core Insights
- The article discusses the emergence of Agentic RAG (Retrieval-Augmented Generation) as a key method for large language models to access external knowledge, highlighting the limitations of current reinforcement learning (RL) training methods in achieving stable performance [1][8].

Group 1: Development of RAG Systems
- The evolution of RAG systems is categorized into three stages: Classic RAG, Pre-RL-Zero Active RAG, and the RL-Zero stage, with each stage introducing new methodologies to enhance retrieval and generation capabilities [7][8].
- RL-based methods, while promising, face challenges such as misalignment of optimization goals with actual downstream tasks and the coupling of retrieval and generation processes, which complicates performance evaluation [9][12].

Group 2: Limitations of Current RL Methods
- Current RL methods such as Search-R1 and DeepRetrieval rely on Exact Match (EM) as the reward metric, which can lead to suboptimal training outcomes because EM is strict and insensitive to semantic variation [9][10].
- Coupling retrieval and generation during training can obscure where performance improvements come from, making it difficult to discern whether gains are due to better search or to enhanced language generation [11][12].
- Existing evaluation metrics fail to accurately measure the contribution of search quality to overall performance, creating bottlenecks in assessment, training, and generalization [14].

Group 3: Introduction of the s3 Framework
- The s3 framework, proposed by UIUC and Amazon, improves training efficiency and effectiveness by decoupling the search and generation processes, optimizing only the searcher with a new reward function called Gain Beyond RAG (GBR) [1][17].
- s3 is markedly efficient, requiring only 2.4k training samples and achieving performance superior to larger baseline models, with a total training time of just 114 minutes [21][22][25].
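The decoupled reward idea above can be sketched in a few lines. This is a minimal illustration, not the paper's code: `generate_answer` and `accuracy` are hypothetical placeholders standing in for a frozen generator and any quality metric, and the signal rewards only how much the trained searcher's documents improve generation over a fixed naive-RAG baseline.

```python
# Hypothetical sketch of a Gain-Beyond-RAG-style reward.
# The generator is frozen; only the searcher is trained against this signal.

def gbr_reward(question, gold_answer, searcher_docs, naive_rag_docs,
               generate_answer, accuracy):
    """Reward = generation quality with the searcher's documents
    minus quality with a fixed naive-RAG retrieval baseline.

    generate_answer(question, docs) -> answer string (frozen generator)
    accuracy(answer, gold_answer)   -> score in [0, 1]
    """
    answer_searcher = generate_answer(question, searcher_docs)
    answer_baseline = generate_answer(question, naive_rag_docs)
    return accuracy(answer_searcher, gold_answer) - accuracy(answer_baseline, gold_answer)
```

A positive reward means the searcher found documents the baseline missed; a reward of zero means it added nothing beyond naive RAG, which is exactly the behavior the decoupling is meant to expose.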
Group 4: Experimental Results
- In general QA tasks, s3 outperformed both Search-R1 and DeepRetrieval across multiple datasets, showcasing strong generalization [23][25].
- In medical QA tasks, s3 exhibited remarkable cross-domain performance, indicating robustness and adaptability across datasets and contexts [26][27].

Group 5: Design and Optimization Insights
- The design of s3 emphasizes starting retrieval from the original query, which helps maintain focus and improves search outcomes [31].
- The document selection mechanism within s3 significantly reduces token consumption, enhancing efficiency and minimizing noise in the generation process [30][31].
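The document-selection idea in Group 5 can be illustrated with a simple filter. This is an assumed sketch rather than s3's actual mechanism: it keeps only the highest-scoring retrieved passages (the scoring function itself is hypothetical), so the generator reads a shorter, less noisy context.

```python
# Illustrative document-selection step (not the paper's implementation):
# keep only the most relevant retrieved passages to cut the generator's
# token budget and reduce noise in the context.

def select_documents(scored_docs, max_docs=3, min_score=0.5):
    """scored_docs: list of (doc_text, relevance_score) pairs.

    Returns up to `max_docs` passages whose score clears `min_score`,
    ordered highest-scoring first.
    """
    kept = [doc for doc, score in sorted(scored_docs, key=lambda x: -x[1])
            if score >= min_score]
    return kept[:max_docs]
```

The thresholds here are arbitrary; the point is that dropping low-relevance passages before generation is what shrinks token consumption without retraining the generator.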