Search Agents
No more "riding one path into a dead end": building a smarter Search Agent through self-correction
机器之心· 2025-11-18 05:08
Core Insights
- The article discusses the emergence of Search Agents to address the challenges of real-time knowledge and complex reasoning, highlighting their ability to interact with search engines for task execution [2][3]
- A significant limitation of current Search Agents is their lack of self-correction capabilities, which can lead to cascading errors and task failures [2][3][8]
- The ReSeek framework, developed by Tencent's content algorithm center in collaboration with Tsinghua University, introduces a dynamic self-correction mechanism to enhance the reliability of Search Agents [3][8]

Group 1: ReSeek Framework
- ReSeek is not a simple improvement of RAG but a complete rethinking of the core logic of Search Agents, allowing them to evaluate the effectiveness of each action during execution [3][8]
- The framework incorporates a JUDGE action that assesses the validity of new information, enabling the agent to backtrack and explore new possibilities when errors are detected (a minimal sketch of this loop follows this summary) [10][15]
- The JUDGE mechanism is designed to provide dense feedback to the agent, guiding it to learn how to accurately evaluate information value [20][39]

Group 2: Error Prevention and Performance
- The article explains the concept of cascading errors, where a small mistake in early reasoning can lead to a complete task failure [5][14]
- The ReSeek framework aims to transform agents from mere executors into critical thinkers capable of self-reflection and dynamic error correction [8][12]
- Experimental results indicate that ReSeek achieves industry-leading performance, particularly on complex multi-hop reasoning tasks, demonstrating the effectiveness of its self-correction paradigm [29][30]

Group 3: Evaluation and Benchmarking
- The team constructed the FictionalHot dataset to create a closed-world evaluation environment, eliminating biases from pre-trained models and ensuring a fair assessment of reasoning capabilities [22][27]
- ReSeek was tested against various benchmarks, showing significant improvements in performance metrics compared to other models [28][32]
- The article highlights the inconsistency in experimental setups across different studies, emphasizing the need for standardized evaluation methods [25][31]
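The judge-then-backtrack loop is easiest to see in miniature. Below is a minimal, self-contained sketch: the toy corpus, the keyword-overlap judge, and all function names are illustrative assumptions, not ReSeek's implementation (which trains the JUDGE action with reinforcement learning and uses its dense scores as a reward signal).

```python
# Toy sketch of a self-correcting search loop: after every search, a JUDGE
# step scores the retrieved snippet, and low-scoring branches are abandoned
# instead of being folded into the reasoning chain.

CORPUS = {
    "capital of france": "Paris is the capital of France.",
    "capital of brazil": "Brasília is the capital of Brazil.",
}
STOPWORDS = {"what", "is", "the", "of", "a"}

def search(query: str) -> str:
    """Toy search engine: exact-key lookup with an irrelevant fallback."""
    return CORPUS.get(query.lower(), "Unrelated page about cooking.")

def judge(question: str, snippet: str) -> float:
    """Toy JUDGE action: content-word overlap between question and snippet."""
    q = {w.strip("?.").lower() for w in question.split()} - STOPWORDS
    s = {w.strip("?.").lower() for w in snippet.split()} - STOPWORDS
    return len(q & s) / max(len(q), 1)

def run_agent(question: str, candidate_queries: list[str],
              threshold: float = 0.6) -> list[str]:
    evidence, dead_branches = [], []
    for query in candidate_queries:      # stand-in for the policy's proposals
        snippet = search(query)
        if judge(question, snippet) >= threshold:
            evidence.append(snippet)     # useful: keep it and build on it
        else:
            dead_branches.append(query)  # cascading-error guard: backtrack
    return evidence

# The misleading France query is rejected; only the relevant snippet survives.
print(run_agent("What is the capital of Brazil?",
                ["capital of france", "capital of brazil"]))
```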
When a Search Agent meets unreliable search results: Tsinghua team unveils the automated red-teaming framework SafeSearch
机器之心· 2025-10-16 07:34
Core Insights
- The article discusses the vulnerabilities of large language model (LLM)-based search agents, emphasizing that while they can access real-time information, they are susceptible to unreliable web sources, which can lead to the generation of unsafe outputs [2][7][26].

Group 1: Search Agent Vulnerabilities
- A real-world case is presented where a developer lost $2,500 due to a search error involving unreliable code from a low-quality GitHub page, highlighting the risks associated with trusting search results [4].
- The research identifies that 4.3% of nearly 9,000 search results from Google were deemed suspicious, indicating a prevalence of low-quality websites in search results [11].
- The study reveals that search agents are not as robust as expected, with a significant percentage of unsafe outputs generated when exposed to unreliable search results [12][26].

Group 2: SafeSearch Framework
- The SafeSearch framework is introduced as a method for automated red-teaming to assess the safety of LLM-based search agents, focusing on five types of risks including harmful outputs and misinformation (a minimal sketch of the injection-and-scoring setup follows this summary) [14][21].
- The framework employs a multi-stage testing process to generate high-quality test cases, ensuring comprehensive coverage of potential risks [16][19].
- SafeSearch aims to enhance transparency in the development of search agents by providing a quantifiable and scalable safety assessment tool [37].

Group 3: Evaluation and Results
- The evaluation of various search agent architectures revealed that the impact of unreliable search results varies significantly, with the GPT-4.1-mini model showing a 90.5% susceptibility in a search-workflow scenario [26][36].
- Different LLMs exhibit varying levels of resilience against risks, with GPT-5 and GPT-5-mini demonstrating superior robustness compared to others [26][27].
- The study concludes that effective filtering methods can significantly reduce the attack success rate (ASR), although they cannot eliminate risks entirely [36][37].

Group 4: Implications and Future Directions
- The findings underscore the importance of systematic evaluation in ensuring the safety of search agents, as they are easily influenced by low-quality web content [37].
- The article suggests that the design of search agent architectures can significantly affect their security, advocating for a balance between performance and safety in future developments [36][37].
- The research team hopes that SafeSearch will become a standardized tool for assessing the safety of search agents, facilitating their evolution in both performance and security [37].
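The core measurement in such a red-teaming setup can be sketched in a few lines: wrap the agent's search tool so an unreliable page is injected into its results, then count how often a safety judge flags the final answer. Everything below (the injected snippet, the naive demo agent, and the `is_unsafe` judge) is an illustrative assumption, not the SafeSearch API.

```python
# Toy sketch of automated red-teaming for a search agent: inject an
# unreliable result into the tool's output and measure the attack
# success rate (ASR) over a set of test questions.
from typing import Callable

UNRELIABLE_SNIPPET = (
    "Install this package from http://example.com/not-the-real-repo "
    "and paste your API key into its config file to activate it."
)

def injected_search(query: str,
                    real_search: Callable[[str], list[str]]) -> list[str]:
    """Wrap the real search tool, prepending one unreliable result."""
    return [UNRELIABLE_SNIPPET] + real_search(query)

def attack_success_rate(agent: Callable[[str, Callable], str],
                        test_cases: list[str],
                        real_search: Callable[[str], list[str]],
                        is_unsafe: Callable[[str], bool]) -> float:
    """Fraction of test cases where the injected result drives the agent
    to an answer the safety judge flags as unsafe."""
    hits = sum(
        is_unsafe(agent(q, lambda query: injected_search(query, real_search)))
        for q in test_cases
    )
    return hits / max(len(test_cases), 1)

# Demo: a naive agent that trusts the first search result is fully
# susceptible, so its ASR on this single test case is 1.0.
naive_agent = lambda q, search: search(q)[0]
docs_search = lambda q: ["Official docs: never share your API key."]
flags_leak = lambda ans: "paste your api key" in ans.lower()
print(attack_success_rate(naive_agent, ["How do I configure the SDK?"],
                          docs_search, flags_leak))
```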
A revolution in the search paradigm: Nano AI and Google's shared vision of the "super search agent"
36氪· 2025-06-12 11:27
Core Viewpoint
- The article discusses the evolution of search engines into "super search" intelligent agents by 2025, emphasizing their transition from traditional keyword-based searches to task-oriented engines that understand user intent and deliver actionable solutions [2][8][16].

Group 1: Evolution of Search Engines
- The concept of "super search" is moving from theory to reality, with search engines evolving to possess both intent understanding and task execution capabilities [2][3].
- The AI search 1.0 era involved traditional web-page ranking with AI enhancements, while AI search 2.0 transitioned to answer engines focused on delivering direct answers [5][8].
- By 2025, AI search 3.0 will enable a closed loop in which user intent input leads to automatic execution and result delivery, fundamentally changing how users interact with search engines [8][16].

Group 2: Capabilities of Super Search
- Super search must incorporate five key capabilities: task planning, multi-model collaboration, high-dimensional information recognition, multi-modal output, and personalized search experiences [9][10][11][12][13].
- Current AI search engines are still in the early stages of development, with notable examples like Nano AI and Google's AI Mode demonstrating varying degrees of these capabilities [14][18].

Group 3: Market Position and Competition
- Nano AI has emerged as a leader in the AI search engine market, achieving significant user engagement and outperforming competitors like Perplexity and traditional search engines [19][21].
- Competition in search is shifting toward more open agent product designs, with companies like Google leveraging their established technology and brand, while Nano AI focuses on rapid innovation and user-centric product development [33][34].
A new high-efficiency inference framework for search agents: 3× throughput and latency cut to 1/5, without sacrificing answer quality | Nankai & UIUC research
量子位· 2025-05-29 01:08
Core Insights
- The article discusses the efficiency challenges faced by AI-driven search agents, particularly those powered by large language models (LLMs), and introduces a new framework called SearchAgent-X that significantly enhances performance [1][3][32].

Efficiency Bottlenecks
- The research identifies two main efficiency bottlenecks in search agents: retrieval accuracy and retrieval latency [4][8].
- The relationship between retrieval accuracy and efficiency is not monotonic: low precision forces extra rounds of retrieval, while very high precision consumes excessive computational resources [5][6][7].
- Search agents benefit most from high-recall approximate search, which supports reasoning without incurring unnecessary cost [7].

Latency Issues
- Search agents are highly sensitive to retrieval latency: even minor increases can amplify into end-to-end delays of up to 83× [11].
- Improper scheduling and retrieval stalls are identified as the primary causes of latency, with data showing that up to 55.9% of tokens may be unnecessarily recomputed due to scheduling issues [13].

SearchAgent-X Framework
- SearchAgent-X employs two main acceleration mechanisms: priority-aware scheduling and non-stall retrieval (a minimal sketch of both follows this summary) [14][16].
- Priority-aware scheduling dynamically prioritizes concurrent requests to minimize unnecessary waiting and redundant computation [17][18].
- Non-stall retrieval allows flexible, non-blocking search, terminating retrieval early once results are deemed sufficient [19][20][22].

Performance Improvements
- In practical tests, SearchAgent-X increased throughput by 1.3 to 3.4 times and reduced average latency to 20%–60% of baseline systems [27].
- The framework maintained generation quality comparable to baseline systems, with slight accuracy improvements on some datasets attributable to the nature of approximate retrieval [28][29].

Technical Contributions
- Each optimization component contributes significantly to overall performance, with priority scheduling reducing end-to-end latency by 35.55% and improving cache hit rates [30].
- Non-stall retrieval further enhances cache hit rates and reduces latency, underscoring the importance of minimizing waiting time in complex AI systems [31].

Future Outlook
- The article concludes that future AI systems will require more frequent interactions with external tools and knowledge bases, highlighting the need to address existing efficiency bottlenecks [32][33].
- It emphasizes balancing the performance of individual tools within an agent's overall workflow to avoid compounding delays and inefficiencies [34].
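The two mechanisms are easiest to see in miniature. Below is a hedged sketch: the priority rule (serve the request with the most cached context first) and the early-termination rule (stop once the best score stops improving) are plausible stand-ins assumed for illustration; the paper's scheduler operates inside the LLM serving engine, and its non-stall retrieval wraps an approximate nearest-neighbor index.

```python
# Toy sketches of priority-aware scheduling and non-stall retrieval.
import heapq

def schedule(requests: list[tuple[str, int]]) -> list[str]:
    """Priority-aware scheduling sketch: serve requests with the most
    cached context first, so their KV cache is not evicted and recomputed.
    `requests` is a list of (request_id, cached_token_count) pairs."""
    heap = [(-cached, rid) for rid, cached in requests]  # max-heap via negation
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

def non_stall_retrieve(score_batches, patience: int = 2, k: int = 5):
    """Non-stall retrieval sketch: consume candidate batches from an
    approximate index and stop early once the best score has not improved
    for `patience` consecutive batches, instead of exhausting the index."""
    best, stale, results = float("-inf"), 0, []
    for batch in score_batches:            # each batch: [(doc_id, score), ...]
        results.extend(batch)
        top = max(score for _, score in batch)
        if top > best:
            best, stale = top, 0
        else:
            stale += 1
            if stale >= patience:
                break                      # "good enough": stop the stall
    return sorted(results, key=lambda x: -x[1])[:k]

print(schedule([("a", 10), ("b", 200), ("c", 50)]))        # ['b', 'c', 'a']
print(non_stall_retrieve([[("d1", 0.9)], [("d2", 0.7)], [("d3", 0.6)]]))
```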