WebDancer
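Alibaba releases an information retrieval agent that can search the web autonomously, surpassing GPT-4o on the GAIA benchmark | Model and data open-sourced
QbitAI (量子位) · 2025-06-27 04:40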
Core Viewpoint
- Alibaba has introduced WebDancer, an autonomous information retrieval agent that understands and navigates the web like a human, extending the capabilities of traditional models through multi-step reasoning and tool usage [1][3].
Group 1: WebDancer's Capabilities
- WebDancer can perform complex tasks such as web browsing, information searching, and question answering, demonstrating its ability to execute multi-step reasoning [9].
- The model achieved a Pass@3 score of 61.1% on GAIA and 54.6% on WebWalkerQA, outperforming baseline models and some open-source frameworks [4][34].
- WebDancer employs a four-stage training paradigm: data construction, trajectory sampling, supervised fine-tuning, and reinforcement learning, which together enhance its reasoning and decision-making capabilities [10][28].
Group 2: Training Methodology
- The first stage constructs browsing data to create complex QA pairs that require multiple interactions, simulating human behavior [12][15].
- The second stage generates high-quality Thought-Action-Observation trajectories, using a dual-path sampling method for both short and long reasoning chains [20][22].
- The supervised fine-tuning stage integrates these trajectories to teach the model basic task decomposition and tool usage while preserving its original reasoning abilities [25][27].
- The reinforcement learning stage aims to optimize the agent's decision-making and generalization capabilities in real-world web environments [28][30].
Group 3: Performance Analysis
- WebDancer was tested on challenging datasets, including BrowseComp in English and Chinese, where it demonstrated robust capabilities in handling difficult reasoning and information retrieval tasks [36].
- Analysis of the Pass@1 and Pass@3 metrics indicates that reinforcement learning makes the agent significantly more likely to sample a correct response, and that consistency improves notably for language reasoning models [38].
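The Pass@1 and Pass@3 figures quoted above can be illustrated with a minimal sketch. It assumes the common reading of Pass@k, where a question counts as solved if at least one of k sampled attempts is correct; the summary does not spell out the exact evaluation protocol, and the function name and grading data below are hypothetical.

```python
from typing import Dict, List


def pass_at_k(correct: Dict[str, List[bool]], k: int) -> float:
    """Fraction of questions with at least one correct answer among the first k samples."""
    if not correct:
        raise ValueError("no questions given")
    hits = 0
    for qid, flags in correct.items():
        if len(flags) < k:
            raise ValueError(f"question {qid!r} has fewer than {k} samples")
        # any() is True when at least one of the first k attempts was graded correct.
        hits += any(flags[:k])
    return hits / len(correct)


# Hypothetical grading results: three sampled attempts per question.
results = {
    "q1": [True, False, True],
    "q2": [False, False, True],
    "q3": [False, False, False],
    "q4": [True, True, True],
}

print(f"Pass@1 = {pass_at_k(results, 1):.2f}")  # Pass@1 = 0.50
print(f"Pass@3 = {pass_at_k(results, 3):.2f}")  # Pass@3 = 0.75
```

In this toy data, q2 fails on the first attempt but succeeds on the third, so it counts toward Pass@3 only. A gap between the two metrics is what one would expect when correct answers are reachable but not yet sampled reliably.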
Tongyi Lab's latest work, WebDancer: opening a new era of autonomous, agentic Deep Research
Synced (机器之心) · 2025-06-12 06:08
Group 1
- The core viewpoint of the article emphasizes the emergence of WebDancer as a significant advancement in autonomous information retrieval, addressing the challenges of data scarcity and training in open environments [5][10][19].
- The article discusses the increasing demand for intelligent agents capable of multi-step reasoning and decision-making across various fields, highlighting the limitations of existing systems [4][5].
- WebDancer's innovative data synthesis strategies, including CRAWLQA and E2HQA, have successfully generated high-quality training datasets to overcome the scarcity of effective data [12][16].
Group 2
- WebDancer employs a two-phase training strategy, consisting of supervised fine-tuning (SFT) and reinforcement learning (RL), to effectively train agents in dynamic open environments [21][22].
- The article details how WebDancer utilizes the DAPO algorithm for dynamic sampling, enhancing data efficiency and the robustness of the agent's strategies [24][25].
- WebDancer's performance is validated through rigorous testing on challenging datasets like GAIA and WebWalkerQA, demonstrating superior capabilities in complex information retrieval tasks [28][30].
Group 3
- Future developments for WebDancer include integrating more advanced tools and expanding its capabilities to handle complex tasks such as web browsing and API calls [41].
- The article outlines plans to broaden the scope of tasks to include long-text writing, which will require enhanced reasoning and generation capabilities [42].
- The focus on open-source models aims to foster a deeper understanding of agentic models and their scalability in dynamic environments [44][45].
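The dynamic sampling that Group 2 attributes to DAPO can be sketched roughly as follows. This is an illustration under stated assumptions, not WebDancer's training code: it assumes dynamic sampling means discarding prompts whose sampled rollouts all receive the same reward, because such groups yield zero advantage under group-normalized rewards. The function names, prompt IDs, and reward values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def dynamic_sampling_filter(rewards: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Keep only prompts whose rollouts do not all share the same reward."""
    return {prompt: r for prompt, r in rewards.items() if len(set(r)) > 1}


def group_advantages(group: List[float]) -> List[float]:
    """Normalize rewards within one prompt's rollout group (mean 0, unit std)."""
    mu, sigma = mean(group), pstdev(group)
    # sigma > 0 is guaranteed for groups that passed dynamic_sampling_filter.
    return [(r - mu) / sigma for r in group]


# Hypothetical binary rewards for four rollouts per prompt.
rollout_rewards = {
    "p1": [1.0, 1.0, 1.0, 1.0],  # always solved: no learning signal
    "p2": [0.0, 0.0, 0.0, 0.0],  # never solved: no learning signal
    "p3": [1.0, 0.0, 0.0, 1.0],  # mixed outcomes: informative
}

kept = dynamic_sampling_filter(rollout_rewards)
for prompt, group in kept.items():
    print(prompt, group_advantages(group))
```

Only p3 survives the filter here. In a full training loop one would presumably keep drawing new prompts until the batch is filled with informative groups, which is where the claimed gain in data efficiency would come from.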
Alibaba's agent surpasses GPT-4o in multi-turn reasoning; open-source models can do Deep Research too
QbitAI (量子位) · 2025-06-06 04:01
Group 1
- The core viewpoint of the article is the introduction of WebDancer, an advanced autonomous information retrieval agent developed by Tongyi Lab, which addresses the growing demand for multi-step information retrieval capabilities in an era of information overload [1][2][3].
Group 2
- Background: Traditional search engines cannot meet users' needs for deep, multi-step information retrieval across fields such as medical research, technological innovation, and business decision-making [3].
- Challenges: Building autonomous agents is difficult, particularly obtaining the high-quality training data needed for complex multi-step reasoning [4].
Group 3
- Innovative Data Synthesis: WebDancer proposes two data synthesis methods, CRAWLQA and E2HQA, to address data scarcity [5][6].
- ReAct Framework: The agent follows the ReAct framework, a cycle of Thought-Action-Observation in which it generates thoughts, takes structured actions, and receives feedback iteratively [5].
Group 4
- Training Strategies: WebDancer employs a two-phase training strategy, Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL), to enhance the agent's adaptability and decision-making capabilities in dynamic environments [12][13].
- Data Quality Assurance: A multi-stage data filtering strategy ensures high-quality training data, improving the agent's learning efficiency [9][10].
Group 5
- Experimental Results: WebDancer has demonstrated outstanding performance on several information retrieval benchmarks, particularly the GAIA and WebWalkerQA datasets [17][18][19].
- Performance Metrics: The best-performing models achieved a Pass@3 score of 61.1% on the GAIA benchmark and 54.6% on the WebWalkerQA benchmark [20].
Group 6
- Future Prospects: WebDancer aims to integrate more complex tools and expand its capabilities to handle open-domain long-text writing tasks, enhancing the agent's reasoning and generative abilities [29][30].
- Emphasis on Agentic Models: The focus is on developing foundational models that inherently support reasoning, decision-making, and multi-step tool invocation, reflecting a philosophy of simplicity and universality in engineering [30][31].
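The Thought-Action-Observation cycle and the multi-step tool invocation described in this summary can be sketched as a small loop. This is a toy illustration, not WebDancer's implementation: the tool names `search` and `visit`, the `answer` action, the scripted policy, and the example URL are assumptions made for the example, and the language model is replaced by a hard-coded function.

```python
from typing import Callable, Dict, List, Tuple

# A policy maps the history so far to (thought, action_name, action_input).
Policy = Callable[[List[str]], Tuple[str, str, str]]
Tool = Callable[[str], str]


def run_react(question: str, policy: Policy, tools: Dict[str, Tool],
              max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run a Thought-Action-Observation loop until the policy answers or steps run out."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}[{arg}]")
        if action == "answer":  # terminal action: return the final answer
            return arg, history
        tool = tools.get(action)
        observation = tool(arg) if tool else f"unknown tool {action!r}"
        history.append(f"Observation: {observation}")  # feedback for the next step
    return "", history  # step budget exhausted without an answer


def scripted_policy(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for a language model: picks the next step from the observation count."""
    seen = sum(h.startswith("Observation:") for h in history)
    if seen == 0:
        return ("I should search first.", "search", "WebDancer GAIA Pass@3")
    if seen == 1:
        return ("I should open the top result.", "visit", "https://example.com/webdancer")
    return ("The page states the score.", "answer", "61.1%")


# Stub tools returning canned text; a real agent would call a search API and a page fetcher.
tools: Dict[str, Tool] = {
    "search": lambda query: "1 result: https://example.com/webdancer",
    "visit": lambda url: "WebDancer reaches a Pass@3 of 61.1% on GAIA.",
}

answer, trace = run_react("What is WebDancer's Pass@3 on GAIA?", scripted_policy, tools)
print(answer)
print("\n".join(trace))
```

The recorded `trace` has the same shape as the Thought-Action-Observation trajectories that the summaries say are sampled and then used for supervised fine-tuning.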