WebDancer

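Alibaba releases an information retrieval agent that searches the web autonomously and surpasses GPT-4o on the GAIA benchmark | Model and data open-sourced
QbitAI (量子位) · 2025-06-27 04:40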
Core Viewpoint
- Alibaba has introduced WebDancer, an autonomous information retrieval agent capable of understanding and navigating the web like a human, enhancing the capabilities of traditional models through multi-step reasoning and tool usage [1][3].

Group 1: WebDancer's Capabilities
- WebDancer can perform complex tasks such as web browsing, information searching, and question answering, demonstrating its ability to execute multi-step reasoning [9].
- The model achieved a Pass@3 score of 61.1% on GAIA and 54.6% on WebWalkerQA, outperforming baseline models and some open-source frameworks [4][34].
- WebDancer employs a four-stage training paradigm, which includes data construction, trajectory sampling, supervised fine-tuning, and reinforcement learning to enhance its reasoning and decision-making capabilities [10][28].

Group 2: Training Methodology
- The first stage involves constructing browsing data to create complex QA pairs that require multiple interactions, simulating human behavior [12][15].
- The second stage focuses on generating high-quality Thought-Action-Observation trajectories, utilizing a dual-path sampling method for both short and long reasoning chains [20][22].
- The supervised fine-tuning stage integrates these trajectories to teach the model basic task decomposition and tool usage while preserving its original reasoning abilities [25][27].
- The reinforcement learning stage aims to optimize the agent's decision-making and generalization capabilities in real-world web environments [28][30].

Group 3: Performance Analysis
- WebDancer's performance was tested on challenging datasets, including BrowseComp in English and Chinese, where it demonstrated robust capabilities in handling difficult reasoning and information retrieval tasks [36].
- The analysis of Pass@1 and Pass@3 metrics indicates that reinforcement learning significantly improves the sampling of correct responses, while consistency in language reasoning models shows notable improvement [38].
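The Pass@1 and Pass@3 figures quoted in this summary can be read with the commonly used definition of Pass@k: a question counts as solved if at least one of its first k sampled attempts is correct. The summary does not describe the exact evaluation protocol, so the sketch below is an assumption-laden illustration rather than the authors' evaluation code; the function name `pass_at_k` and the toy data are hypothetical.

```python
from typing import Sequence


def pass_at_k(attempts: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of questions with at least one correct answer among the first k attempts.

    `attempts[i][j]` is True when attempt j on question i was judged correct.
    This is the common "any of k" definition, assumed here for illustration.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("need at least one question")
    solved = 0
    for per_question in attempts:
        if len(per_question) < k:
            raise ValueError("every question needs at least k attempts")
        if any(per_question[:k]):
            solved += 1
    return solved / len(attempts)


# Toy data, made up for illustration: 4 questions, 3 attempts each.
toy_attempts = [
    [True, False, True],
    [False, False, True],
    [False, False, False],
    [False, True, False],
]

if __name__ == "__main__":
    print("Pass@1:", pass_at_k(toy_attempts, 1))
    print("Pass@3:", pass_at_k(toy_attempts, 3))
```

On the toy data only the first question is solved on the first attempt, while three of the four are solved within three attempts, which is why Pass@3 is never lower than Pass@1 under this definition.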
Tongyi Lab's latest work, WebDancer: opening a new era of autonomous, intelligent Deep Research
机器之心 (Synced) · 2025-06-12 06:08
Group 1
- The core viewpoint of the article emphasizes the emergence of WebDancer as a significant advancement in autonomous information retrieval, addressing the challenges of data scarcity and training in open environments [5][10][19].
- The article discusses the increasing demand for intelligent agents capable of multi-step reasoning and decision-making across various fields, highlighting the limitations of existing systems [4][5].
- WebDancer's innovative data synthesis strategies, including CRAWLQA and E2HQA, have successfully generated high-quality training datasets to overcome the scarcity of effective data [12][16].

Group 2
- WebDancer employs a two-phase training strategy, consisting of supervised fine-tuning (SFT) and reinforcement learning (RL), to effectively train agents in dynamic open environments [21][22].
- The article details how WebDancer utilizes the DAPO algorithm for dynamic sampling, enhancing data efficiency and the robustness of the agent's strategies [24][25].
- WebDancer's performance is validated through rigorous testing on challenging datasets like GAIA and WebWalkerQA, demonstrating superior capabilities in complex information retrieval tasks [28][30].

Group 3
- Future developments for WebDancer include integrating more advanced tools and expanding its capabilities to handle complex tasks such as web browsing and API calls [41].
- The article outlines plans to broaden the scope of tasks to include long-text writing, which will require enhanced reasoning and generation capabilities [42].
- The focus on open-source models aims to foster a deeper understanding of agentic models and their scalability in dynamic environments [44][45].
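The dynamic sampling idea attributed to DAPO in the summary can be illustrated with a small sketch. As generally described, prompts whose sampled rollouts all receive the same reward (all correct or all wrong) give a zero group-relative advantage and therefore no learning signal, so they are filtered out and replaced by further sampling. The summary does not say how WebDancer configures this, so everything below is an assumption: the function names, the 0/1 rewards, and the prompt IDs are hypothetical, and the resampling step is omitted.

```python
from typing import Dict, List


def dynamic_sampling_filter(groups: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Keep only prompts whose rollouts do not all share the same reward."""
    kept: Dict[str, List[float]] = {}
    for prompt_id, rewards in groups.items():
        # A group with identical rewards has zero variance, hence zero advantage.
        if len(set(rewards)) > 1:
            kept[prompt_id] = rewards
    return kept


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Group-relative advantage: (reward - group mean) / (group std + eps)."""
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = variance ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]


# Toy rollout rewards, made up for illustration (1.0 = correct, 0.0 = wrong).
toy_groups = {
    "q_all_correct": [1.0, 1.0, 1.0, 1.0],
    "q_mixed": [1.0, 0.0, 0.0, 1.0],
    "q_all_wrong": [0.0, 0.0, 0.0, 0.0],
}

if __name__ == "__main__":
    kept = dynamic_sampling_filter(toy_groups)
    for prompt_id, rewards in kept.items():
        print(prompt_id, group_advantages(rewards))
```

Only the mixed-outcome prompt survives the filter in this toy case; in a real training loop the batch would then be topped up with freshly sampled prompts until it is full again.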
Alibaba agent's multi-turn reasoning surpasses GPT-4o; open-source models can do Deep Research too
QbitAI (量子位) · 2025-06-06 04:01
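Contributed by the WebDancer team | QbitAI (量子位) official account

An agent that can complete multi-step information retrieval tasks, covering multi-turn reasoning and continuous action execution, has arrived.

Tongyi Lab has released WebDancer, an autonomous information retrieval agent and the follow-up to WebWalker (ACL 2025).

Through a systematic training paradigm that spans the full pipeline from data construction to algorithm design, WebDancer provides a clear path for building agents with long-horizon information retrieval capabilities.

The framework also offers practical guidance for reproducing Deep Research systems on open-source models. The team will continue to expand and integrate agentic capabilities in more open environments and with more tools, advancing the deployment and evolution of general-purpose agents.

1. Background: new demands and challenges in information retrieval

In an era of information explosion, traditional search engines can no longer meet users' needs for deep, multi-step information acquisition. From medical research to technological innovation, from business decisions to academic exploration, solving complex problems requires in-depth information mining and multi-step reasoning. This has created demand for agents that can think and make decisions autonomously.

However, building such agents faces many challenges:

2. Overcoming the difficulty of obtaining training data

In autonomous information retrieval, high-quality training data is essential. However, existing datasets such as 2WIKI and HotpotQA consist mostly of shallow questions and cannot support the training needs of complex multi-step reasoning.

Data filtering ...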