Alibaba Open-Sources Tongyi DeepResearch, Outperforming Flagship Models from OpenAI and DeepSeek
Sina Tech (Xin Lang Ke Ji) · 2025-09-17 03:33
Core Insights
- Alibaba has open-sourced its first deep research Agent model, Tongyi DeepResearch, which achieved state-of-the-art (SOTA) results on multiple authoritative evaluation sets, surpassing models from OpenAI and DeepSeek [1][2]
- The model, framework, and solutions of Tongyi DeepResearch are fully open-sourced and available for download on GitHub, Hugging Face, and the ModelScope community [1]
- The Tongyi team built a complete training pipeline driven by synthetic data, addressing problems such as "cognitive space congestion" and "irreversible noise pollution" that degrade performance on long-horizon tasks [1]

Performance Metrics
- With 3 billion activated parameters, Tongyi DeepResearch outperforms flagship models such as OpenAI o3, DeepSeek V3.1, and Claude-4-Sonnet across multiple benchmarks [2]
- On the Humanity's Last Exam benchmark, Tongyi DeepResearch scored 32.9, ahead of DeepSeek V3.1 (29.8) and OpenAI o3 (24.9) [2]
- The model also posted strong results on other benchmarks, including BrowseComp-ZH (43.4), GAIA (46.7), and WebWalkerQA (70.9) [2]