Tencent AI Lab Open-Sources a Reproducible Deep Research Agent That Minimizes External Dependencies
量子位 (QbitAI) · 2025-08-06 05:56
Core Insights
- The article discusses the transformative potential of Deep Research Agents, powered by large language models (LLMs) and vision-language models (VLMs), for knowledge discovery and problem-solving [1]
- It highlights a limitation of existing open-source agent frameworks: they rely on paid tools, which restricts their reproducibility and general applicability [2]

Group 1: Cognitive Kernel-Pro Framework
- Tencent AI Lab has launched Cognitive Kernel-Pro, a fully open-source, multi-module, hierarchical agent framework for developing and training deep research agents [4]
- Cognitive Kernel-Pro outperforms the free, open-source framework SmolAgents on the GAIA benchmark suite, and its 8B model surpasses WebDancer and WebSailor-7B on GAIA-text [5]
- The framework's technical report and code are available on GitHub, promoting community engagement and reproducibility [8]

Group 2: Core Design Features
- The framework has a modular, two-layer architecture: a main agent responsible for task decomposition, and multiple sub-agents that each focus on a specific task, which keeps modules independent and the system scalable [11]
- It incorporates a "Progress State" mechanism for structured state management, which tracks completed steps and key information so that complex tasks are handled more efficiently [11]
- Standardized task interfaces let the main agent and sub-agents communicate through simple text interfaces, which facilitates collaboration and debugging [11]
- The framework employs reflection and voting mechanisms to improve task completion quality, particularly in high-variability tasks such as web browsing [11]

Group 3: Innovative Training Methods
- Cognitive Kernel-Pro includes a comprehensive training process covering web navigation, file processing, code generation, and reasoning, with a focus on high-quality data construction [16][17]
- The training data is built from verifiable query-answer pairs and diverse synthetic queries generated from Persona Hub, improving data quality and robustness [17]
- Existing datasets have been refined to align with agent task formats, ensuring relevance to real-world applications [17]

Group 4: Performance Advantages
- Cognitive Kernel-Pro performs strongly in web information retrieval, file processing, and complex reasoning tasks, closely approaching the capabilities of agent frameworks that depend on paid tools [19][20]
- The framework emphasizes the inherent capabilities of LLMs and VLMs, minimizing external dependencies so that it is fully open source [20]
- Comparisons with existing frameworks show that Cognitive Kernel-Pro stands out in both functionality and open-source accessibility [20][22]

Group 5: Future Directions
- In future work, the research team plans to focus on distilling reflection capabilities into a unified agent base model [26]
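
To make the design features listed under Group 2 more concrete, the following is a minimal sketch of a main agent that dispatches text instructions to sub-agents, records results in a progress state, and uses majority voting over repeated attempts. It is not the Cognitive Kernel-Pro code: the names `ProgressState`, `MainAgent`, `majority_vote`, and the stub sub-agents are all hypothetical, the plan is hard-coded rather than produced by an LLM, and the reflection step is omitted.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a sub-agent is modeled as a plain text-in, text-out callable,
# mirroring the "simple text interface" described in the article.
SubAgent = Callable[[str], str]


@dataclass
class ProgressState:
    """Hypothetical structured state: completed steps plus key information."""
    completed: List[str] = field(default_factory=list)
    key_info: Dict[str, str] = field(default_factory=dict)


def majority_vote(candidates: List[str]) -> str:
    """Return the most frequent answer; ties go to the earliest-seen one."""
    return Counter(candidates).most_common(1)[0][0]


class MainAgent:
    """Illustrative main agent: routes each step to a named sub-agent."""

    def __init__(self, sub_agents: Dict[str, SubAgent], attempts: int = 3):
        self.sub_agents = sub_agents
        self.attempts = attempts  # how many times each step is sampled
        self.state = ProgressState()

    def run(self, plan: List[Tuple[str, str]]) -> ProgressState:
        # Each plan entry is (sub_agent_name, text_instruction). A real system
        # would generate this decomposition with an LLM; here it is given.
        for name, instruction in plan:
            agent = self.sub_agents[name]
            answers = [agent(instruction) for _ in range(self.attempts)]
            result = majority_vote(answers)
            # Track what has been done and what was learned.
            self.state.completed.append(instruction)
            self.state.key_info[instruction] = result
        return self.state


def make_flaky_web_agent() -> SubAgent:
    """Stub standing in for a high-variability web-browsing sub-agent."""
    replies = iter(["Paris", "Lyon", "Paris"])
    return lambda instruction: next(replies)


def file_agent(instruction: str) -> str:
    """Stub standing in for a deterministic file-processing sub-agent."""
    return f"processed: {instruction}"


demo = MainAgent({"web": make_flaky_web_agent(), "file": file_agent})
demo_state = demo.run([
    ("web", "find the capital of France"),
    ("file", "summarize report.pdf"),
])
print(demo_state.completed)
print(demo_state.key_info)
```

Because sub-agents only exchange strings with the main agent, each one can be tested or swapped in isolation, and the progress state gives a single place to inspect when debugging a long task. Voting over several attempts is one simple way to dampen the variability of steps such as web browsing.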