Multi-step reasoning and information fusion
Google Steps Up on Agents: Enhanced Gemini Deep Research and a Dedicated API Arrive
量子位 (QbitAI) · 2025-12-12 06:41
Core Insights
- OpenAI and Google are both making significant updates in the AI space, with Google launching an enhanced version of Gemini Deep Research aimed at reducing hallucinations and excelling in complex information retrieval and analysis tasks [1][3][10].

Group 1: Gemini Deep Research Enhancements
- The enhanced Gemini Deep Research is built on Gemini 3 Pro and will soon be integrated into various Google services such as Google Search, NotebookLM, Google Finance, and the upgraded Gemini App [3][8].
- This version of Gemini Deep Research can perform iterative reasoning, allowing it to generate queries, read and integrate search results, and identify knowledge gaps, significantly improving its web search capabilities [10][12].
- In benchmark tests like HLE, BrowseComp, and DeepSearchQA, the enhanced model has achieved state-of-the-art (SOTA) results, showcasing its superior performance in complex research tasks [10][12].

Group 2: DeepSearchQA Benchmark
- Google has released the DeepSearchQA benchmark dataset to provide a more comprehensive evaluation standard for deep search and research tasks, addressing the limitations of existing benchmarks [5][12].
- The dataset includes 900 manually designed causal chain tasks from 17 domains, requiring detailed answer sets, which better measure the model's multi-step reasoning and information fusion capabilities [12].

Group 3: Interactions API
- Google has introduced the Interactions API, designed to provide a unified interface for developers to interact with Gemini 3 Pro and Deep Research agents [6][16].
- This API is particularly suited for scenarios requiring multi-step reasoning, tool invocation, and long-term task execution, enhancing the capabilities of existing models [17][18].
- The Interactions API simplifies workflows and adapts better to developer environments by expanding the core capabilities of content generation and supporting server-side state, interpretable data models, and remote tool support [18].
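
The iterative reasoning described under Group 1 (generate queries, read and integrate results, identify knowledge gaps) can be pictured as a simple loop. The following is only a toy sketch under stated assumptions: it does not call Gemini or the Interactions API, and `TOY_INDEX`, `toy_search`, and `iterative_research` are hypothetical names invented for illustration. It treats a research question as a set of facets to fill in, queries for each open facet, merges what comes back, and stops when no gaps remain or a round makes no progress.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical in-memory "search index": query -> facts found for that query.
# A real agent would issue web searches here; this stub is an assumption.
TOY_INDEX: Dict[str, Dict[str, str]] = {
    "author": {"author": "A. Writer", "year": "2020"},
    "publisher": {"publisher": "Example Press"},
}


def toy_search(query: str) -> Dict[str, str]:
    """Return the facts the toy index holds for a query (empty if none)."""
    return TOY_INDEX.get(query, {})


def iterative_research(
    facets: Set[str],
    search: Callable[[str], Dict[str, str]],
    max_rounds: int = 5,
) -> Tuple[Dict[str, str], Set[str]]:
    """Query for open facets, merge results, and re-check gaps each round."""
    findings: Dict[str, str] = {}
    for _ in range(max_rounds):
        gaps = facets - set(findings)          # identify knowledge gaps
        if not gaps:
            break
        before = len(findings)
        for facet in sorted(gaps):             # generate one query per gap
            if facet in findings:              # filled by an earlier result
                continue
            findings.update(search(facet))     # read and integrate results
        if len(findings) == before:            # no progress: stop early
            break
    return findings, facets - set(findings)


findings, unresolved = iterative_research(
    {"author", "year", "publisher", "isbn"}, toy_search
)
print(findings)
print(unresolved)
```

In this toy run, a single query for `author` also fills the `year` facet, while `isbn` stays unresolved because the index has nothing for it. Returning the unresolved set, rather than guessing, loosely mirrors the article's emphasis on reducing hallucinations.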