ICLR 2026 | Can a Small 7B Model Beat GPT-5? AdaReasoner Brings Active "Visual Tool Thinking" to Agentic Vision
机器之心 (Jiqizhixin) · 2026-02-15 06:46
Core Insights
- The article discusses advances in multi-modal AI reasoning, focusing on the AdaReasoner model, which excels at tool orchestration for visual reasoning tasks and outperforms larger models like GPT-5 by learning when and how to use tools effectively [2][11].

Group 1: AdaReasoner Overview
- AdaReasoner addresses fundamental issues in multi-modal reasoning by treating the decision of what, when, and how to use tools as a reasoning capability [3].
- The model demonstrates significant performance improvements, achieving an average increase of 24.9% across eight benchmarks compared to base models [31].

Group 2: Tool Usage and Learning
- AdaReasoner incorporates a training paradigm that allows models to learn tool usage as a general reasoning skill, enabling them to adopt useful tools, discard irrelevant ones, and adjust calling frequency based on task requirements [16][19].
- The model's design includes three key components: Tool Cold Start (TC), Tool-GRPO (TG), and Adaptive Learning (ADL), which enhance its ability to use tools effectively in various scenarios [20][23][25].

Group 3: Performance Metrics
- AdaReasoner-7B shows remarkable performance, with significant improvements in structured reasoning tasks, achieving near-perfect scores in several benchmarks [31].
- In specific tasks, such as VSP and Jigsaw, the model's performance improved from base scores to 97.64 and 96.60 respectively, surpassing GPT-5's performance [34].

Group 4: Adaptive Tool Behavior
- The model exhibits three adaptive behaviors: adopting useful tools, discarding irrelevant ones, and modulating tool usage frequency based on the context of the task [36][40][44].
- This adaptability allows AdaReasoner to maintain high accuracy while effectively managing tool interactions, demonstrating its capability to learn from reinforcement learning processes [37][41].
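
To make the three behaviors in Group 4 concrete, here is a minimal Python sketch of a tool-calling loop in which the policy, not the harness, decides at each turn whether to call a tool, which one, and when to stop. It is an illustration only, not AdaReasoner's implementation: run_episode, scripted_policy, the tuple-based decision format, and the toy count and upper tools are all hypothetical names invented for this example, and a hand-written function stands in for the trained model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical decision format for this sketch:
# ("call", tool_name, argument) or ("answer", answer_text, None).
Decision = Tuple[str, str, Optional[str]]


def run_episode(
    policy: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_turns: int = 8,
) -> Tuple[str, int]:
    """Run one episode; return (final answer, number of tool calls made)."""
    history = [f"question: {question}"]
    calls = 0
    for _ in range(max_turns):
        kind, name_or_text, arg = policy(history)
        if kind == "answer":
            return name_or_text, calls
        tool = tools.get(name_or_text)
        if tool is None:
            # Unknown tool: feed the error back to the policy instead of crashing.
            history.append(f"error: no tool named {name_or_text}")
            continue
        calls += 1
        history.append(f"{name_or_text} -> {tool(arg or '')}")
    return "", calls  # turn budget exhausted without an answer


def scripted_policy(history: List[str]) -> Decision:
    """Toy stand-in for a model: call 'count' once, then answer with its result."""
    for entry in history:
        if entry.startswith("count -> "):
            return ("answer", entry.split(" -> ", 1)[1], None)
    return ("call", "count", "a b c")


toy_tools = {
    "count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),  # offered to the policy but irrelevant here
}

answer, n_calls = run_episode(scripted_policy, toy_tools, "How many words?")
```

In this toy run the scripted policy adopts count, never touches upper, and stops after a single call, so answer is "3" and n_calls is 1. In the setting the article describes, those same choices would come from the trained model rather than a fixed script.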
Group 5: Generalization and Robustness
- AdaReasoner's use of Adaptive Learning enhances its generalization capabilities, allowing it to transfer learned planning abilities to new tasks and agents [53].
- The model's robustness is evidenced by its ability to perform well even when tool definitions and parameters vary, indicating a strong decoupling of tool planning from surface-level text forms [46].
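
One way to picture "tool definitions and parameters vary" is a probe that renames every tool and parameter while leaving the underlying behavior untouched. The summary does not say how the authors produced such variations, so the Python sketch below is an assumption throughout: the spec format, perturb_specs, restore_call, and the crop and ocr tools are hypothetical names chosen for illustration.

```python
import random
import string
from typing import Any, Dict, List, Set, Tuple

# Hypothetical spec format: {"name": str, "params": {param_name: type_name}}
ToolSpec = Dict[str, Any]
Reverse = Dict[str, Tuple[str, Dict[str, str]]]


def _fresh_name(rng: random.Random, prefix: str, used: Set[str]) -> str:
    """Draw a random identifier that has not been handed out before."""
    while True:
        suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(6))
        candidate = prefix + suffix
        if candidate not in used:
            used.add(candidate)
            return candidate


def perturb_specs(specs: List[ToolSpec], seed: int) -> Tuple[List[ToolSpec], Reverse]:
    """Rename every tool and parameter; return new specs plus a reverse mapping."""
    rng = random.Random(seed)  # seeded so a perturbation can be reproduced
    used: Set[str] = set()
    new_specs: List[ToolSpec] = []
    reverse: Reverse = {}
    for spec in specs:
        new_tool = _fresh_name(rng, "tool_", used)
        new_params: Dict[str, str] = {}
        param_back: Dict[str, str] = {}
        for param_name, type_name in spec["params"].items():
            new_param = _fresh_name(rng, "arg_", used)
            new_params[new_param] = type_name  # types are kept, only names change
            param_back[new_param] = param_name
        new_specs.append({"name": new_tool, "params": new_params})
        reverse[new_tool] = (spec["name"], param_back)
    return new_specs, reverse


def restore_call(name: str, args: Dict[str, Any], reverse: Reverse) -> Tuple[str, Dict[str, Any]]:
    """Map a call written against perturbed names back to the original tool."""
    original_name, param_back = reverse[name]
    return original_name, {param_back[key]: value for key, value in args.items()}


specs = [
    {"name": "crop", "params": {"x": "int", "y": "int"}},
    {"name": "ocr", "params": {"region": "str"}},
]
perturbed, reverse = perturb_specs(specs, seed=0)
first = perturbed[0]
restored = restore_call(first["name"], {p: 1 for p in first["params"]}, reverse)
```

A model whose planning really is decoupled from surface text should choose the same tool for the same job under the renamed specs. restore_call lets a harness execute those calls against the original tools, so accuracy before and after renaming can be compared directly.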