Data Flywheel Mechanism
V-Thinker: Letting Models "Think While Drawing" Like Humans
机器之心 · 2025-12-25 01:20
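Building on the progress above, we ask a further question: can a model, like a human, "think while drawing" as it reasons visually? To answer it, we systematically explore visual interactive reasoning from the perspectives of data, training paradigm, and evaluation:
- We propose V-Thinker, a multimodal reasoning framework for visual interactive reasoning. Training combines cold-start supervised fine-tuning with reinforcement learning, so that the model autonomously generates code and interacts with images during reasoning, which realizes the "think while drawing" style of visual reasoning.
- On the data side, we propose the Data Evolution Flywheel, which automatically synthesizes, evolves, and verifies visual interactive reasoning data along three dimensions: diversity, quality, and difficulty. We further build and open-source the dataset V-Interaction-400K, which provides a foundation for tasks such as visual interactive reasoning and image-to-code conversion.
- On the training side, we design a progressive visual training paradigm: V-Perception-40K is built first to improve the model's visual perception, and a two-stage procedure combining supervised fine-tuning and reinforcement learning then teaches the model to reason through visual interaction.
- On the evaluation side, we build VTBench, an expert-annotated benchmark for visual interactive reasoning scenarios. Experimental results show that, on both interactive reasoning and general reasoning tasks, V-Thinker ...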
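As a rough illustration of the synthesize, evolve, and verify cycle described for the Data Evolution Flywheel, the toy loop below may help. It is only a sketch under stated assumptions: the `Sample` type, the function names, the seed strings, and the placeholder verification rule are all hypothetical and are not taken from V-Thinker or its released code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one visual-interaction reasoning item."""
    question: str
    difficulty: int
    verified: bool = False


def synthesize(seeds):
    """Diversity step: turn each seed concept into a level-1 sample."""
    return [Sample(question=f"Draw and reason about: {s}", difficulty=1) for s in seeds]


def verify(sample):
    """Quality step: placeholder check (non-empty question).

    A real pipeline would presumably run a checker model or execute the
    generated drawing code; that part is not specified in the source.
    """
    return replace(sample, verified=bool(sample.question.strip()))


def evolve(sample):
    """Difficulty step: derive a harder variant from a verified sample."""
    return Sample(
        question=sample.question + " (with one more auxiliary construction)",
        difficulty=sample.difficulty + 1,
    )


def flywheel(seeds, rounds):
    """Run the loop: verified samples from one round seed the next round."""
    pool = [verify(s) for s in synthesize(seeds)]
    frontier = [s for s in pool if s.verified]
    for _ in range(rounds):
        candidates = [verify(evolve(s)) for s in frontier]
        frontier = [s for s in candidates if s.verified]
        pool.extend(frontier)
    return pool


if __name__ == "__main__":
    data = flywheel(["triangle", "bar chart"], rounds=2)
    print(len(data), max(s.difficulty for s in data))
```

The design point is simply that each round's verified output becomes the next round's input, so the pool grows in size and difficulty while unverified items are dropped before they can propagate.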
Apparel, Health and Elderly Care, and Logistics May Become the First Sectors Where Embodied Intelligent Robots Are Deployed
机器人大讲堂 · 2025-08-26 11:56
Core Viewpoint
- The integration of artificial intelligence and robotics is entering a critical phase. Embodied intelligent robots are moving from the laboratory into industrial applications, driven by advances in "brain" technology, progress on scenario-specific challenges, and rigid demand in specific sectors [1]

Group 1: Evolution and Breakthrough of the Robot "Brain"
- The core competitiveness of embodied intelligent robots lies in the maturity of their "brain" systems, which directly determines their perception, decision-making, and execution capabilities in complex environments [2]
- Recent advances have moved robot intelligence from single-modal processing to multi-modal integration, forming a complete technology chain from foundation models to end applications [2][4]
- The emergence of vision-language models (VLMs) has significantly enhanced robots' perception, allowing them to understand and interact with their environments more effectively [4]
- The latest vision-language-action (VLA) models integrate motion control into the intelligent system, closing the loop from perception to action and improving operational precision and safety in human-robot collaboration [4][5]

Group 2: From Technical Bottlenecks to Scene Implementation
- The industrialization of general-purpose robots has been held back by three main bottlenecks: a lack of real-robot data, slow model inference, and complex motion control [6]
- Focusing on vertical fields offers a new path around these bottlenecks and eases the transition of robots from labs to large-scale applications [6]
- Establishing a "data flywheel" mechanism is crucial for accumulating the necessary 3D spatial and physical interaction data, so that robots improve their performance through iterative deployment [6][9]
- Recent advances have shortened deployment cycles from 18 months to 6 months and cut deployment costs by 50%, while task success rates have risen from 60% to over 90% [9]

Group 3: Key Application Scenarios
- The report identifies three key sectors for embodied intelligent robots: apparel, healthcare, and logistics. All three are at a pivotal shift from technology validation to large-scale implementation [11]
- In the apparel industry, automation bottlenecks have historically limited upgrades, but recent technological breakthroughs are expected to raise the automation rate in sewing from 5% to 50% within 3-5 years [11][13]
- The healthcare sector faces a significant shortage of caregivers. Robots are being developed to assist in patient care, and government policies support trials of intelligent elderly care robots [13][14]
- The logistics industry is focusing on automating the last mile of operations, with embodied intelligent robots targeting picking and sorting, tasks that still rely heavily on human labor [14][16]

Group 4: Future Industry Ecosystem and Investment Opportunities
- Investment opportunities are emerging in the intelligent robotics sector, particularly in small, precise models built for specific applications and in intelligent sewing equipment for the apparel industry [16][17]
- The healthcare robotics field is characterized by multiple technological pathways, with companies exploring various applications in rehabilitation and elderly care [17]
- In logistics, the focus is on automated system integration, with companies developing comprehensive solutions that improve efficiency in material handling and sorting [19]
- The long-term significance of embodied intelligent robots lies in their potential to redefine production and service paradigms, leading to a new phase of productivity growth in manufacturing and service industries [19]