Latest results from Gao Yang's team at 千寻智能: a vision-only VLA approach learns strong spatial generalization from limited data
机器人大讲堂· 2025-10-04 04:05
Core Viewpoint - The article discusses the introduction of a new strategy called "State-free Policy" in visuomotor control for robots, which enhances spatial generalization capabilities by removing state information from the input, relying solely on visual observations [1][10][24].

Group 1: State-free Policy Overview
- The State-free Policy allows robots to perform tasks effectively even when the training data is strictly controlled, such as fixed desktop height and object positions [3][10].
- This policy is based on two key conditions: representing actions in a relative end-effector space and ensuring comprehensive visual input for task observation [11][13].

Group 2: Experimental Results
- Extensive experiments demonstrated that State-free Policy significantly improves spatial generalization, achieving a success rate of 0.98 in height generalization and 0.58 in horizontal generalization for the pen insertion task [14][24].
- In challenging tasks like folding clothes and fetching bottles, State-free Policy outperformed state-based models, showcasing superior horizontal generalization capabilities [17][20].

Group 3: Advantages of State-free Policy
- State-free Policy exhibits higher data utilization efficiency, maintaining performance even with limited data, unlike state-based strategies that tend to overfit [20][21].
- The policy also shows advantages in cross-platform adaptation, requiring less adjustment compared to state-based strategies, leading to faster convergence and higher success rates [21][22].

Group 4: Sensor Design Considerations
- The research suggests reevaluating sensor designs, particularly the necessity of overhead cameras, as they may introduce performance issues due to changes in object positions [22][23].
- The findings indicate that using dual wide-angle wrist cameras can provide sufficient task observation without the overhead camera, maintaining high success rates in various scenarios [23].
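
To make the first of the two conditions concrete, here is a minimal sketch of one common way to express an action in a relative end-effector space: the next pose written in the current end-effector frame. The function names, the 4x4 homogeneous-matrix representation, and the example numbers are all assumptions for illustration; the article does not say how the paper parameterizes its actions. The point of the sketch is that a relative action does not change when the whole scene is shifted, for example when the table is raised.

```python
import numpy as np


def make_pose(yaw_rad, translation):
    """Build a 4x4 homogeneous pose: rotation about z by yaw_rad plus a translation.

    Hypothetical helper for this example only.
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    pose = np.eye(4)
    pose[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    pose[:3, 3] = translation
    return pose


def relative_action(current_pose, target_pose):
    """Express the target pose in the current end-effector frame."""
    return np.linalg.inv(current_pose) @ target_pose


def apply_relative_action(current_pose, action):
    """Recover the target pose (in the base frame) from a relative action."""
    return current_pose @ action


# Illustrative poses in metres; the 0.80 m height echoes the standard table height.
current = make_pose(0.3, [0.40, 0.10, 0.80])
target = make_pose(0.5, [0.42, 0.12, 0.78])
action = relative_action(current, target)

# Shift the whole scene: 20 cm sideways and 10 cm up (assumed values).
shift = make_pose(0.0, [0.0, 0.20, 0.10])
shifted_action = relative_action(shift @ current, shift @ target)

# The relative action is identical before and after the shift.
print(np.allclose(action, shifted_action))
```

Because the absolute pose cancels out, a policy that predicts such actions from images alone has no direct input from which to memorize the absolute positions seen in training, which is consistent with the generalization behaviour the digest describes.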
Latest results from Gao Yang's team at 千寻智能: a vision-only VLA approach learns strong spatial generalization from limited data
机器之心· 2025-09-29 02:52
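Think back to learning to drive. On the training ground we may drill specific actions over and over: brake at this spot, turn the wheel at that point. Over time these actions harden into "conditioned memory", and once the environment changes we are easily thrown off. Researchers at 千寻智能 recently noticed a similar phenomenon in imitation-learning-based visuomotor policies and examine it in depth in the paper "Do You Need Proprioceptive States in Visuomotor Policies?".

Paper: https://arxiv.org/abs/2509.18644
Project page: https://statefreepolicy.github.io

In the paper, the researchers propose a policy called State-free Policy. Compared with a State-based Policy, it lets the robot show strong spatial generalization even when the table height, the robot position, and the target objects are all strictly fixed in the training data. For example:

In the pen-picking task, the policy generalizes across table heights (the standard table height is 80 cm).
In the clothes-folding task, the robot still completes the task well even when the arm is far from its standard position.
When a whole-body robot fetches a drink from a refrigerator, it adapts even if the refrigerator has been moved.

In fact ...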