New from CMU: Cross-Embodiment World Models Boost Few-Shot Robot Learning
具身智能之心 · 2025-08-12 00:03
Core Viewpoint - The article describes a new approach to training visuomotor policies for robots that draws on existing low-cost data sources, which significantly reduces the need for expensive real-world data collection [2][11].

Group 1: Methodology - The proposed method rests on two key insights:
1. Embodiment-agnostic world model pretraining: optical flow serves as the action representation, so the world model can be trained on cross-embodiment datasets and then fine-tuned with a small amount of data from the target embodiment [3][12].
2. Latent Policy Steering (LPS): the method improves the policy's outputs by searching for better action sequences in the latent space of the world model [3][12].

Group 2: Experimental Results - In real-world experiments, combining the policy with a world model pretrained on existing datasets improved performance significantly: over 50% relative improvement with 30 demonstrations, and over 20% relative improvement with 50 demonstrations [3][9].

Group 3: Challenges and Solutions - The article highlights the challenges that embodiment gaps pose when pretraining models across different robots, and emphasizes that world models are more suitable for cross-embodiment pretraining and for fine-tuning to new embodiments [11][12].
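
To make the Latent Policy Steering idea from Group 1 more concrete, the following is a minimal sketch of what "searching for better action sequences in the latent space of the world model" could look like. It is not the authors' implementation. The names `steer`, `rollout_latent`, `encode`, `step`, and `score` are hypothetical, and the toy encoder, additive latent dynamics, and distance-to-goal score are stand-ins chosen only so the example runs. The article does not say how candidates are generated or scored, so the sketch assumes the policy proposes several candidate action sequences and a scalar score ranks the resulting latent states.

```python
import numpy as np

def rollout_latent(z0, plan, step):
    """Roll a candidate action sequence forward in latent space."""
    z = z0
    for action in plan:
        z = step(z, action)  # hypothetical latent dynamics of the world model
    return z

def steer(obs, candidates, encode, step, score):
    """Return the index of the candidate plan whose latent rollout scores highest."""
    z0 = encode(obs)  # hypothetical encoder: observation -> latent state
    scores = [score(rollout_latent(z0, plan, step)) for plan in candidates]
    return int(np.argmax(scores)), scores

# Toy stand-ins (assumptions for illustration, not from the article).
goal = np.array([1.0, 1.0])
encode = lambda obs: np.asarray(obs, dtype=float)    # identity "encoder"
step = lambda z, a: z + a                            # additive latent dynamics
score = lambda z: -float(np.linalg.norm(z - goal))   # closer to goal is better

# Three candidate action sequences, each 4 steps of 2-D actions,
# standing in for samples drawn from a policy.
candidates = [
    np.zeros((4, 2)),         # stays put
    np.full((4, 2), 0.25),    # moves toward the goal
    np.full((4, 2), -0.25),   # moves away from the goal
]

best, scores = steer(np.zeros(2), candidates, encode, step, score)
print("selected candidate:", best)
```

In this toy setup the second candidate is selected because its rollout ends at the goal. In a real system the encoder and dynamics would come from the pretrained world model, and the candidates from the visuomotor policy.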