From BAAI, the University of Sydney, and others! RoboGhost: text-to-motion control that drives humanoid robots as invisibly as a ghost
具身智能之心· 2025-10-27 00:02
Core Insights
- The article presents RoboGhost, a humanoid control system that eliminates motion retargeting and generates actions directly from language input [6][8][14].

Group 1: Research Pain Points
- Moving from 3D digital humans to humanoid robots is difficult because language-driven motion generation relies on cumbersome, unreliable multi-stage pipelines [6][7].
- Existing methods suffer from cumulative errors, high latency, and weak coupling between semantics and control, which motivates a more direct path from language to action [7].

Group 2: Technical Breakthrough
- RoboGhost proposes a retargeting-free approach that conditions the humanoid robot policy directly on language-driven motion latent representations, treating the task as a generative one rather than a simple mapping [8][10].
- The system uses a continuous autoregressive motion generator to keep long-horizon motion consistent while balancing stability and diversity in the generated motions [8][14].

Group 3: Methodology
- Training has two phases: motion generation and policy training. The former uses a continuous autoregressive architecture; the latter uses a mixture-of-experts (MoE) framework to improve generalization [11][13].
- Policy training incorporates a diffusion model that takes motion latent representations as conditions to guide the denoising process, so the policy outputs executable actions directly [11][14].

Group 4: Experimental Results
- Comprehensive experiments show that RoboGhost significantly improves motion generation quality and success rate while reducing deployment time and tracking error compared with baseline methods [14][15].
- The diffusion-based policy outperforms traditional multilayer perceptron (MLP) policies in tracking performance and robustness, even when tested on unseen motion subsets [18][19].
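
The latent-conditioned denoising idea under Group 3 can be illustrated with a small, self-contained sketch. The article gives no implementation details, so everything below is an assumption made for illustration: the function names, the linear noise schedule, the DDIM-style deterministic update, and above all the hand-written `toy_clean_action`, which stands in for a trained network. It is not RoboGhost's code. It only shows how a motion latent and a robot state can condition each denoising step so that pure noise is refined into an action.

```python
import math
import random


def toy_clean_action(latent, state):
    # Hypothetical stand-in for what a trained policy would have learned:
    # the "clean" action as a fixed function of motion latent and robot state.
    return [0.5 * z - 0.1 * s for z, s in zip(latent, state)]


def predict_noise(noisy_action, alpha_bar, latent, state):
    # Toy noise predictor eps(x_t | latent, state). A real system would use
    # a trained network; here the clean action is known in closed form, so
    # the noise can be recovered exactly from x_t = sqrt(ab)*x0 + sqrt(1-ab)*eps.
    target = toy_clean_action(latent, state)
    signal = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - signal * a) / noise_scale for x, a in zip(noisy_action, target)]


def alpha_bar_at(step, num_steps):
    # Assumed linear schedule: 1.0 (clean) at step 0,
    # about 0.01 (almost pure noise) at the final step.
    return 1.0 - 0.99 * step / num_steps


def sample_action(latent, state, num_steps=10, seed=0):
    # Start from Gaussian noise and denoise step by step,
    # conditioning every step on the motion latent and the robot state.
    rng = random.Random(seed)
    action = [rng.gauss(0.0, 1.0) for _ in latent]
    for step in range(num_steps, 0, -1):
        ab_t = alpha_bar_at(step, num_steps)
        ab_prev = alpha_bar_at(step - 1, num_steps)
        eps = predict_noise(action, ab_t, latent, state)
        # Estimate the clean action implied by the predicted noise.
        clean = [(x - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
                 for x, e in zip(action, eps)]
        # Deterministic (DDIM-style) move to the next, less noisy level.
        action = [math.sqrt(ab_prev) * c + math.sqrt(1.0 - ab_prev) * e
                  for c, e in zip(clean, eps)]
    return action
```

Calling `sample_action([1.0, -2.0], [0.0, 1.0])` returns values close to `toy_clean_action([1.0, -2.0], [0.0, 1.0])`, whatever the seed, because the toy predictor is exact. With a learned predictor the samples would vary between runs, which is where the diversity mentioned in the article would come from.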