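Model behind the bed-making robot at WRC revealed! An end-to-end, dual-system, whole-body intelligent VLA that picks up tasks with only a little fine-tuning
量子位 (QbitAI) · 2025-08-11 10:12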

Core Insights
- The article covers the launch of the G0 model by Galaxea (星海图), which lets robots perform tasks such as bed-making autonomously and continuously after only a small amount of fine-tuning [1][3]
- G0 combines a large-scale open-world dataset with a dual-system architecture to make the robot more adaptable in unstructured environments [3][4]

Dataset Overview
- The Galaxea Open-World Dataset is described as the first high-quality dataset collected in real human environments, covering 50 types of settings and totaling 500 hours of mobile manipulation data [7][9]
- The dataset includes over 150 tasks, 1600+ objects, and 58 skills, with an emphasis on diversity and complexity in the task distribution [10][12]
- All data was collected on a single unified robot platform, so the action space and perception inputs are consistent across tasks and scenarios [8][9]

Model Architecture
- G0 uses a dual-system structure that separates high-level reasoning from low-level action execution, implemented as the G0-VLM and G0-VLA modules respectively [18][19]
- G0-VLM takes natural-language task instructions and breaks them down into executable subtasks, while G0-VLA executes actions at high frequency [19]

Training Strategy
- Training is divided into three phases: cross-platform pre-training, single-platform pre-training, and task-specific post-training [21][23]
- This phased approach improves the model's performance and its ability to acquire complex skills, particularly in long-horizon tasks [21][23]

Performance Evaluation
- G0 outperformed the π0 baseline model across a range of tasks, with clear advantages in object manipulation and in following language instructions [22][30]
- The model achieved an accuracy improvement of over 50% relative to baseline models, particularly after specialized training [32]

Conclusion
- Together, the Galaxea Open-World Dataset and the G0 dual-system VLA model provide a scalable, high-fidelity approach to training and deploying embodied intelligence [33]
- Open-sourcing the data and models is intended to bridge technological gaps and accelerate the move of embodied intelligence from laboratory innovation to real-world applications [33]
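
The dual-system split described under Model Architecture can be illustrated with a toy control loop. This is only a minimal sketch, not Galaxea's implementation: the class names `Planner` and `Controller`, the `toy_decompose` rule, the subtask strings, and the fixed `steps_per_subtask` count are all hypothetical stand-ins. In the real system, G0-VLM and G0-VLA are learned models, and the low-level module runs until the subtask is complete rather than for a fixed number of steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Planner:
    """Stand-in for the slow, high-level module (the role G0-VLM plays)."""
    decompose: Callable[[str], List[str]]

    def plan(self, instruction: str) -> List[str]:
        # Turn one natural-language instruction into ordered subtasks.
        return self.decompose(instruction)


@dataclass
class Controller:
    """Stand-in for the fast, low-level module (the role G0-VLA plays)."""
    steps_per_subtask: int = 3  # hypothetical fixed step count per subtask
    log: List[str] = field(default_factory=list)

    def execute(self, subtask: str) -> None:
        # Emit several low-level "actions" for every high-level subtask.
        for step in range(self.steps_per_subtask):
            self.log.append(f"{subtask}#{step}")


def run(instruction: str, planner: Planner, controller: Controller) -> List[str]:
    """Plan once at the high level, then act many times at the low level."""
    for subtask in planner.plan(instruction):
        controller.execute(subtask)
    return controller.log


def toy_decompose(instruction: str) -> List[str]:
    # Hypothetical rule-based decomposition; a real planner would be a VLM.
    if instruction == "make the bed":
        return ["pull sheet flat", "place pillow"]
    return [instruction]


if __name__ == "__main__":
    actions = run("make the bed", Planner(toy_decompose), Controller(steps_per_subtask=2))
    for action in actions:
        print(action)
```

The point of the separation is that the planner is called once per instruction while the controller runs many steps per subtask, which mirrors the article's contrast between subtask decomposition and high-frequency action execution.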