500万次围观,1X把「世界模型」真正用在了机器人NEO身上
机器之心·2026-01-14 01:39

Core Viewpoint - The article discusses the advancements in the home humanoid robot NEO, particularly the introduction of its new brain, the 1X World Model, which enables NEO to learn and perform tasks more autonomously by understanding the physical world through video training [3][4][11]. Group 1: Technological Advancements - NEO has evolved from merely executing pre-programmed actions to being able to "imagine" tasks by generating a video of successful task completion in its mind before executing it [4][6]. - The 1X World Model (1XWM) integrates video pre-training to allow NEO to generalize across new objects, movements, and tasks without extensive prior training data [11][21]. - The model is built on a 14 billion parameter generative video model, which has undergone a multi-stage training process to adapt to NEO's physical characteristics [16][18]. Group 2: Training and Evaluation - The training process includes using 900 hours of first-person human video data to align the model with human-like operational behaviors, followed by fine-tuning with 70 hours of robot data [18][19]. - The evaluation of 1XWM's capabilities shows that it can perform tasks it has never encountered before, with generated videos closely matching real-world execution [24][30]. - The importance of high-quality subtitles and first-person data in improving video generation quality and task success rates is emphasized, indicating that detailed descriptions enhance the model's performance [39][40]. Group 3: Practical Applications - NEO has been tested on various tasks, including those requiring complex interactions and coordination, demonstrating its ability to adapt and learn from video pre-training [28][30]. - The model's performance in both in-distribution and out-of-distribution tasks shows a stable success rate, although some fine manipulation tasks remain challenging [30][32]. - The article suggests that the quality of generated videos can be linked to task success rates, allowing for potential improvements in video generation through iterative testing and selection processes [32][39].

500万次围观,1X把「世界模型」真正用在了机器人NEO身上 - Reportify