Unitree: Open-Sourcing a Robot World Model!
量子位· 2025-09-16 04:05
Core Viewpoint
- The article discusses the release of a new open-source model named UnifoLM-WMA-0, designed to enhance the interaction between robots and their environments through a world model that understands physical laws [1][9].

Group 1: Model Performance
- The model performs effectively on tasks such as stacking blocks, with its predictions closely matching the actual operations [3].
- It also handles more intricate tasks, such as organizing stationery, demonstrating its versatility [7].

Group 2: Model Features
- UnifoLM-WMA-0 is part of the UnifoLM series, specifically tailored for general robot learning and adaptable to various robotic platforms [9].
- The model's training code, inference code, and checkpoints have been fully open-sourced, and the repository quickly gained over 100 stars on GitHub [11].

Group 3: Training Strategy
- The training strategy fine-tuned a video generation model on the Open-X dataset to adapt its capabilities to real-world robotic tasks [15].
- The model operates under a dual-function architecture: a decision mode that predicts key information during physical interactions, and a simulation mode that generates realistic environmental feedback based on robot actions [20].

Group 4: Dataset Utilization
- Training utilized five open-source datasets provided by Unitree Technology, contributing to a comprehensive training process [22].
- The model excels as a simulation engine, capable of generating controlled interactions based on current scene images and future action commands [23].
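The dual-function architecture above can be sketched as a toy interface with the two modes the article names. This is a minimal illustrative sketch, not the actual UnifoLM-WMA-0 API; the class and method names (`WorldModelAction`, `decide`, `simulate`) and the string-based states are all assumptions made for clarity.

```python
# Toy sketch of a world-model-action loop with two modes, following the
# article's description. All names are illustrative assumptions, not the
# real UnifoLM-WMA-0 interface; states and actions are plain strings here.
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorldModelAction:
    """Hypothetical world model with a decision mode and a simulation mode."""
    history: List[str] = field(default_factory=list)

    def decide(self, observation: str, instruction: str) -> str:
        # Decision mode: predict key information about the upcoming physical
        # interaction and emit the next action for the task instruction.
        action = f"act({instruction!r} given {observation!r})"
        self.history.append(action)
        return action

    def simulate(self, observation: str, action: str) -> str:
        # Simulation mode: roll the environment forward under `action` and
        # return the predicted environmental feedback (next observation).
        next_obs = f"next_state({observation!r} after {action!r})"
        self.history.append(next_obs)
        return next_obs


wma = WorldModelAction()
act = wma.decide("scene0", "stack the blocks")
obs = wma.simulate("scene0", act)
```

In a real system the observations would be camera images and the simulator a video generation model; the point of the sketch is only the control flow: decision mode produces actions, simulation mode produces feedback.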
Unitree Open-Sources UnifoLM-WMA-0: A Cross-Embodiment World Model + Action Framework
具身智能之心· 2025-09-16 03:29
Core Insights
- The article discusses the launch of UnifoLM-WMA-0, an open-source world model-action architecture developed by Unitree, designed for general robot learning across diverse robot embodiments [2][7].

Group 1: Architecture Overview
- UnifoLM-WMA-0 integrates a world model that operates in two modes: a decision-making mode, which predicts future physical interactions to assist action generation, and a simulation mode, which generates high-fidelity environmental feedback based on robot actions [7].
- The architecture's core component is a world model that enables robots to understand physical interactions with their environment, serving as a simulation engine for generating synthetic data and enhancing decision-making performance [2][7].

Group 2: Model Training and Data
- The model was fine-tuned on the Open-X dataset to adapt its video generation capabilities to robotic operation scenarios, taking images and text instructions as inputs and generating videos of future interactions [11].
- UnifoLM-WMA-0 was also trained on five open-source datasets from Unitree, demonstrating interactive, controllable generation conditioned on current images and future robot actions [11][13].

Group 3: Available Resources
- The article links to the complete datasets and models on the official website, including UnifoLM-WMA-0 configurations fine-tuned for different tasks [13][14].
- Datasets specific to Unitree robots are also listed, showcasing the diversity of training scenarios available for the model [14].
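The "interactive, controllable generation" described above amounts to rolling the world model forward in simulation mode: given a current image and a sequence of future actions, it produces a synthetic trajectory. A minimal sketch of that rollout loop follows; `rollout`, `simulate`, and the string-valued observations are illustrative assumptions, not the real API.

```python
# Hypothetical synthetic-data rollout using a world model's simulation mode:
# starting from one observation, each action steps the predicted state forward.
# Names and types are assumptions for illustration, not the UnifoLM-WMA-0 API.
from typing import Callable, List


def rollout(simulate: Callable[[str, str], str],
            start_obs: str, actions: List[str]) -> List[str]:
    """Generate a trajectory of predicted observations, one per action."""
    trajectory = [start_obs]
    obs = start_obs
    for action in actions:
        obs = simulate(obs, action)  # world model predicts the next frame
        trajectory.append(obs)
    return trajectory


# Stand-in simulator: tags the state with each applied action.
toy_sim = lambda obs, act: f"{obs}->{act}"
traj = rollout(toy_sim, "img0", ["grasp", "lift", "place"])
```

Trajectories produced this way are what makes simulation mode useful as a data engine: the action sequence is chosen by the caller, so the generated interaction is controllable.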