Unitree open-sources UnifoLM-WMA-0: a cross-embodiment world-model + action framework
具身智能之心·2025-09-16 03:29

Core Insights
- The article covers the release of UnifoLM-WMA-0, an open-source world-model-action architecture from Unitree Robotics, designed for general robot learning across different robot embodiments [2][7].

Group 1: Architecture Overview
- UnifoLM-WMA-0 is built around a world model that operates in two modes: a decision-making mode, which predicts future physical interactions to assist action generation, and a simulation mode, which generates high-fidelity environment feedback from robot actions [7] (a minimal interface sketch follows the resources section below).
- This world model is the architecture's core component: it lets robots understand physical interactions with their environment, serves as a simulation engine for generating synthetic data, and improves decision-making performance [2][7].

Group 2: Model Training and Data
- The model was fine-tuned on the Open-X dataset to adapt its video-generation capability to robot manipulation scenarios, taking an image and a text instruction as input and generating a video of the future interaction [11].
- UnifoLM-WMA-0 was further trained on five Unitree open-source datasets, demonstrating interactive, controllable generation conditioned on the current image and a future robot action sequence [11][13].

Group 3: Available Resources
- The article links to the complete datasets and models on the official site, including UnifoLM-WMA-0 checkpoints fine-tuned for different tasks [13][14].
- Datasets collected on specific Unitree robots are also listed, showing the variety of training scenarios available for the model [14].
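To make the two operating modes concrete, here is a minimal Python sketch of how a decision-making-mode call (image + text instruction in, imagined future frames out, then an action chunk) and a simulation-mode call (image + future actions in, generated frames out) could be wired together. All class names, shapes, and placeholder implementations are assumptions for illustration; this is not the released UnifoLM-WMA-0 code or API.

```python
# Hypothetical sketch of the two world-model modes described above,
# NOT the actual UnifoLM-WMA-0 API. Names and shapes are assumptions.
from dataclasses import dataclass
import numpy as np


@dataclass
class Observation:
    image: np.ndarray    # current camera frame, (H, W, 3)
    instruction: str     # natural-language task description


class DualModeWorldModel:
    """Stand-in for the video world model at the core of the framework."""

    def predict_interaction(self, obs: Observation, horizon: int) -> np.ndarray:
        # Decision-making mode: imagine future interaction frames from the
        # current image and text instruction (placeholder: zero frames).
        h, w, c = obs.image.shape
        return np.zeros((horizon, h, w, c), dtype=obs.image.dtype)

    def simulate(self, obs: Observation, actions: np.ndarray) -> np.ndarray:
        # Simulation mode: roll out frames that follow a given future action
        # sequence (interactive, controllable generation; placeholder output).
        h, w, c = obs.image.shape
        return np.zeros((len(actions), h, w, c), dtype=obs.image.dtype)


class ActionHead:
    """Placeholder policy head that conditions on the imagined future."""

    def __call__(self, obs: Observation, imagined: np.ndarray,
                 action_dim: int = 7) -> np.ndarray:
        return np.zeros((len(imagined), action_dim))


def decision_step(wm: DualModeWorldModel, head: ActionHead,
                  obs: Observation) -> np.ndarray:
    """Decision-making mode: predict the near future, then generate actions."""
    imagined = wm.predict_interaction(obs, horizon=16)
    return head(obs, imagined)


def synthetic_rollout(wm: DualModeWorldModel, obs: Observation,
                      actions: np.ndarray) -> np.ndarray:
    """Simulation mode: use the world model as a synthetic-data engine."""
    return wm.simulate(obs, actions)


if __name__ == "__main__":
    obs = Observation(image=np.zeros((224, 224, 3), dtype=np.uint8),
                      instruction="pick up the red cube")
    wm, head = DualModeWorldModel(), ActionHead()
    actions = decision_step(wm, head, obs)        # (16, 7) action chunk
    frames = synthetic_rollout(wm, obs, actions)  # (16, 224, 224, 3) video
    print(actions.shape, frames.shape)
```

The sketch only fixes the data flow between the two modes; in the actual release the placeholders would correspond to the fine-tuned video world model and a learned action head rather than zero-filled outputs.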