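SIASUN - Zhiyuan Robotics Releases the Industry's First Open-Source Robot World Model Platform; Real-World Tests Show It Can Make Sandwiches, Pour Tea, and Perform Other Tasks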

Core Insights
- The article covers Zhiyuan Robotics' launch of Genie Envisioner (GE), the first open-source robot world model platform. GE integrates future frame prediction, policy learning, and simulation evaluation into a closed-loop architecture centered on video generation [1][2]
- The platform aims to let robots reason and act end to end, from "seeing" to "thinking" to "acting", within a single world model [1]

Summary by Sections

Platform Features
- The GE platform consolidates data collection, model training, and policy evaluation into one closed-loop system, replacing the traditional segmented pipeline [1]
- The core component, GE-Base, has been trained on over one million data points to accurately interpret environmental layouts and action intentions [1]
- The GE-Act action decoder handles the critical transition from understanding to execution, while GE-Sim extends the generative capabilities of GE-Base into action-conditioned neural simulation [1]

Data Utilization
- The platform is built on approximately 3,000 hours of video of real robots in operation. It establishes a direct mapping from language instructions to visual space while preserving the spatiotemporal information of robot-environment interactions [1]

Real-World Applications
- In real-world tests, robots equipped with GE-Act completed tasks such as making sandwiches, pouring tea, and wiping tables [3]
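
The closed loop described under Platform Features (GE-Base predicts future frames, GE-Act turns them into actions, GE-Sim rolls the actions forward for evaluation) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the class names `ToyBase`, `ToyAct`, and `ToySim`, the use of short lists of floats in place of video frames, and the goal frame standing in for a language instruction are all hypothetical and do not reflect the actual GE code or API.

```python
from dataclasses import dataclass
from typing import List

Frame = List[float]   # toy stand-in for a video frame
Action = List[float]  # toy stand-in for a robot command


@dataclass
class ToyBase:
    """Hypothetical stand-in for GE-Base: predicts a few future frames."""
    horizon: int = 3

    def predict(self, frame: Frame, goal: Frame) -> List[Frame]:
        # Linearly interpolate from the current frame toward the goal frame.
        return [
            [f + (g - f) * (k / self.horizon) for f, g in zip(frame, goal)]
            for k in range(1, self.horizon + 1)
        ]


class ToyAct:
    """Hypothetical stand-in for GE-Act: decodes predicted frames into an action."""

    def decode(self, frame: Frame, future: List[Frame]) -> Action:
        # The action is the displacement toward the first predicted frame.
        return [n - f for f, n in zip(frame, future[0])]


class ToySim:
    """Hypothetical stand-in for GE-Sim: action-conditioned next-frame rollout."""

    def step(self, frame: Frame, action: Action) -> Frame:
        return [f + a for f, a in zip(frame, action)]


def closed_loop(start: Frame, goal: Frame, steps: int = 10) -> Frame:
    """Run predict -> decode -> simulate for a fixed number of steps."""
    base, act, sim = ToyBase(), ToyAct(), ToySim()
    frame = list(start)
    for _ in range(steps):
        future = base.predict(frame, goal)   # "seeing" and "thinking"
        action = act.decode(frame, future)   # "acting"
        frame = sim.step(frame, action)      # simulated outcome
    return frame


def evaluate(final: Frame, goal: Frame) -> float:
    """Largest per-component distance between the final frame and the goal."""
    return max(abs(f - g) for f, g in zip(final, goal))


if __name__ == "__main__":
    goal = [1.0, 2.0]
    final = closed_loop([0.0, 0.0], goal, steps=10)
    print(final, evaluate(final, goal))
```

In this toy setup each iteration closes one third of the remaining gap to the goal, so the evaluation error shrinks as the number of steps grows. The point is only the shape of the loop: prediction, action decoding, and simulated evaluation share one interface instead of living in separate pipeline stages.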