Core Insights

- The article covers Zhiyuan Robotics' launch of Genie Envisioner (GE), described as the first open-source robot world model platform. GE integrates future-frame prediction, policy learning, and simulation-based evaluation into a closed-loop architecture centered on video generation [1][2]
- The platform aims to let robots reason and act end to end, from "seeing" to "thinking" to "acting", within a single world model [1]

Summary by Sections

Platform Features

- GE consolidates data collection, model training, and policy evaluation into one closed-loop system, replacing the traditional segmented pipeline [1]
- The core component, GE-Base, was trained on over one million data points so that it can accurately interpret environmental layouts and action intentions [1]
- The GE-Act action decoder handles the critical step from understanding to execution, while GE-Sim extends GE-Base's generative capabilities into action-conditioned neural simulation [1]

Data Utilization

- The platform is built on approximately 3,000 hours of real robot operation video. It establishes a direct mapping from language instructions to visual space while preserving the spatiotemporal information of robot-environment interactions [1]

Real-World Applications

- In real-world tests, robots equipped with GE-Act completed tasks such as making sandwiches, pouring tea, and wiping tables [3]
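Zhiyuan Robotics releases the industry's first open-source robot world model platform; in real-world tests it completes tasks such as making sandwiches and pouring tea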