A World Model This Impressive, and It's Actually Open Source!
QbitAI (量子位) · 2026-01-29 08:27

Core Viewpoint
- Ant Group's LingBot-World is a fully open-source world model that combines memory, interactivity, and continuity. It represents a significant advance in embodied intelligence and has drawn considerable attention online [12][30].

Group 1: LingBot-World Features
- LingBot-World supports continuous generation and interaction for up to 10 minutes, with visual quality comparable to DeepMind's Genie 3 but over a longer time span [3][11].
- Users can control the viewpoint in real time with keyboard and mouse, much like playing a AAA game, while an agent can autonomously plan and execute actions within the generated world [5][6].
- The model maintains high consistency and memory: it keeps inferring how objects behave even when they are out of view, in line with real-world physical laws [9][10][11].

Group 2: Technical Innovations
- LingBot-World was built on a mixed data engine that uses both real-world videos and synthetic data from Unreal Engine to teach the model causal relationships [16][17].
- The model follows a three-stage evolution strategy: pre-training for video generation, then training to understand physical laws, and finally integrating interactive data to strengthen memory [21][24].
- A novel causal attention mechanism and few-step distillation reduce inference time to under one second, giving real-time playability at 16 frames per second [26].

Group 3: Strategic Implications
- The release of LingBot-World, alongside LingBot-Depth and LingBot-VLA, signals Ant Group's strategic focus on building a comprehensive infrastructure for embodied intelligence [30][32].
- Combining perception (LingBot-Depth), decision-making (LingBot-VLA), and simulation (LingBot-World) creates a closed-loop system that enhances the capabilities of robots in virtual environments [41][42].
- The open-source approach aims to provide reusable, standardized infrastructure for industries including gaming, AIGC, and autonomous driving, which suggests room for future expansion [43].
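
As a rough illustration of the causal attention mentioned under Technical Innovations, the sketch below builds a generic frame-level causal mask: tokens in a frame may attend to their own frame and to earlier frames, never to later ones. This is a textbook construction, not LingBot-World's actual mechanism, which the summary does not describe. The names `frame_causal_mask` and `masked_attention` and all sizes are hypothetical.

```python
import math

def frame_causal_mask(num_frames, tokens_per_frame):
    """mask[q][k] is True when query token q may attend to key token k."""
    # Frame index of every token in the flattened (frame-major) sequence.
    frame_ids = [f for f in range(num_frames) for _ in range(tokens_per_frame)]
    # A query sees keys from its own frame or any earlier frame, never later ones.
    return [[q_frame >= k_frame for k_frame in frame_ids] for q_frame in frame_ids]

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention over lists of float vectors, restricted by mask."""
    dim = len(q[0])
    outputs = []
    for i, q_vec in enumerate(q):
        # Keys this query is allowed to see (always includes its own frame).
        allowed = [j for j in range(len(k)) if mask[i][j]]
        scores = [
            sum(a * b for a, b in zip(q_vec, k[j])) / math.sqrt(dim)
            for j in allowed
        ]
        # Numerically stable softmax over the allowed keys only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[j][d] for w, j in zip(weights, allowed)) / total
            for d in range(len(v[0]))
        ])
    return outputs
```

Because earlier frames never depend on later ones under such a mask, a generator can emit frames one at a time and reuse what it computed for past frames, which is the general property that makes streaming, interactive generation practical.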
