Must Intelligence Be Built on World Models? We Talked with Ant Group's Robbyant Team
机器之心·2026-02-05 04:35

Core Viewpoint - The article discusses the transition from large language models (LLMs) to a new era of physical AI, emphasizing that AI needs to understand and interact with the real world rather than only process language [1][2].

Group 1: Physical AI Development
- Yann LeCun argues that true intelligence requires the ability to predict and plan, which current LLMs lack [2].
- Ant Group's Robbyant has made significant strides in physical AI, releasing four embodied intelligence models in a short span and showcasing a distinctive approach to AI development [2][5].
- The company aims to build intelligence from physical interaction, moving beyond the digital realm [3][4].

Group 2: Technological Approach
- Ant Group's strategy is to train models on real-world data and internet data, rejecting the prevalent "Sim-to-Real" approach in favor of learning directly from real-world interactions [7][9].
- The LingBot-VLA model, trained on over 20,000 hours of high-quality real-robot data, has surpassed several international benchmarks, indicating a significant advance in robotics technology [9].
- The LingBot-VA model represents a breakthrough in general robot control, using a causal video-action world model to predict and act in real environments [10][12].

Group 3: Future Aspirations and Ecosystem
- Ant Group envisions an open-source ecosystem for robotics, akin to an "Android system" for robots, and emphasizes collaboration with data providers to increase the diversity of training data [18][19].
- The company is focused on providing efficient post-training tools that help hardware manufacturers adapt their robots to the intelligence developed by Robbyant [19].
- Ant Group's long-term goal is to integrate embodied intelligence into various service sectors, leveraging its strength in connecting people with services [22][24].