机器人学习现状!Physical Intelligence内部员工分享(从数采到VLA再到RL)
具身智能之心·2025-12-20 16:03

Core Insights - The article discusses the current state of robot learning as of December 2025, emphasizing that most systems rely on behavior cloning (BC) and the challenges associated with it [5][40][39] - It highlights the importance of data collection from human demonstrations and the limitations of existing methods in achieving robust performance in real-world applications [6][10][12] Group 1: Behavior Cloning and Its Challenges - As of December 2025, all robot learning systems primarily utilize behavior cloning, where human demonstrations are used to train models to mimic actions [5] - The data for behavior cloning comes from human demonstrations and various other sources, but the need for extensive data collection poses significant challenges [7][10] - The limitations of behavior cloning include the inability to generalize well to out-of-distribution (OOD) states, leading to performance degradation in real-world scenarios [16][23][40] Group 2: Data Collection Methods - Data collection methods include using human operators with smart demo gloves and video platforms to gather diverse task execution data [11][13] - The challenges in data collection include ensuring the data is representative of the tasks and the need for extensive training for operators to provide usable data [9][10] - The article emphasizes the importance of high-quality data for training models and the difficulties in achieving this at scale [10][19] Group 3: Future Directions in Robot Learning - The article predicts that within two years, video model backbones will replace current VLA methods, and within ten years, world models will effectively simulate general open-world interactions [73] - It suggests that traditional simulation and game engines will serve as data generators for world models, emphasizing the continued importance of expert demonstration data [73] - The need for robust Q/V functions that can operate effectively in OOD states is highlighted as a critical area for future research [72]