Embodied Intelligence in a Data Bind: Who Will Break Through First?
机器之心·2025-08-10 01:30

Group 1
- The core issue in embodied intelligence is the severe shortage of real data, with most robotic models relying on less than 1% of real operational data, which limits their generalization capabilities in complex environments [5][6]
- There is a debate in the industry regarding the importance of real data versus synthetic simulation data, which affects the scalability and generalization of embodied intelligence [6][7]
- Some experts argue that while synthetic data has advantages in cost and scalability, it cannot fully replicate the complexities of the real world, leading to a "domain gap" that hinders model transferability [7][8]

Group 2
- The need for hundreds of billions of real data points is highlighted, with current datasets only reaching the million level, presenting a significant bottleneck for the development of embodied intelligence [8]
- The strategy of using synthetic data for initial training followed by fine-tuning with real data is seen as a key pathway for the cold start and scaling of embodied intelligence [8][9]
- Teleoperation is emerging as a primary method for acquiring real data, especially in the early stages of embodied intelligence, where human operators provide high-quality demonstration actions for training [9][10]