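Many of the problems you hit when training embodied models are already decided at data collection | A talk by Ding Yan, Co-CTO of Luming Robotics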
QbitAI (量子位) · 2026-01-08 12:08
Core Viewpoint
- The article stresses that data quality is critical in embodied intelligence: many problems originate at the data generation stage rather than in training itself [1][7][30].

Group 1: UMI Overview
- Universal Manipulation Interface (UMI) is a framework proposed by Stanford in February 2024. It decouples human manipulation behavior from any particular robot body and packages "operational intent + motion trajectory + multimodal perception" into a universal interface that different robots can use [5][8].
- UMI has gained traction since September 2023, with companies such as Luming Robotics leading the way in this field [6][8].

Group 2: Data Collection Challenges
- Data collection for training is extremely expensive. Estimates put it at $100-200 per hour in the U.S., and training a model comparable to GPT-3 requires vast amounts of data (Generalist's GEN 0, for example, used 270,000 hours), which could cost hundreds of billions of dollars [19][21].
- Collection efficiency is low: teleoperation yields only about 35 data points per hour. Because each robot has its own design, the collected data also ends up in silos [21][22].

Group 3: FastUMI Pro Product
- Luming Robotics has developed FastUMI Pro, a data collection device that weighs just over 600 grams yet can handle objects of 2-3 kg, making it suitable for both industrial and domestic applications [10][12].
- FastUMI Pro supports multimodal inputs, including tactile, auditory, and six-dimensional force data, and offers a spatial precision of 1 mm, which the company claims is the highest in the world [11][12].

Group 4: Data Quality and Training Issues
- The article challenges the misconception that UMI data collection is simple: high-quality data must meet strict alignment and synchronization criteria across multiple sensors [34][39].
- Many UMI devices fail to produce usable data because their hardware is inadequate, which results in poor image quality and frame rate problems that disrupt learning [43][46].
- The article distinguishes "dirty data" from "waste data": waste data is unstructured and collected without deliberate design, which makes it unsuitable for training models [50][59].

Group 5: Systemic Approach to UMI
- The article argues that UMI requires a systemic approach in which hardware, data, and algorithms are interdependent; a failure in any one of them can prevent a model from being trained successfully [63][65].
- Luming Robotics aims to break the "impossible triangle" of acquiring high-quality data at low cost, in order to accelerate the development of the embodied intelligence industry [68].
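
To make the synchronization and frame-rate criteria from Group 4 concrete, here is a minimal Python sketch of the kind of check an episode might have to pass before it is kept for training. It is an illustration under stated assumptions, not Luming Robotics' pipeline: the function names, the 1.5x frame-gap tolerance, and the 5 ms skew limit are all hypothetical, and timestamps are assumed to be sorted and expressed in seconds.

```python
from bisect import bisect_left


def dropped_frame_gaps(timestamps, nominal_fps, tolerance=1.5):
    """Return indices i where the gap between frame i and frame i+1
    exceeds `tolerance` times the nominal frame period (hypothetical rule)."""
    period = 1.0 / nominal_fps
    return [
        i
        for i in range(len(timestamps) - 1)
        if timestamps[i + 1] - timestamps[i] > tolerance * period
    ]


def max_skew(reference, other):
    """Largest distance from any reference timestamp to its nearest
    neighbour in `other`. Both lists must be sorted and non-empty."""
    if not reference or not other:
        raise ValueError("both streams need at least one timestamp")
    worst = 0.0
    for t in reference:
        j = bisect_left(other, t)
        # The nearest neighbour is either just before or just after position j.
        candidates = other[max(j - 1, 0): j + 1]
        worst = max(worst, min(abs(t - c) for c in candidates))
    return worst


def episode_is_usable(streams, nominal_fps, max_skew_s=0.005):
    """Accept an episode only if no stream drops frames and every stream
    stays within `max_skew_s` seconds of the first (reference) stream."""
    names = list(streams)
    for name in names:
        if dropped_frame_gaps(streams[name], nominal_fps[name]):
            return False
    reference = streams[names[0]]
    return all(max_skew(reference, streams[n]) <= max_skew_s for n in names[1:])


# Synthetic example: a 30 fps camera and a force sensor offset by 2 ms.
camera = [i / 30 for i in range(10)]
force = [t + 0.002 for t in camera]
fps = {"camera": 30, "force": 30}

print(episode_is_usable({"camera": camera, "force": force}, fps))

# Same episode with one camera frame missing.
camera_with_drop = [t for i, t in enumerate(camera) if i != 5]
print(episode_is_usable({"camera": camera_with_drop, "force": force}, fps))
```

Running the check per episode at collection time, rather than after training has failed, matches the article's point that unusable data should be caught where it is generated. The thresholds would need to be set from the actual sensors' specifications.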