Robot training data

Earn 1,000 yuan an hour doing household chores: a new human job in the era of embodied intelligence
量子位 (QbitAI) · 2025-10-24 03:53
Core Insights
- The article discusses the growing use of household chore videos as high-value training data for humanoid robots, with companies such as Encord, Micro1, and Scale AI actively purchasing this content [7][10][19].

Industry Overview
- The robotics sector is attracting significant investment, with venture capital in the field reaching $12.1 billion this year alone [10].
- The industry faces a notable data scarcity problem: robots require real-world training data, which is not readily available in the way internet datasets are for language models [11].

Data Sources
- Training data for robots comes from two main sources: real-world data and synthetic data [12].
- Real-world data can be collected with precise equipment used to remotely control robots, capturing detailed physical interactions [12][14].
- Synthetic data is generated in virtual environments, which allows many action variations to be created at lower cost [16].

Data Processing Strategies
- Companies are combining real and synthetic data to address the scarcity of quality training data, pairing a small amount of real-world data with large volumes of synthetic data [18].
- Encord has reported a fourfold increase in data processing this year compared to last year, with compensation for videos of high-skill tasks reaching $150 per hour [19].

Market Demand
- Demand for training data is coming from companies such as Physical Intelligence and Boston Dynamics [22].
- Some startups are even advertising for users to film household chores for as little as $10 to $20 per hour [23].

Data Availability Challenges
- Despite efforts from various companies, high-quality training data remains scarce: the largest available datasets amount to only about 5,000 hours, which is insufficient for training needs [26].

