Embodied Data
A Deep Dive into the Data Routes of Embodied Intelligence: The "Four Little Dragons" Landscape Has Taken Shape...
具身智能之心· 2025-12-24 10:04
Core Viewpoint
- The development of embodied intelligence over the past 25 years has focused on a closed loop of data collection, model training, data scaling, and model optimization, and data remains a key focus for future advances [1][5].

Group 1: Data Routes
- The industry is not selecting a single optimal solution. It is progressing along four distinct data routes simultaneously, each addressing different constraints and stages [3].
- The four data routes have produced a competitive landscape termed the "Four Little Dragons of Embodied Data," with key players including Zhiyuan, Galaxy, Tashi, and Luming [4][34].

Group 2: Data Route Descriptions
- **Real-Robot Teleoperation**: This route provides the most authentic data but is also the most expensive and the slowest. It requires real robots and specialized operators, which makes it difficult to scale [8][12][14].
- **Simulation Data**: This route offers high efficiency and scalability, but the domain gap between simulation and reality limits its effectiveness in real-world applications [16][18][20].
- **Human Video**: This route is cost-effective and covers a wide range of scenarios, but it lacks critical feedback mechanisms and is not a primary data source for initial capabilities [22][25].
- **UMI Data**: This route decouples real interaction data from specific robots, allowing more versatile and scalable data collection, and is thus becoming foundational infrastructure for embodied data [27][30][31].

Group 3: Industry Practices
- On the real-robot teleoperation route, Tesla is advancing its teleoperation system, while Zhiyuan Robotics is deepening its focus on real robot bodies and task loops [35].
- On the simulation data route, Galaxy General is expanding the scale of its synthetic data through computational power and simulation engines [35].
- On the human video route, Tashi is developing large-scale human behavior video datasets to enhance semantic coverage [35].
- The UMI route is represented by Luming Robotics, which has made significant strides in scaling and engineering UMI data collection systems [35][39].

Group 4: Future Implications
- As the industry moves from proving feasibility to continuous evolution, the ability to consistently produce high-quality real data will become increasingly critical [37].
- The four data routes are not mutually exclusive. Each plays a distinct role in the overall ecosystem, and together they give embodied intelligence a clearer path forward [38][40].
- Accumulated time matters, particularly for the UMI route, which relies heavily on early choices and sustained investment [41][42].
- The current "Four Little Dragons" landscape is a structural description of the industry. Future success depends on which routes and teams can maintain operational continuity and data advantages [44][45].
Roundtable Forum: How Will Embodied Data Shape the Industry's Future? | GAIR 2025
雷峰网· 2025-12-21 03:05
Core Viewpoint
- The article discusses the current state and future potential of embodied intelligence, focusing on the challenges and opportunities in data collection methods and on the maturity of the industry [2][25].

Data Quality and Collection
- High-quality data is becoming a bottleneck for breakthroughs in embodied intelligence performance and cost control [2].
- The roundtable forum at the GAIR conference emphasized data quality. Participants agreed that the effectiveness of models and the benefit to robots depend on the quality of the data collected [2][4].
- Different data collection methods, such as UMI (Universal Manipulation Interface), teleoperation, motion capture, and simulation data, are being explored, with a focus on adapting to various scenarios and hardware [3][4].

Industry Maturity and Challenges
- The data collection industry is still in its early stages, with data companies, embodiment (robot hardware) companies, and model companies still aligning their needs and capabilities [4][7].
- The industry lacks a unified approach, and demands vary across model companies. Data companies therefore need to understand models and offer suggestions in order to improve collaboration [7][8].
- The focus among data collection methods is shifting, with a notable rise in interest in teleoperation and UMI, influenced in particular by developments in North America [9][10].

In-the-Wild Data Collection
- In-the-wild data collection is seen as a challenging yet promising approach that requires advanced technical capabilities and effective management of hardware and software [3][21].
- The article highlights the need for low-friction, high-precision, multi-modal data collection devices to make effective use of in-the-wild data [3][21].
- In-the-wild data collection is still maturing. Current efforts focus primarily on improving data collection technology before addressing human resource management [21][22].

Government Support and Industry Dynamics
- Government support for data collection factories is prevalent in China, which may influence the direction of data collection methods and industry growth [10][17].
- The article suggests that government-backed data collection initiatives can stimulate the industry but may not always align with the most effective technological advancements [17][18].
- The cost structure of data collection is critical. Equipment depreciation and labor account for significant portions of the cost, which indicates a need for strategic investment in data collection methods [19][20].

Future Outlook
- The industry is expected to evolve, potentially shifting toward more diverse data collection methods as companies adapt to changing demands and technological advancements [18][19].
- The article expresses skepticism about the current maturity of the embodied intelligence industry compared with other tech sectors, suggesting that significant challenges remain before widespread adoption can occur [25][26].
- Companies are encouraged to collaborate and share insights to enhance data collection processes and improve overall industry knowledge [28][30].