Embodied Data
A Deep Dive into Embodied AI's Data Routes: The "Four Little Dragons" Landscape Has Taken Shape......
具身智能之心· 2025-12-24 10:04
Core Viewpoint
- The development of embodied intelligence over the past 25 years has centered on a closed loop of data collection, model training, data scaling, and model optimization, and data remains the key focus for future advances [1][5].

Group 1: Data Routes
- The industry is not converging on a single optimal solution. It is advancing along four distinct data routes at once, each addressing different constraints and stages [3].
- These four routes have produced a competitive landscape dubbed the "Four Little Dragons of Embodied Data," whose key players are Zhiyuan, Galaxy, Tashi, and Luming [4][34].

Group 2: Data Route Descriptions
- **Teleoperated Real Robots**: This route yields the most authentic data but is also the most expensive and slowest. It requires real robots and specialized operators, which makes it hard to scale [8][12][14].
- **Simulation Data**: This route offers high efficiency and scalability, but the domain gap between simulation and reality limits its effectiveness in real-world applications [16][18][20].
- **Human Video**: This route is cost-effective and covers a wide range of scenarios, but it lacks critical feedback mechanisms and is not a primary data source for building initial capabilities [22][25].
- **UMI Data**: This route decouples real interaction data from any specific robot, allowing more versatile and scalable data collection, and is therefore becoming foundational infrastructure for embodied data [27][30][31].

Group 3: Industry Practices
- On the teleoperated real-robot route, Tesla is advancing its remote operation system, while Zhiyuan Robotics is deepening its focus on real robot bodies and task loops [35].
- On the simulation data route, Galaxy General is expanding the scale of synthetic data through computing power and simulation engines [35].
- On the human video route, Tashi is building large-scale human behavior video datasets to broaden semantic coverage [35].
- The UMI route is represented by Luming Robotics, which has made significant strides in scaling and engineering UMI data collection systems [35][39].

Group 4: Future Implications
- As the industry moves from proving feasibility to continuous evolution, the ability to consistently produce high-quality real data will become increasingly critical [37].
- The four data routes are not mutually exclusive. Each plays a distinct role in the overall ecosystem, and together they point to a clearer path forward for embodied intelligence [38][40].
- Accumulated time matters, particularly for the UMI route, which depends heavily on early choices and sustained investment [41][42].
- The current "Four Little Dragons" landscape is a structural description of the industry. Future success depends on which routes and teams can maintain operational continuity and data advantages [44][45].
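Roundtable: How Will Embodied Data Shape the Industry's Future? | GAIR 2025
雷峰网· 2025-12-21 03:05
" With ten-billion-scale investment, is the embodied intelligence industry mature enough? "

Author: Liang Bingjian (梁丙鉴) | Editor: Ma Xiaoning (马晓宁)

High-quality data is becoming the bottleneck for both performance breakthroughs and cost control in embodied robot hardware. At the critical turning point where embodied intelligence moves from technical demos to deployment at scale, the demand for data, and the debate over it, is growing ever more heated. From teleoperation to UMI, from motion capture to simulation data: does the future of embodied data lie in data-collection factories, or in the attractive vision known as in-the-wild?

On December 13, 2025, the Data & "One Brain, Multiple Forms" (一脑多形) session of the 8th GAIR conference hosted a roundtable on embodied data. The moderator was Wang Jianming (王建明), ED at Innoangel Fund (英诺天使基金) and host of 石麻笔记. The panelists were Dai Ruoli (戴若犁), founder of Noitom Robotics (诺亦腾机器人); Tong Xianqiao (佟显乔), CEO of 极数迭代 and visiting researcher at Shenzhen AIRS; and Ding Yan (丁琰), CTO of Luming Robotics (鹿明机器人). Together they held an in-depth discussion on the quality and collection of embodied data and on the data flywheel.

Specifically, the collection stage calls for low-friction, high-precision, multimodal data-collection devices, while making use of in-the-wild data additionally requires techniques for recovering dense information from sparse raw data. Dai Ruoli sees one viable pipeline: use a world model for prior estimation, then output data with richer modalities and dimensions. By comparison, the time to compete on the ability to organize manpower is still far off.

Tong Xianqiao believes the data-collection industry is still at an early stage, with data, robot-hardware, and model companies still adapting to one another. Different model companies put forward different requirements, which means data companies cannot stay ...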