WIYH dataset
A deep dive into embodied data routes: the "Four Little Dragons" landscape has taken shape......
具身智能之心· 2025-12-24 10:04
Core Viewpoint
- The development of embodied intelligence over the past 25 years has centered on a closed loop of data collection, model training, data scaling, and model optimization, and data remains a key focus for future progress [1][5].

Group 1: Data Routes
- The industry is not converging on a single optimal solution. It is advancing along four distinct data routes at once, each addressing different constraints and stages [3].
- These four routes have produced a competitive landscape termed the "Four Little Dragons of Embodied Data," whose key players are Zhiyuan, Galaxy, Tashi, and Luming [4][34].

Group 2: Data Route Descriptions
- **Teleoperated Real Robots**: This route yields the most authentic data but is also the slowest and most expensive. It requires real robots and trained operators, which makes it hard to scale [8][12][14].
- **Simulation Data**: This route is efficient and scalable, but the domain gap between simulation and reality limits its effectiveness in real-world applications [16][18][20].
- **Human Video**: This route is cost-effective and covers a wide range of scenarios, but it lacks critical feedback signals and is not a primary data source for building initial capabilities [22][25].
- **UMI (Universal Manipulation Interface) Data**: This route decouples real interaction data from any specific robot, allowing more versatile and scalable data collection and making it a foundational infrastructure for embodied data [27][30][31].

Group 3: Industry Practices
- On the teleoperated real-robot route, Tesla is advancing its teleoperation system, while Zhiyuan Robotics is deepening its focus on real robot bodies and closed task loops [35].
- On the simulation data route, Galaxy General is expanding the scale of its synthetic data through compute and simulation engines [35].
- On the human video route, Tashi is building large-scale human behavior video datasets to broaden semantic coverage [35].
- The UMI route is represented by Luming Robotics, which has made significant progress in scaling and engineering UMI data collection systems [35][39].

Group 4: Future Implications
- As the industry moves from proving feasibility to continuous evolution, the ability to consistently produce high-quality real data will become increasingly critical [37].
- The four data routes are not mutually exclusive. Each plays a distinct role in the overall ecosystem, and together they give embodied intelligence a clearer path forward [38][40].
- Accumulation over time matters, particularly for the UMI route, which depends heavily on early choices and sustained investment [41][42].
- The current "Four Little Dragons" landscape is a structural description of the industry. Future success will depend on which routes and teams can sustain operational continuity and data advantages [44][45].
The world's first real-world embodied multimodal dataset: Tashi Zhihang (它石智航) delivers, six months ahead of Tesla
量子位· 2025-10-10 11:24
Core Viewpoint
- The article covers the launch of WIYH (World In Your Hands), described as the world's first large-scale real-world embodied multimodal dataset, by Tashi Zhihang (它石智航). The dataset integrates vision, language, tactile, and action data for human-centric applications [1][3][5].

Group 1: Dataset Features
- WIYH is the first human-centric dataset of its kind. It includes over 100,000 real human operation videos covering more than 40 task types and over 100 human skills, recorded with more than 13 types of sensors and involving more than 520 objects [3][9].
- Each data entry carries six types of annotations aligned with the synchronized multimodal data [4].
- The dataset emphasizes real-world scenarios, capturing standard human operating procedures in various industries, such as hotel laundry and supermarket assembly [9][10][11].

Group 2: Technical Innovations
- WIYH represents two major breakthroughs: a focus on real-world scenarios and support for large-scale multimodal data integration. Together these give robots a solid data foundation for learning complex actions and generalizing across contexts [9][16].
- The dataset features multi-layer annotations, including semantic labels, depth information, affordances of interactive objects, language reasoning, and tactile/action trajectories, providing rich and generalizable data for embodied intelligence research [12][13].

Group 3: Industry Context
- The human-centric data paradigm is gaining consensus in the industry, with companies such as Tesla also focusing on real-world data collection for their robotics development [5][6][8].
- Tashi Zhihang, founded only six months earlier, has already secured $242 million in funding and is positioned ahead of competitors such as Tesla on this technical approach [8].

Group 4: Challenges and Opportunities
- The article highlights how hard it is to obtain large-scale, real, and generalizable training data for embodied intelligence, noting that traditional sources such as internet videos and simulation data have significant limitations [20][21].
- WIYH fills a gap in cross-industry, real-world data, making it possible to pre-train embodied AI models grounded in human experience [26].