MetaFold
IROS 2025 | A new paradigm for robotic garment folding: NUS professor Lin Shao's team decouples trajectory and action with MetaFold
机器之心 (Jiqizhixin) · 2025-09-03 00:44
Core Insights
- The article discusses MetaFold, a language-guided framework for multi-category garment folding that aims to improve robots' ability to manipulate deformable objects such as clothing [4][31].

Group 1: Framework Overview
- MetaFold addresses the challenges of deformable object manipulation (DOM) by integrating visual and language guidance to improve task execution [3][4].
- The framework employs a hierarchical architecture, inspired by the human nervous system, that separates task planning from action prediction [7][37].
- It uses a point cloud trajectory generation model that combines geometric features from point clouds with semantic features from language instructions [15][16].

Group 2: Dataset and Methodology
- A large dataset of 1,210 garments and 3,376 trajectories was created to train the model effectively [10].
- The dataset covers four main folding types, each paired with natural language descriptions: no-sleeve, short-sleeve, long-sleeve, and pants [10].
- The trajectory generation model is based on a conditional variational autoencoder (CVAE) and employs a cross-attention mechanism to fuse the geometric and semantic features [15][16].

Group 3: Performance Evaluation
- MetaFold achieved success rates of 79%–97% on unseen datasets, demonstrating strong generalization [20][22].
- It maintained rectangularity of 0.80–0.87 and an area ratio of 0.24–0.45, outperforming baseline methods [22][23].
- In real-world experiments, MetaFold completed a variety of garment folding tasks, confirming its practical applicability and robustness [26][29].

Group 4: Conclusion and Future Directions
- The research presents MetaFold as a significant advance in robotic garment manipulation, emphasizing its contributions to performance, generalization, and interpretability [31][37].
- The open-sourced dataset provides a valuable resource for future research in the field [11][37].
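The summary above describes a CVAE-based trajectory generator that fuses point-cloud geometry with language semantics via cross-attention. The following is a minimal PyTorch sketch of that general pattern, not the paper's actual architecture: all class names, layer sizes, the mean-pooling of attended tokens, and the trajectory parameterization are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Illustrative sketch: a conditional VAE that generates a manipulation
    trajectory conditioned on point-cloud and language features fused by
    cross-attention. Dimensions are placeholders, not MetaFold's values."""

    def __init__(self, feat_dim=64, latent_dim=16, traj_len=20, traj_dim=3):
        super().__init__()
        self.traj_len, self.traj_dim = traj_len, traj_dim
        # Cross-attention: point-cloud tokens (queries) attend to
        # language tokens (keys/values) to fuse the two modalities.
        self.fuse = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.enc = nn.Linear(feat_dim + traj_len * traj_dim, 2 * latent_dim)
        self.dec = nn.Linear(feat_dim + latent_dim, traj_len * traj_dim)

    def condition(self, pc_feats, lang_feats):
        # pc_feats: (B, Np, D); lang_feats: (B, Nl, D)
        fused, _ = self.fuse(pc_feats, lang_feats, lang_feats)
        return fused.mean(dim=1)  # (B, D) pooled condition vector

    def forward(self, pc_feats, lang_feats, traj):
        c = self.condition(pc_feats, lang_feats)
        h = self.enc(torch.cat([c, traj.flatten(1)], dim=1))
        mu, logvar = h.chunk(2, dim=1)
        # Reparameterization trick: sample latent z from N(mu, sigma^2).
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        recon = self.dec(torch.cat([c, z], dim=1))
        return recon.view(-1, self.traj_len, self.traj_dim), mu, logvar

    @torch.no_grad()
    def sample(self, pc_feats, lang_feats):
        # At inference, draw z from the prior and decode a trajectory.
        c = self.condition(pc_feats, lang_feats)
        z = torch.randn(c.size(0), self.dec.in_features - c.size(1))
        out = self.dec(torch.cat([c, z], dim=1))
        return out.view(-1, self.traj_len, self.traj_dim)
```

A training step would minimize trajectory reconstruction loss plus a KL term on `(mu, logvar)`; at test time only `sample` is needed, which matches the summary's point that the trajectory model plans while a separate lower-level policy executes the actions.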
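The evaluation reports rectangularity (0.80–0.87) and area ratio (0.24–0.45) as fold-quality metrics. A plausible way to compute such metrics from a top-down binary mask of the garment is sketched below; the exact definitions used in the paper are not given in the summary, so these formulas are assumptions.

```python
import numpy as np

def rectangularity(mask):
    """Assumed definition: garment area divided by the area of its
    axis-aligned bounding box; 1.0 means a perfectly rectangular fold."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return 0.0
    bbox_area = (xs.max() - xs.min() + 1) * (ys.max() - ys.min() + 1)
    return float(mask.sum()) / bbox_area

def area_ratio(folded_mask, initial_mask):
    """Assumed definition: area covered after folding over the initial
    flattened area; smaller values indicate a more compact fold."""
    return float(folded_mask.sum()) / max(float(initial_mask.sum()), 1.0)
```

Under these assumed definitions, a rectangularity near 0.87 means the folded garment almost fills its bounding box, and an area ratio near 0.25 corresponds roughly to a garment folded down to a quarter of its original footprint.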