Breaking! Wuhan University Proposes the RGMP Framework! 87% Generalization Success Rate! 5x Data Efficiency Gain!
机器人大讲堂 (Robot Lecture Hall) · 2025-12-02 09:26

Core Insights
- The article discusses advances in humanoid robots, focusing on the RGMP framework developed by Wuhan University, which addresses the challenges of data dependency and environmental adaptability in robotic systems [1][3].

Group 1: RGMP Framework Overview
- The RGMP framework integrates geometric semantic reasoning with data-efficient visual motion control, achieving an 87% success rate in generalization tests and a fivefold improvement in data efficiency over current leading models [3][19].
- The framework consists of two main modules, the Geometric Skill Selector (GSS) and the Adaptive Recursive Gaussian Network (ARGN), which together enhance perception and motion capabilities in unfamiliar environments [4][10].

Group 2: Geometric Skill Selector (GSS)
- The GSS module injects geometric inductive biases into vision-language models, allowing robots to select appropriate skills from visual geometric features and task semantics; with only 20 rule-based constraints, it significantly reduces tuning costs (a rule-based selection sketch follows this summary) [4][7].
- In comparative experiments, GSS improved skill selection accuracy by 15%-25% over the Qwen-VL baseline across a range of complex object manipulation tasks [8][10].

Group 3: Adaptive Recursive Gaussian Network (ARGN)
- ARGN improves data efficiency and models spatial relationships through recursive computation and an adaptive decay mechanism, which preserves critical spatial information while amplifying task-relevant elements (see the recursion sketch below) [10][13].
- Adding a Gaussian Mixture Model (GMM) head to ARGN raised accuracy on motion-generation tasks, such as grasping a squashed Coke can, from 0.60 to 0.69 (see the GMM sampling sketch below) [10][19].

Group 4: Performance Validation
- The RGMP framework was tested on humanoid and dual-arm robotic platforms, where it outperformed traditional backbones such as ResNet50 and Transformer, as well as leading robotic manipulation models such as ManiSkill2 and OpenVLA [16][20].
- In a generalization test, RGMP reached an average grasping accuracy of 0.87 with only 40 training samples, far fewer than the roughly 200 samples the Diffusion Policy needed to reach comparable accuracy [19][20].

Group 5: Future Prospects
- The RGMP framework aims to further explore functional generalization, enabling robots to infer manipulation trajectories for new objects from skills learned on core objects, thereby reducing reliance on extensive demonstration data [21].
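To make the GSS idea concrete, here is a minimal Python sketch of rule-based skill selection layered on top of geometric features a vision-language model might output. The feature fields, rule names, and skill labels are hypothetical stand-ins, not the paper's actual 20 constraints.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class GeometricFeatures:
    """Geometric attributes a VLM might extract for an object.
    Field names here are illustrative, not from the paper."""
    graspable_width_cm: float
    is_deformable: bool
    has_handle: bool

# Each rule maps geometric evidence plus the task phrase to a candidate skill.
# The paper reports ~20 such constraints; these three are made-up examples.
Rule = Callable[[GeometricFeatures, str], Optional[str]]

def rule_handle_grasp(f: GeometricFeatures, task: str) -> Optional[str]:
    if f.has_handle and "pick" in task:
        return "handle_grasp"
    return None

def rule_gentle_deformable(f: GeometricFeatures, task: str) -> Optional[str]:
    if f.is_deformable:
        return "compliant_grasp"  # soft objects need force-limited closure
    return None

def rule_pinch_thin(f: GeometricFeatures, task: str) -> Optional[str]:
    if f.graspable_width_cm < 2.0:
        return "pinch_grasp"
    return None

RULES: list[Rule] = [rule_handle_grasp, rule_gentle_deformable, rule_pinch_thin]

def select_skill(features: GeometricFeatures, task: str) -> str:
    """Return the first skill whose geometric constraint fires, else a default."""
    for rule in RULES:
        skill = rule(features, task)
        if skill is not None:
            return skill
    return "power_grasp"  # fallback when no constraint matches

if __name__ == "__main__":
    can = GeometricFeatures(graspable_width_cm=6.6, is_deformable=True, has_handle=False)
    print(select_skill(can, "pick up the squashed can"))  # -> compliant_grasp
```

Because the constraints are explicit rules rather than fine-tuned weights, swapping or adding a skill only means editing one small function, which is consistent with the article's claim of low tuning cost.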
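The article describes ARGN's recursion and adaptive decay only at a high level. The sketch below shows one plausible shape for such an update, assuming the decay is an input-dependent gate that blends a running spatial state with new per-frame features; the gate weights, dimensions, and random inputs are all placeholders, not the published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 8  # toy size; the real network operates on learned visual features

# A learned gate projection would come from training; a fixed random matrix
# plus a sigmoid is used here purely to illustrate the recursion's shape.
W_gate = rng.normal(scale=0.5, size=(FEAT_DIM, FEAT_DIM))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_decay_step(state, x):
    """One recursive update: a per-dimension decay gate, computed from the
    current observation, blends the running state with the new features.
    Dimensions where the gate saturates near 1 retain old spatial context;
    dimensions near 0 are overwritten by task-relevant input."""
    gate = sigmoid(W_gate @ x)               # adaptive, input-dependent decay
    return gate * state + (1.0 - gate) * x   # convex blend of memory and input

state = np.zeros(FEAT_DIM)
for t in range(5):
    obs = rng.normal(size=FEAT_DIM)          # stand-in for per-frame features
    state = adaptive_decay_step(state, obs)
print(state.round(3))
```

The point of the gate being input-dependent, rather than a fixed constant, is that the network can decide per step which spatial information to keep and which to overwrite, matching the article's description of preserving critical context while amplifying key task elements.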
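Finally, a minimal sketch of why a GMM head helps on tasks like grasping a squashed can: a multimodal output can represent several distinct valid grasps instead of averaging them into an invalid one. The component weights, means, and covariances below are invented numbers; in ARGN they would be predicted by the network.

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny 2-component GMM over a 3-DoF grasp position (x, y, z in metres).
weights = np.array([0.7, 0.3])
means = np.array([[0.45, 0.10, 0.12],   # e.g. a grasp near the can's midline
                  [0.45, 0.10, 0.20]])  # e.g. a grasp near the crushed rim
covs = np.stack([np.diag([1e-4, 1e-4, 4e-4])] * 2)

def sample_grasp(n=1):
    """Draw grasp positions from the mixture: pick a component by weight,
    then sample from that component's Gaussian."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return np.array([rng.multivariate_normal(means[k], covs[k]) for k in comps])

print(sample_grasp(3).round(3))
```

A unimodal regression head would pull its single prediction toward the mean of both grasp modes, landing on the deformed section of the can; keeping the modes separate is one plausible explanation for the reported 0.60 to 0.69 accuracy gain.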
