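Robots where the high level commands and the low level executes: a "coordinate frame transfer interface" enables generalization from a single demonstration | ICML 2025
量子位 (QbitAI) · 2025-07-22 04:35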

Core Viewpoint
- HEP (Hierarchical Equivariant Policy via Frame Transfer), a framework developed by Northeastern University and Boston Dynamics RAI, aims to let AI adapt to complex real-world scenarios from only a few demonstrations, improving the efficiency and flexibility of robot learning [1][4].

Summary by Sections

HEP Framework Highlights
- The framework represents 3D visual information efficiently, balancing detail preservation against computational speed [2].

Core Innovations
- The framework targets two long-standing problems in AI applications, data scarcity and generalization. It does so with a coordinate transfer interface inside a hierarchical policy learning framework, which provides a strong inductive bias while keeping the policy flexible [4].

Simplified and Efficient Hierarchical Structure
- The high-level policy sets the global objective, while the low-level policy optimizes actions in a local coordinate system, which significantly improves operational flexibility and efficiency [5].
- The model adapts automatically to spatial transformations such as translation and rotation, which greatly reduces the amount of data needed for generalization [5].

Key Concepts
- HEP rests on two core ideas: a hierarchical policy structure and the "coordinate transfer interface," through which the high-level policy gives the low-level policy a "reference coordinate" in which to optimize execution details [7].
- The coordinate transfer interface keeps the low-level policy flexible while passing the high-level policy's generalization and robustness down to it [9].

Effectiveness Demonstration
- The research team tested HEP on 30 simulated tasks in RLBench, including high-precision and long-horizon tasks, and further validated it on three real-world robotic tasks [10].
- The high-level policy predicts a "key pose" for global planning, and the low-level policy generates a detailed motion trajectory based on that key pose [11].
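
The relationship between the high-level "key pose" and the low-level trajectory can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that the key pose is a position plus a yaw angle, and that the low-level policy emits waypoints expressed relative to that pose. The names `rotate_z`, `to_world`, and `transform_scene` and all numeric values are hypothetical. The check at the end shows the property the summary describes: when the scene and the key pose are translated and rotated together, the executed trajectory moves with them, even though the low-level output is unchanged.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the world z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def to_world(key_position, key_yaw, local_trajectory):
    """Map waypoints given in the key-pose frame into the world frame."""
    world = []
    for waypoint in local_trajectory:
        rx, ry, rz = rotate_z(waypoint, key_yaw)
        world.append((key_position[0] + rx,
                      key_position[1] + ry,
                      key_position[2] + rz))
    return world

def transform_scene(point, shift, angle):
    """Apply a planar scene change: rotate about z, then translate."""
    rx, ry, rz = rotate_z(point, angle)
    return (rx + shift[0], ry + shift[1], rz + shift[2])

# Hypothetical high-level output: a key pose (position and yaw) in the world.
key_position, key_yaw = (0.5, 0.0, 0.2), 0.0

# Hypothetical low-level output: an approach path relative to the key pose.
local_trajectory = [(-0.05, 0.02, 0.10), (-0.02, 0.01, 0.05), (0.0, 0.0, 0.0)]

original = to_world(key_position, key_yaw, local_trajectory)

# Move the whole scene. Only the key pose changes; the local path is reused.
shift, angle = (0.1, -0.2, 0.0), math.pi / 2
moved_key = transform_scene(key_position, shift, angle)
moved = to_world(moved_key, key_yaw + angle, local_trajectory)

# The new world trajectory equals the old one under the same scene change.
expected = [transform_scene(p, shift, angle) for p in original]
max_error = max(abs(a - b)
                for p, q in zip(moved, expected)
                for a, b in zip(p, q))
print(max_error < 1e-9)
```

In this sketch the low-level policy never sees absolute world coordinates, which is one way to read the claim that spatial generalization is inherited from the high level. How HEP actually parameterizes the key pose and the local frame should be taken from the paper, not from this example.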

Results
- The hierarchical policy shows a clear advantage on complex long-horizon tasks: HEP learns robust multi-step collaborative tasks from only 30 demonstrations and outperforms non-hierarchical methods [14].
- In the Pick & Place task, HEP generalized from a single demonstration (1-shot), a significant gain in data efficiency [15].
- The coordinate transfer interface passes the high-level policy's adaptability to spatial changes down to the low-level policy, making the overall policy easier to extend to new scenarios [16].
- Under environmental changes and disturbances from unrelated objects, HEP's success rate improved by up to 60% compared with traditional methods [17].

Future Implications
- The coordinate transfer interface imposes only soft constraints on the low-level policy, which preserves flexibility and provides a natural interface for integrating multimodal and cross-platform high-level policies in the future [19].