CoRL 2025 | HKU's InfoBodied AI Team Debuts a New Paradigm for Embodied Representation, Building a Task-Adaptive Perception Framework

Core Viewpoint
- The article introduces HyperTASR, a framework for task-aware scene representation in embodied intelligence. It lets a robot adjust its perception dynamically according to task relevance, much as humans attend to what matters for the task at hand [5][12].

Group 1: Research Background and Challenges
- In embodied intelligence, policy learning relies heavily on scene representation, but existing methods typically extract features in a task-agnostic way, which is inefficient [4][18].
- Because these representations do not adapt to the task, task-irrelevant information is passed into policy learning, which hampers both learning efficiency and generalization [18][20].

Group 2: Innovations and Contributions
- The HyperTASR framework provides task-aware scene representation, so the robot focuses on the environmental features relevant to the task it is executing [8][12].
- It introduces a hypernetwork-driven representation transformation mechanism that dynamically generates adaptive parameters from the task specification and the task's progress [9][20].
- It is compatible with a range of policy learning architectures and can be integrated without significant modifications while improving their performance [10][26].

Group 3: Experimental Validation
- Significant improvements were observed in both simulation (RLBench) and real-world environments, setting a new state of the art (SOTA) for single-view manipulation tasks [11][29].
- In simulation, integrating HyperTASR with GNFactor raised the success rate above the baseline by 27%, and integrating it with 3D Diffuser Actor achieved a success rate of over 80% in single-view manipulation [29][31].
- Real-world experiments demonstrated strong generalization, reaching a success rate of 51.1% with only 15 demonstrations per task [32][33].
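
The hypernetwork-driven transformation mentioned under Group 2 can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HyperTransform`, the single linear hypernetwork, the residual connection, and all dimensions are assumptions chosen for illustration. It only shows the general idea that the parameters of the representation transform are generated from a task embedding and a progress value rather than being fixed.

```python
import random


def linear(weights, bias, x):
    """Apply y = W x + b using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperTransform:
    """Hypothetical task-conditioned transform (illustrative only).

    A small hypernetwork maps (task embedding, progress) to the weights
    and biases of a linear layer, which is then applied to the scene
    feature. The article does not specify these details.
    """

    def __init__(self, task_dim, feat_dim, seed=0):
        rng = random.Random(seed)
        cond_dim = task_dim + 1                     # task embedding + scalar progress
        n_params = feat_dim * feat_dim + feat_dim   # generated weights + biases
        self.feat_dim = feat_dim
        # Hypernetwork parameters (randomly initialised here, learned in practice).
        self.hyper_w = [[rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
                        for _ in range(n_params)]
        self.hyper_b = [0.0] * n_params

    def generate(self, task_emb, progress):
        """Produce the transform's weight matrix and bias for this task state."""
        cond = list(task_emb) + [progress]
        flat = linear(self.hyper_w, self.hyper_b, cond)
        d = self.feat_dim
        weights = [flat[i * d:(i + 1) * d] for i in range(d)]
        bias = flat[d * d:]
        return weights, bias

    def __call__(self, scene_feat, task_emb, progress):
        weights, bias = self.generate(task_emb, progress)
        delta = linear(weights, bias, scene_feat)
        # Residual connection (an assumption): keep the original feature
        # and add a task-dependent adjustment.
        return [f + d for f, d in zip(scene_feat, delta)]


# Same scene feature, two different tasks and two progress values.
transform = HyperTransform(task_dim=3, feat_dim=4)
scene = [1.0, 0.5, -0.5, 2.0]
print(transform(scene, [1.0, 0.0, 0.0], progress=0.2))
print(transform(scene, [0.0, 1.0, 0.0], progress=0.2))
print(transform(scene, [1.0, 0.0, 0.0], progress=0.9))
```

The same scene feature yields a different representation for each task and for each stage of a task, which is the property the summary attributes to task-aware representation. In a real system the hypernetwork would be trained jointly with the downstream policy, and the task embedding would come from a language or goal encoder.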