Hierarchical Control
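CICC: Embodied Intelligence Moves Toward Data-Driven Development; High-Value Information Content Becomes the Core of Competition
Zhitong Finance (智通财经网) · 2025-11-17 01:37
Hierarchical control is the foundational architectural paradigm, using a two-level structure to make systems engineerable. The VLA paradigm, built on VLMs, strengthens generalization and interaction and is currently an active research direction. World models provide physical constraints through environment modeling and future prediction, and remain at a research-led stage. CICC believes that in the short term hierarchical architectures will remain mainstream because of their engineering controllability, that VLA shows potential in complex tasks and human-robot interaction, and that world models are seen as the long-term direction because they can transfer across devices.

Embodied-intelligence data: high-value information content becomes the core of competition
Robot data spans multiple modalities, and the industry is looking for paths that combine low-cost data acquisition with data-efficient application.
1) Acquisition: routes include real robots, video (first-person and third-person), and simulation.
2) Security: data security is a bottom line that cannot be ignored; humanoid robot makers face challenges including permission isolation, data encryption systems, and cross-border transfer policies.
3) Application: the traditional data application strategy is a "homogeneous closed loop", in which a policy can only be reproduced on hardware of the same type. Heterogeneous training uses a modular Transformer architecture to share algorithm models across robot embodiments.

Analysis of hot topics in embodied intelligence
The Zhitong Finance app has learned that CICC has released a research report stating that in the short term hierarchical architectures remain mainstream because of their engineering controllability, VLA shows potential in complex tasks and human-robot interaction, and world models are seen as the long-term direction because they can transfer across devices. Robot data spans multiple modalities, and the industry is looking for paths that combine low-cost data acquisition with data-efficient application. The embodied-intelligence "brain" is moving from "diverging routes" toward "converged deployment" ...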
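Show It Once and the Robot Gets to Work? A Joint Peking University & BeingBeyond Team Uses a "Hierarchical Cerebellum + Simulated Double" to Put the G1 on the Job Zero-Shot
36Kr · 2025-11-14 02:36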
Core Insights
- The DemoHLM framework, proposed by a research team from Peking University and BeingBeyond, offers a new approach to humanoid robot loco-manipulation: it generates large amounts of training data in simulation from a single human demonstration, addressing key challenges of traditional methods [1][20].

Group 1: Challenges in Humanoid Robot Loco-Manipulation
- Humanoid robot loco-manipulation faces a "triple dilemma": existing solutions either rely on simulation or require extensive real-world teleoperation data, which makes them impractical for complex environments such as homes and industrial sites [3][6].
- Traditional methods suffer from low data efficiency, poor task generalization, and difficult sim-to-real transfer, leading to high costs and limited scalability [6][22].

Group 2: Innovations of DemoHLM
- DemoHLM's core innovation is its combination of hierarchical control with data generation from a single demonstration, which keeps whole-body motion stable while achieving generalization at minimal data cost [7][20].
- The framework uses a hierarchical control architecture that balances flexibility and stability by decoupling motion control from task decision-making [8][20].

Group 3: Data Generation Process
- DemoHLM generates diverse training data from just one demonstration. The process is automated in three stages (pre-operation, operation, and batch synthesis), which improves the generalization of the learned policy [9][20].
- Automated data generation removes the data-collection bottleneck of conventional imitation learning and significantly improves efficiency [9][20].

Group 4: Experimental Validation
- The framework was validated both in simulation and on a real Unitree G1 robot, showing stable performance across ten mobile manipulation tasks, with success rates rising significantly as the volume of synthetic data increased [10][15].
- As the number of synthetic data points increased from 100 to 5000, success rates for tasks such as "PushCube" and "OpenCabinet" improved dramatically, indicating that the data generation pipeline is effective [15][20].

Group 5: Industry Implications and Future Directions
- DemoHLM's results provide important technical support for deploying humanoid robots in household, industrial, and service settings [19][20].
- Future research will explore mixed training with real data and multi-modal perception to improve robustness and address current limitations, such as reliance on simulation data and weaker performance in scenes with complex occlusion [19][22].
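The single-demonstration data generation described above can be illustrated with a toy example. This is a minimal sketch and not the DemoHLM pipeline: it assumes a 2-D scene, a demonstration stored as end-effector waypoints, and the idea that a trajectory expressed relative to the object can be replayed for any new object pose. The function names (`to_object_frame`, `to_world_frame`, `synthesize`) and the pose ranges are hypothetical.

```python
import math
import random


def to_object_frame(point, obj_pose):
    """Express a world-frame (x, y) point relative to an object pose (x, y, theta)."""
    ox, oy, theta = obj_pose
    dx, dy = point[0] - ox, point[1] - oy
    c, s = math.cos(-theta), math.sin(-theta)  # rotate by -theta
    return (c * dx - s * dy, s * dx + c * dy)


def to_world_frame(point, obj_pose):
    """Map an object-frame (x, y) point back to the world frame."""
    ox, oy, theta = obj_pose
    c, s = math.cos(theta), math.sin(theta)
    return (ox + c * point[0] - s * point[1], oy + s * point[0] + c * point[1])


def synthesize(demo_world, demo_obj_pose, n, seed=0):
    """Replay one demonstration around n randomly placed objects."""
    rng = random.Random(seed)
    # Store the demonstration once, relative to the object it was recorded with.
    demo_local = [to_object_frame(p, demo_obj_pose) for p in demo_world]
    dataset = []
    for _ in range(n):
        # Assumed randomization range: a 2 m x 2 m area, any heading.
        pose = (rng.uniform(-1.0, 1.0),
                rng.uniform(-1.0, 1.0),
                rng.uniform(-math.pi, math.pi))
        trajectory = [to_world_frame(p, pose) for p in demo_local]
        dataset.append((pose, trajectory))
    return dataset


# One demonstration: the hand approaches an object located at (1.0, 0.0).
demo = [(0.5, 0.0), (0.8, 0.0), (1.0, 0.0)]
data = synthesize(demo, demo_obj_pose=(1.0, 0.0, 0.0), n=100)
```

Each synthetic trajectory ends at its own randomized object, so one recording yields as many training samples as there are sampled poses. A real pipeline would also have to check each replay in a physics simulator and discard failures, which this sketch omits.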
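Show It Once and the Robot Gets to Work? A Joint Peking University & BeingBeyond Team Uses a "Hierarchical Cerebellum + Simulated Double" to Put the G1 on the Job Zero-Shot
QbitAI (量子位) · 2025-11-13 09:25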
Core Insights
- The article introduces the DemoHLM framework, which lets humanoid robots generate extensive training data in simulation from a single human demonstration, addressing key challenges in loco-manipulation [1][22].

Group 1: Challenges in Humanoid Robot Manipulation
- Humanoid robot manipulation faces a "triple dilemma": existing solutions either rely on simulation or require extensive real-world teleoperation data, which makes them impractical for complex environments such as homes and industrial sites [3][6].
- Traditional methods suffer from low data efficiency, poor task generalization, and difficult sim-to-real transfer, leading to high costs and limited scalability [6][20].

Group 2: Innovations of DemoHLM
- DemoHLM uses a hierarchical control architecture that separates motion control from task decision-making, improving both flexibility and stability [7][20].
- The framework's key innovation is generating a large amount of diverse training data from just one demonstration, which significantly improves data efficiency and generalization [8][20].

Group 3: Experimental Validation
- The framework was validated both in simulation (IsaacGym) and on a real Unitree G1 robot, covering ten manipulation tasks with notable success rates [9][19].
- As synthetic data volume increased from 100 to 5000, task success rates improved significantly, demonstrating that the data generation pipeline is effective [14][20].

Group 4: Industry Implications and Future Directions
- DemoHLM provides important technical support for practical humanoid robot applications by reducing training costs and improving generalization across scenarios [19][20].
- The framework is designed to be compatible with future upgrades, such as tactile sensors and multi-camera perception, paving the way for more complex operating environments [21][20].
Boston Dynamics' Robot Dog Is Back, with "Five Legs" Working in Concert
36Kr · 2025-10-15 13:02
Core Insights
- Boston Dynamics' Spot robot can lift a 15 kg tire in just 3.7 seconds, showcasing advanced dynamic whole-body manipulation techniques [1][11].
- The robot's performance goes beyond traditional static assumptions: by coordinating its movements, it handles loads above its maximum lifting capacity [13].

Group 1: Dynamic Whole-Body Manipulation
- The method combines sampling and learning so that the robot can perform tasks requiring coordination of arms, legs, and torso [1][2].
- A hierarchical control approach divides the control problem into two layers: low-level control for balance and stability, and high-level control for task-specific strategies [2][14].

Group 2: Control Strategies
- The low-level controller uses reinforcement learning to manage motor torques for stability, while the high-level controller uses sampling-based strategies for tasks such as tire alignment and stacking [2][7].
- The sampling controller simulates multiple future scenarios in parallel to identify the most effective actions for completing the task [3][5].

Group 3: Performance Metrics
- The robot achieved an average time of 5.9 seconds per tire, nearly matching human operating speed [11].
- Dynamic coordination lets the robot handle weights significantly exceeding its peak lifting capability, expanding its operating range [13][14].

Group 4: Learning and Adaptation
- Training randomizes object properties to bridge the gap between simulation and the real world [10].
- An asymmetric actor-critic architecture is used in training, which improves the robot's ability to adapt to complex dynamics and contact mechanics [8][10].
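The sampling-based idea in Group 2 (simulate many candidate futures, keep the best one) can be shown on a toy problem. The following is a minimal random-shooting sketch under stated assumptions, not Boston Dynamics' controller: the "robot" is a 1-D point mass, the candidates are rolled out one after another rather than in parallel, and the cost weights, horizon, and sample count are arbitrary illustrative choices. All function names are hypothetical.

```python
import random

DT = 0.1  # simulation time step in seconds (assumed)


def step(state, accel):
    """Advance a 1-D point mass (position, velocity) by one time step."""
    pos, vel = state
    vel = vel + accel * DT
    pos = pos + vel * DT
    return (pos, vel)


def rollout_cost(state, actions, target):
    """Simulate one candidate action sequence and accumulate its cost."""
    cost = 0.0
    for accel in actions:
        state = step(state, accel)
        pos, vel = state
        cost += (pos - target) ** 2 + 0.1 * vel ** 2 + 0.001 * accel ** 2
    return cost


def plan(state, target, rng, num_samples=200, horizon=15, max_accel=2.0):
    """Sample random action sequences and return the cheapest one with its cost."""
    # Always include the "do nothing" sequence as a baseline candidate.
    best_seq = [0.0] * horizon
    best_cost = rollout_cost(state, best_seq, target)
    for _ in range(num_samples):
        seq = [rng.uniform(-max_accel, max_accel) for _ in range(horizon)]
        cost = rollout_cost(state, seq, target)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost


def run(target=1.0, steps=100, seed=0):
    """Receding horizon: replan every step and apply only the first action."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    for _ in range(steps):
        seq, _ = plan(state, target, rng)
        state = step(state, seq[0])
    return state


final_state = run()
```

Replanning at every step is what lets such a controller react to disturbances: only the first action of the best sequence is executed before the futures are sampled again from the new state.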
Boston Dynamics' Robot Dog Is Back! "Five Legs" Working in Concert
QbitAI (量子位) · 2025-10-15 10:20
Core Insights
- The article discusses advances in Boston Dynamics' Spot robot, which can lift and manipulate a 15 kg tire in just 3.7 seconds, showcasing its dynamic whole-body manipulation capabilities [3][31].

Group 1: Dynamic Whole-Body Manipulation
- The method combines sampling and learning for dynamic whole-body manipulation, using reinforcement learning and sampling-based control to enable coordinated tasks involving arms, legs, and torso [11][12].
- A hierarchical control approach divides the control problem into two complementary layers: a low layer for direct motor torque control and a high layer for task-specific strategies [12][13].

Group 2: Task Execution and Control Strategies
- For tasks such as tire alignment and stacking, the system uses sampling-based control to simulate potential future scenarios and discover optimal strategies [14].
- Reinforcement learning is applied to maintain stability during rolling tasks, capturing the necessary dynamic features and reactive control mechanisms [15][26].

Group 3: Performance and Efficiency
- Spot's tire manipulation goes beyond traditional static assumptions, handling weights above its peak lifting capacity of 11 kg [35].
- Dynamic coordination of movements lets the robot efficiently perform tasks that were previously limited to slower, static methods [36][33].

Group 4: Simplification of Control Problems
- Separating high-level and low-level control significantly simplifies the control problem: the high-level controller can focus on task completion without reasoning about joint torques or stability constraints [37][38].
- The learned motion abstractions let the high-level controller operate in a simplified action space, which makes computation more tractable and task execution more efficient [38].
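The division of labor described in Group 4 can be sketched with a two-rate control loop. This is an illustrative toy under stated assumptions, not the system in the article: the body is a 1-D point mass, the high layer outputs a velocity command at a low rate, and a hand-written proportional controller stands in for the learned low-level policy. The gains, rates, and function names are all hypothetical.

```python
def high_level(pos, target, max_speed=0.5):
    """Task layer: pick a velocity command. It never reasons about forces."""
    command = 1.0 * (target - pos)
    return max(-max_speed, min(max_speed, command))


def low_level(vel, vel_cmd, kp=20.0, max_force=10.0):
    """Motion layer: turn the velocity command into a bounded force.

    A proportional controller stands in here for a learned policy.
    """
    force = kp * (vel_cmd - vel)
    return max(-max_force, min(max_force, force))


def simulate(target=2.0, dt=0.002, ticks=10000, high_every=25, mass=1.0):
    """Run the low layer every tick (500 Hz) and the high layer every 25 ticks (20 Hz)."""
    pos, vel, vel_cmd = 0.0, 0.0, 0.0
    for t in range(ticks):
        if t % high_every == 0:
            vel_cmd = high_level(pos, target)  # slow, task-level decision
        force = low_level(vel, vel_cmd)        # fast, stabilizing inner loop
        vel += force / mass * dt
        pos += vel * dt
    return pos, vel


final_pos, final_vel = simulate()
```

The high layer's action space is a single bounded velocity, so it can be swapped for a planner or a sampled search without touching the inner loop, which is the simplification the summary attributes to separating the two levels.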