Large Behavior Models (LBM)

Boston Dynamics and TRI join forces! Training Atlas with Large Behavior Models (LBM), with the goal of a "generalist AI robot"
机器人大讲堂 · 2025-08-25 12:10
Core Viewpoint - Humanoid robots need to handle many different tasks, including manipulating a variety of objects and keeping their balance when something unexpected happens. Large Behavior Models (LBM) are central to building these core capabilities [1][2].

Group 1: Collaboration and Development
- Boston Dynamics has partnered with Toyota Research Institute (TRI) to develop LBMs for the Atlas humanoid robot, using end-to-end, language-conditioned policies to carry out long-horizon manipulation tasks [2].
- Policy development follows four steps: collecting behavior data through teleoperation, processing and annotating the data, training neural network policies, and evaluating those policies with a test suite [3].

Group 2: Policy Principles
- Boston Dynamics follows three core principles when building policies: maximizing task coverage through the teleoperation system, training multi-task policies for better generalization, and building robust infrastructure for rapid iteration and scientific rigor [5][9].

Group 3: Hardware and Software Configuration
- The Atlas robot has 78 degrees of freedom (DoF), giving it a wide range of motion and flexibility, while the Atlas MTS is a 29-DoF configuration used for pure manipulation tasks [9].
- The teleoperation system uses HDR stereo cameras for situational awareness and works together with a model predictive controller (MPC), so that manipulation stays precise while the robot keeps its balance [9][10].

Group 4: LBM Technology and Simulation
- The policy architecture builds on TRI's LBM work: a diffusion transformer with 450 million parameters that predicts actions from sensory input and language prompts [11].
- Simulation plays a key role in development, allowing efficient training and evaluation while sharing data pipelines and reducing costs [11].

Group 5: Enhanced Capabilities
- With LBM training, Atlas moves past the limitations of traditional robots: it can respond autonomously to unexpected situations and complete complex long-horizon tasks [12][14].
- The robot can execute a variety of tasks, from simple pick-and-place actions to complex operations such as manipulating a 22-pound (about 10 kg) tire, showcasing the advantages of "learning by demonstration" [16].

Group 6: Future Directions
- Boston Dynamics aims to expand its "data flywheel" to improve throughput, quality, task diversity, and task difficulty, while exploring new algorithmic ideas [19].
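The four steps listed under Group 1 (collect data by teleoperation, annotate it, train a policy, evaluate it with a test suite) can be pictured with a toy sketch. Everything here is an illustrative assumption rather than Boston Dynamics' or TRI's code: the `Demo` record, the function names, the scalar observations and actions, and especially the "training" step, which simply averages an action offset per language prompt as a stand-in for training a neural network.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Tuple


@dataclass
class Demo:
    """One annotated demonstration sample (all fields are toy stand-ins)."""
    prompt: str         # language label attached during annotation
    observation: float  # stand-in for camera and proprioceptive input
    action: float       # action the teleoperator commanded


def collect_and_annotate() -> List[Demo]:
    """Steps 1-2: hypothetical demos, as if teleoperated and then labeled."""
    return [
        Demo("raise", 0.0, 1.0),
        Demo("raise", 0.5, 1.5),
        Demo("lower", 0.0, -1.0),
        Demo("lower", 0.5, -0.5),
    ]


def train_policy(demos: List[Demo]) -> Callable[[str, float], float]:
    """Step 3: toy multi-task 'training' that learns one offset per prompt."""
    offsets: Dict[str, List[float]] = {}
    for d in demos:
        offsets.setdefault(d.prompt, []).append(d.action - d.observation)
    learned = {prompt: mean(vals) for prompt, vals in offsets.items()}

    def policy(prompt: str, observation: float) -> float:
        # One policy serves every task; the prompt selects the behavior.
        return observation + learned[prompt]

    return policy


def evaluate(
    policy: Callable[[str, float], float],
    cases: List[Tuple[str, float, float]],
    tol: float = 1e-6,
) -> float:
    """Step 4: fraction of test cases whose predicted action is within tol."""
    hits = [abs(policy(p, obs) - want) <= tol for p, obs, want in cases]
    return sum(hits) / len(hits)


policy = train_policy(collect_and_annotate())
test_suite = [
    ("raise", 1.0, 2.0),  # consistent with the demonstrations
    ("lower", 1.0, 0.0),  # consistent with the demonstrations
    ("lower", 2.0, 0.0),  # deliberately inconsistent, so it counts as a failure
]
success = evaluate(policy, test_suite)
print(f"success rate: {success:.2f}")
```

The point of the sketch is the shape of the loop, not the model: a single policy is conditioned on a language prompt, and a fixed test suite turns each training run into a comparable number.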
Has robotics' "GPT moment" arrived? Toyota Research Institute quietly ran the most rigorous VLA validation to date
具身智能之心 · 2025-07-21 08:42
Core Viewpoint - The article covers recent progress in robotic arms, focusing on Large Behavior Models (LBM) that let robots carry out complex tasks autonomously and that show clear gains in performance and capability over traditional models [3][7][15].

Summary by Sections

Introduction to Robotic Arms
- Robotic arms are usually associated with simple tasks such as grasping objects or serving ice cream, but the difficulty rises sharply for more intricate jobs such as setting a table or assembling a bicycle [2][3].

Development of VLA Models
- Recent progress in Vision-Language-Action (VLA) models lets robots combine multimodal information (images, instructions, scene semantics) and execute complex tasks, moving toward more intelligent and versatile systems [3][4].

Large Behavior Models (LBM)
- LBMs are built on diffusion policies and mark a significant step forward: robots can autonomously execute complex manipulation with impressive results [7][10][19].
- The research, conducted by Toyota Research Institute (TRI) and led by notable scholars, emphasizes rigorous evaluation of these models and demonstrates their effectiveness in both simulated and real-world environments [9][10].

Training and Evaluation
- The LBM was trained on a diverse dataset that includes 1,700 hours of robot data, and was assessed in 1,800 real-world evaluations and more than 47,000 simulated rollouts [13][14].
- The findings indicate that pre-training improves performance noticeably even when training data is limited, which suggests that data collection and performance gains can reinforce each other [14][16].

Performance Metrics
- The evaluation metrics were success rate and task completion, with an emphasis on relative success rates to make different methods easier to compare [26][27].
- The LBM outperformed single-task baseline models on both seen and unseen tasks, indicating robustness and adaptability [31][39].

Conclusion and Future Implications
- The research suggests that general-purpose large models for robotics are on the horizon, hinting at a potential "GPT moment" for embodied intelligence [15][43].
- The results indicate that pre-training achieves better task performance with less task-specific data, and that the benefits should keep growing as data volume increases [43][45].
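The success rate and relative success rate mentioned under Performance Metrics can be illustrated with a short sketch. The summary does not define "relative success rate", so the definition used here (each method's success rate divided by a chosen baseline's rate on the same task) is an assumption, and the method names and rollout outcomes are made up for illustration.

```python
from typing import Dict, List


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of rollouts that succeeded."""
    if not outcomes:
        raise ValueError("no rollouts recorded")
    return sum(outcomes) / len(outcomes)


def relative_success(rates: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Assumed definition: each method's rate divided by the baseline's rate."""
    base = rates[baseline]
    if base == 0:
        raise ValueError("baseline never succeeds; the ratio is undefined")
    return {name: rate / base for name, rate in rates.items()}


# Hypothetical rollout outcomes for one task (True means the rollout succeeded).
rollouts = {
    "single_task_baseline": [True, False, False, True],
    "finetuned_lbm": [True, True, False, True],
}

rates = {name: success_rate(outcomes) for name, outcomes in rollouts.items()}
relative = relative_success(rates, baseline="single_task_baseline")

for name in rollouts:
    print(f"{name}: absolute={rates[name]:.2f} relative={relative[name]:.2f}")
```

Normalizing by a baseline puts easy and hard tasks on a common scale, which is one plausible reason to report relative rather than absolute rates when comparing methods across many tasks.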