TrajBooster

Cross-Embodiment Learning Is Here! How Can a Wheeled Robot's "Experience" Be Passed On to a Bipedal Robot with Ease?
机器人大讲堂· 2025-09-23 13:24
Core Insights
- The article discusses rapid advances in humanoid robot technology, focusing on Vision-Language-Action (VLA) models that can perform a range of household tasks with high reliability and generalization. A significant bottleneck remains: high-quality, comprehensive demonstration data for bipedal robots is scarce [1][20].

Group 1: TrajBooster Framework
- The TrajBooster framework was proposed by research teams from Zhejiang University and Westlake University to address this data scarcity. It reuses the abundant manipulation data collected on wheeled robots and applies trajectory retargeting to make action learning on bipedal humanoid robots more efficient [1][20].
- The core idea of TrajBooster is to use the 6D end-effector trajectory (3D position + 3D rotation) as a universal interface, allowing "cross-embodiment" teaching regardless of robot morphology [2][4].

Group 2: Process Overview
- The process involves three main stages:
  1. Source data extraction: language instructions, multi-view visual observations, and the corresponding 6D end-effector trajectories are extracted from large wheeled-robot datasets [4].
  2. Trajectory retargeting: in a simulated environment, the target bipedal robot learns to coordinate its joints to follow these trajectories [4][5].
  3. Model training and fine-tuning: a small amount of real data from the target robot is used to deploy the model effectively in real-world scenarios [4][9].

Group 3: Model Architecture
- The architecture is a hierarchical control model that breaks the whole-body problem into manageable sub-problems: the upper body (arms) is controlled with inverse kinematics (IK), while the lower body (legs and balance) is controlled by a hierarchical reinforcement learning (RL) policy [5][8].
- Within the RL policy, the manager policy acts as a "decision brain" that determines how the robot should move to reach the target position, while the worker policy translates these commands into specific joint actions [8].

Group 4: Training Phases
- Training has two phases: Post-Pre-Training (PPT) and Post-Training (PT). PPT combines the retargeted action data with the source data to create a new dataset for further pre-training of the VLA model, so that the model learns the action space of the target robot [9][10].
- The PT phase requires collecting only 10 minutes of real teleoperation data to fine-tune the model. This bridges the gap between simulation and reality and significantly reduces data collection costs [11].

Group 5: Experimental Results
- Experiments on the Unitree G1 bipedal robot showed that the model trained with PPT outperformed models trained solely on real data, with significant performance improvements in tasks such as "grabbing Mickey Mouse" and "organizing toys" [12][15].
- The model also demonstrated zero-shot skill transfer: it successfully completed tasks not seen during training, indicating that skills were effectively inherited through trajectory transfer [15][16].
- The model showed better trajectory generalization as well, achieving an 80% success rate on novel object placements compared to 0% for models trained without PPT, which suggests a deeper understanding of the action space [16].
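
The three-stage process and the PPT dataset described above can be illustrated with a minimal Python sketch. Everything named here (`SourceStep`, `PPTSample`, `build_ppt_dataset`, `toy_retarget`, and the roll/pitch/yaw layout of the 6D pose) is a hypothetical illustration, not the authors' code. In particular, `toy_retarget` is only a stand-in for the simulated whole-body controller that the article says performs the retargeting.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# 6D end-effector pose as (x, y, z, roll, pitch, yaw).
# Assumption: the article only says "3D position + 3D rotation";
# the Euler-angle layout is chosen here for illustration.
Pose6D = Tuple[float, float, float, float, float, float]


@dataclass
class SourceStep:
    """One timestep extracted from a wheeled-robot dataset (stage 1)."""
    instruction: str
    images: Sequence[str]  # placeholders for multi-view observations
    ee_pose: Pose6D


@dataclass
class PPTSample:
    """Source observation paired with a retargeted target-robot action."""
    instruction: str
    images: Sequence[str]
    target_action: List[float]


def build_ppt_dataset(
    steps: Sequence[SourceStep],
    retarget: Callable[[Pose6D], List[float]],
) -> List[PPTSample]:
    """Keep language and vision from the source; swap in retargeted actions."""
    return [
        PPTSample(s.instruction, s.images, retarget(s.ee_pose))
        for s in steps
    ]


def toy_retarget(pose: Pose6D, scale: float = 0.8) -> List[float]:
    """Hypothetical stand-in for stage 2: shrink positions into a smaller
    workspace and pass the rotation through unchanged. The real system
    would return target-robot actions produced by tracking the pose in
    simulation."""
    x, y, z, roll, pitch, yaw = pose
    return [x * scale, y * scale, z * scale, roll, pitch, yaw]


if __name__ == "__main__":
    steps = [
        SourceStep("pick up the toy", ["head.png", "wrist.png"],
                   (0.5, 0.0, 1.0, 0.0, 0.0, 1.57)),
    ]
    for sample in build_ppt_dataset(steps, toy_retarget):
        print(sample.instruction, sample.target_action)
```

The point of the sketch is the data flow: the instruction and images stay untouched, and only the action label is replaced, which is why the 6D trajectory can act as the shared interface between the two robot morphologies.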
TrajBooster: The First Whole-Body Humanoid Manipulation VLA Solution, Solving the Data Problem Across Embodiments (Code Fully Open-Sourced)
具身智能之心· 2025-09-18 00:03
Core Insights
- The article discusses the TrajBooster framework, which uses a trajectory-centric learning approach to enhance the capabilities of humanoid robots, enabling them to perform complex household tasks with minimal training data [2][40].

Group 1: Research Background and Challenges
- The development of humanoid robots faces two main challenges: maintaining dynamic balance while performing upper-body tasks, and the scarcity of the high-quality training data needed for effective VLA model training [3][4].
- Existing methods rely on expensive equipment and expert operators, resulting in limited datasets that do not adequately cover the diverse action spaces required for humanoid robots [4].

Group 2: TrajBooster Framework
- TrajBooster uses a three-step process: real trajectory extraction, retargeting in simulation, and two-stage fine-tuning. This converts extensive wheeled-robot data into effective training resources for bipedal robots [5][40].
- The framework significantly reduces the dependency on costly data from robots of the same morphology, enabling zero-shot skill transfer and improving the robustness and generalization of VLA models [2][5].

Group 3: Methodology
- The framework begins by extracting real trajectories from the Agibot-World Beta dataset, which contains over 1 million real robot trajectories, and then maps this data to the Unitree G1 robot's operational space [7][9].
- A hierarchical composite model decouples control into upper-body and lower-body systems, enhancing the efficiency of whole-body manipulation [11][12].

Group 4: Experimental Results
- TrajBooster achieved the lowest position error (2.851 cm) and rotation error (6.231 degrees) in mobile scenarios, validating the advantages of hierarchical training and coordinated online DAgger [27].
- The framework's ability to adapt to unseen tasks was evidenced by its success in a "water transfer" task that was not included in the training data, showcasing improved generalization [39][40].

Group 5: Limitations and Future Directions
- The current implementation is limited by the precision of the Unitree Dex-3 hand, which only supports simple grasping tasks; future work will focus on integrating dexterous hands with tactile sensing for more complex manipulation [41].
- The visual input discrepancies still need to be addressed, and the framework needs to be extended to mobile manipulation data, as the current research focuses primarily on static tasks [43][44].
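
The position and rotation errors quoted under Group 4 are tracking errors between a commanded and an achieved end-effector pose. The article does not say how they were computed, so the following is only a sketch of one common convention: Euclidean distance for position (converted from metres to centimetres) and the geodesic angle between two rotation matrices for rotation. The function names and the choice of rotation matrices as input are assumptions made for illustration.

```python
import math
from typing import List, Sequence

Matrix3 = List[List[float]]


def position_error_cm(p_target: Sequence[float],
                      p_actual: Sequence[float]) -> float:
    """Euclidean distance between two positions given in metres, in cm."""
    return 100.0 * math.dist(p_target, p_actual)


def rotation_error_deg(R_target: Matrix3, R_actual: Matrix3) -> float:
    """Angle of the relative rotation R_target^T @ R_actual, in degrees.

    Uses trace(A^T B) = sum_ij A_ij * B_ij, so no explicit matrix
    product is needed, then angle = arccos((trace - 1) / 2).
    """
    trace = sum(R_target[i][j] * R_actual[i][j]
                for i in range(3) for j in range(3))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rot_z(deg: float) -> Matrix3:
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    print(position_error_cm((0.0, 0.0, 0.0), (0.03, 0.04, 0.0)))
    print(rotation_error_deg(rot_z(0.0), rot_z(30.0)))
```

Averaging these two quantities over every timestep of a retargeted trajectory would give per-trajectory figures in the same units as those reported, though the averaging scheme is likewise an assumption.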