Latest from Westlake University! RobustVLA: A Robustness-Aware Reinforcement Learning Post-Training Method for VLA Models (Outperforms SOTA Approaches)
具身智能之心·2025-11-08 04:00

Core Insights
- The article introduces RobustVLA, a lightweight online reinforcement learning post-training method aimed at making Vision-Language-Action (VLA) models robust to environmental uncertainty [1][5][20]
- It highlights a limitation of existing methods: they focus on reward maximization without addressing the model's sensitivity to disturbances, which can cause sharp performance drops in real-world scenarios [5][20]

Design Logic of RobustVLA
- RobustVLA adds two regularization terms to the post-training objective: a Jacobian regularizer that reduces sensitivity to observation noise, and a smoothness regularizer that stabilizes the policy under action disturbances [4][7][8] (a minimal sketch of such a combined loss appears after this digest)
- The method frames robustness-aware reinforcement learning post-training as a critical step toward improving the reliability of VLA models [1][5]

Robustness Analysis
- The article presents a theoretical analysis of robustness, establishing error amplification bounds, reward drift control, and guarantees of robust stability [4][11][18]
- It shows that Jacobian sensitivity directly governs error amplification, so reducing this sensitivity effectively constrains the performance loss under perturbation [12][18] (see the first-order bound sketched after this digest)

Experimental Results
- Under observation perturbations, RobustVLA reached an average success rate of 82.5%, outperforming prior models such as OpenVLA-OFT and RIPT-VLA [20][21]
- Under action perturbations, RobustVLA achieved an average success rate of 54.8%, exceeding OpenVLA-OFT's 53.5% [22]
- Under combined disturbances, the RobustVLA-C variant reached an average success rate of 82.1%, showcasing the synergy of autonomous interaction and the dual regularizers [23]

Transfer Learning and Ablation Studies
- Transfer learning experiments showed that RobustVLA improved out-of-distribution adaptability by 8.0% and 16.0% on specific tasks compared to zero-shot transfer [25]
- Ablation studies confirmed that removing either the Jacobian or the smoothness regularizer degrades performance, underscoring that both regularization strategies are necessary for robustness [27]
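The digest does not give RobustVLA's exact loss, but the idea of combining the two regularizers with an RL objective can be illustrated with a short PyTorch sketch. Everything here is an assumption for illustration: `policy`, the PPO surrogate, the weights `lambda_jac` / `lambda_smooth`, the Hutchinson-style Jacobian estimate, and the CAPS-style action-smoothness term are generic stand-ins, not the paper's formulation.

```python
import torch
import torch.nn as nn

# Minimal sketch of a robustness-aware post-training loss in the spirit of
# RobustVLA's two regularizers. Assumptions: a deterministic action head,
# observations flattened to (B, obs_dim) (a real VLA consumes images and
# language), and a precomputed scalar PPO surrogate to maximize.
def robust_post_training_loss(policy: nn.Module,
                              obs: torch.Tensor,           # (B, obs_dim)
                              prev_actions: torch.Tensor,  # (B, act_dim)
                              ppo_surrogate: torch.Tensor, # scalar RL objective
                              lambda_jac: float = 1e-2,
                              lambda_smooth: float = 1e-2) -> torch.Tensor:
    obs = obs.detach().requires_grad_(True)
    actions = policy(obs)  # (B, act_dim)

    # Jacobian regularization: stochastic estimate of ||d actions / d obs||_F^2
    # via a random projection u, so a single backward pass suffices.
    u = torch.randn_like(actions)
    u = u / (u.norm(dim=-1, keepdim=True) + 1e-8)
    (jac_vec,) = torch.autograd.grad((actions * u).sum(), obs, create_graph=True)
    jac_penalty = jac_vec.pow(2).sum(dim=-1).mean()

    # Smoothness regularization: keep consecutive actions close, damping the
    # effect of injected action disturbances (a CAPS-style stand-in).
    smooth_penalty = (actions - prev_actions).pow(2).sum(dim=-1).mean()

    # Maximize the RL surrogate while penalizing sensitivity and jerkiness.
    return -ppo_surrogate + lambda_jac * jac_penalty + lambda_smooth * smooth_penalty
```

In an actual PPO loop this scalar would replace the usual policy loss before `loss.backward()`; `create_graph=True` keeps the Jacobian penalty differentiable with respect to the policy parameters.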
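The error-amplification claim admits a standard first-order reading. The sketch below uses generic notation ($\pi_\theta$, $J_\pi$, $L_R$) and a plain Taylor argument, not the paper's exact theorem, constants, or assumptions.

```latex
% A Taylor expansion of the policy around a clean observation o gives
% \pi_\theta(o+\delta) = \pi_\theta(o) + J_\pi(o)\,\delta + O(\|\delta\|^2),
% so the action error is controlled by the Jacobian norm:
\[
  \|\pi_\theta(o+\delta) - \pi_\theta(o)\| \;\le\; \|J_\pi(o)\|\,\|\delta\| + O(\|\delta\|^2),
  \qquad J_\pi(o) = \frac{\partial \pi_\theta(o)}{\partial o}.
\]
% If the per-step reward is L_R-Lipschitz in the action, reward drift inherits the bound:
\[
  \bigl| r\bigl(s, \pi_\theta(o+\delta)\bigr) - r\bigl(s, \pi_\theta(o)\bigr) \bigr|
  \;\le\; L_R\,\|J_\pi(o)\|\,\|\delta\| + O(\|\delta\|^2).
\]
% Shrinking \|J_\pi\| (what the Jacobian regularizer does) tightens both bounds.
```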