IROS 2025 Challenge Winning Solution: X-VLA Is Open-Sourced, Setting New Performance Records Across Robot Benchmarks

Core Insights
- The article covers the release of X-VLA, an open-source embodied-intelligence model that achieves significant performance improvements with only 0.9 billion parameters [2][7].

Competition Highlights
- The AGIBOT World Challenge 2025 attracted 431 teams from 23 countries. Eleven teams competed in the final, held in Hangzhou, China, which focused on real-world physical tasks [4][5].

Performance Breakthroughs
- X-VLA achieved state-of-the-art (SOTA) performance across five authoritative simulation benchmarks and proved efficient and effective on long-duration tasks such as autonomous clothing folding [7][24].

Innovative Techniques
- The model uses a Soft-Prompt mechanism to adapt across different robotic platforms, and a multi-modal encoding strategy to allocate resources efficiently while preserving information integrity [16][21].
- A flow-matching generative action decoder improves the smoothness and robustness of action trajectories in uncertain environments [17].

Data Preprocessing and Training
- The model incorporates a balanced data sampling strategy and a rigorous data cleaning pipeline to ensure high-quality training data, which is crucial for learning meaningful behavioral knowledge [21][22].
- Training includes a customized post-training workflow that allows efficient adaptation to specific tasks using smaller datasets [23][26].

Real-World Testing
- X-VLA performed strongly on real robotic platforms, completing complex tasks such as unlimited-duration autonomous clothing folding and showing that it can handle intricate long-horizon tasks [27].
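
To make the flow-matching action decoder mentioned under Innovative Techniques more concrete, here is a minimal sketch of how flow-matching sampling works in general: an action starts as Gaussian noise and is moved toward a plausible action by integrating a velocity field over time t in [0, 1]. This is not X-VLA's implementation. The names `toy_velocity` and `sample_action` are hypothetical, and the velocity field is a hand-written stand-in that already knows the target action; in a real model a learned network would predict the velocity from observations and instructions.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For the straight-line path x_t = (1 - t) * noise + t * target,
    the velocity that carries x_t to the target at t = 1 is
    (target - x_t) / (1 - t). A real decoder would predict this
    from observations instead of being handed the target.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(target, steps=10, seed=0):
    """Generate an action by Euler-integrating the velocity field."""
    rng = random.Random(seed)
    # Start from Gaussian noise with the same dimension as the action.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    # Illustrative 3-dimensional "action" (values are made up).
    print(sample_action([0.2, -0.5, 1.0]))
```

Because each step moves the sample only a fraction of the way along the predicted direction, the resulting trajectory changes gradually rather than jumping, which is one intuition for why this family of decoders tends to produce smooth actions.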