Core Insights
- The article discusses the introduction of InternData-A1, a synthetic dataset that overcomes the limitations of traditional robot training data by providing high-fidelity, large-scale, and low-cost data for Vision-Language-Action (VLA) models [1][2][21].

Group 1: Need for Reconstructing Robot Pre-training Data Paradigm
- Current VLA model training faces a dilemma: real data is high fidelity but costly and limited in scale, while synthetic data lacks diversity and physical realism [2].
- InternData-A1 addresses this by combining high-quality synthetic data with a modular generation pipeline, ensuring scalability, diversity, and cost-effectiveness [2][21].

Group 2: Core Features of InternData-A1
- InternData-A1 encompasses a comprehensive robot interaction data system, covering 4 robot types, 70 tasks, and 227 scenes, with a total of 630,000 trajectories and 7,433 hours of interaction data [4][6].
- The dataset achieves high-fidelity simulation through optimized physics engines and visual rendering, minimizing the gap between simulation and real-world performance [6][21].
- A modular generation pipeline allows for low-cost, efficient data production, automating the processes of asset configuration, skill combination, domain randomization, and trajectory generation [8][9].

Group 3: Performance Comparison and Validation
- Models pre-trained on InternData-A1 demonstrate top-tier performance in both simulated and real-world tasks, matching or exceeding the performance of models trained on real datasets [10][14].
- In simulated tasks, the success rate reached 60% in Easy mode and 26.5% in Hard mode, outperforming traditional models [11][12].
- The dataset shows that 1,600 synthetic data points can match the performance of 200 real data points, significantly reducing data collection costs [20][21].

Group 4: Future Directions
- Future enhancements will focus on expanding task and robot type coverage, including high-precision dexterous tasks and more robot forms [19][20].
- The potential for synthetic data to replace real data in VLA model pre-training is emphasized, highlighting the importance of scalability, diversity, and fidelity in synthetic datasets [21][22].
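
The four pipeline stages named under Group 2 (asset configuration, skill combination, domain randomization, trajectory generation) can be pictured with a toy sketch. This is only an illustration of how such stages might chain together: every function name, parameter, range, and asset label below is a hypothetical placeholder, not the InternData-A1 code or API, and the "trajectory" is a list of labeled steps, not a physics rollout.

```python
import random
from dataclasses import dataclass, field


# Hypothetical container for one synthetic episode (not the dataset's real schema).
@dataclass
class Episode:
    robot: str
    scene: str
    skills: list
    lighting: float
    friction: float
    trajectory: list = field(default_factory=list)


def configure_assets(rng, robots, scenes):
    """Stage 1 (assumed): pick a robot and a scene for the episode."""
    return rng.choice(robots), rng.choice(scenes)


def compose_skills(rng, skill_library, n_skills):
    """Stage 2 (assumed): chain atomic skills into one task."""
    return [rng.choice(skill_library) for _ in range(n_skills)]


def randomize_domain(rng):
    """Stage 3 (assumed): sample visual and physical parameters."""
    return {
        "lighting": rng.uniform(0.5, 1.5),  # illustrative range
        "friction": rng.uniform(0.4, 1.0),  # illustrative range
    }


def generate_trajectory(skills, steps_per_skill):
    """Stage 4 (assumed): emit one placeholder step per skill per timestep."""
    return [(skill, t) for skill in skills for t in range(steps_per_skill)]


def generate_episode(seed, robots, scenes, skill_library,
                     n_skills=3, steps_per_skill=5):
    """Run the four stages in order; a fixed seed makes the episode reproducible."""
    rng = random.Random(seed)
    robot, scene = configure_assets(rng, robots, scenes)
    skills = compose_skills(rng, skill_library, n_skills)
    domain = randomize_domain(rng)
    trajectory = generate_trajectory(skills, steps_per_skill)
    return Episode(robot, scene, skills,
                   domain["lighting"], domain["friction"], trajectory)


if __name__ == "__main__":
    episode = generate_episode(
        seed=0,
        robots=["robot_a", "robot_b"],         # placeholder names
        scenes=["scene_1", "scene_2"],         # placeholder names
        skill_library=["pick", "place", "push"],
    )
    print(episode.robot, episode.scene, episode.skills, len(episode.trajectory))
```

Seeding each episode independently is one plausible way to get the low-cost scaling the summary describes, since episodes can then be generated in parallel and regenerated on demand; whether the actual pipeline works this way is not stated in the article.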
InternData-A1 Open-Sourced: Purely Synthetic Data Rivals Top-Tier Real Data, Matching the Official π0 Model
具身智能之心·2025-11-28 00:04