Views on GEN-0 and the Subsequent Development of VLA
具身智能之心· 2025-11-21 00:04
Core Insights
- The release of GEN-0 marks a significant advance in embodied intelligence, particularly in manipulation tasks, which have long been held back by data scarcity and the difficulty of generalization [1][2]
- GEN-0 was trained on a dataset of 270,000 hours (equivalent to approximately 31 years), and data collection continues at a rate of 10,000 hours per week; its pre-training effectiveness surpasses that of earlier models such as the Pi series [2][3]
- Despite these advances, GEN-0 has not reached a "GPT moment" or shown true zero-shot capability, which indicates that core challenges in the field remain open [2][3]

Data Collection and Utilization
- GEN-0's data collection strategy prioritizes data diversity and quality over sheer quantity, as evidenced by the scaling laws observed in the model's performance [10][13]
- The emergence of UMI (Universal Manipulation Interface) has challenged traditional simulation-based methods, highlighting the need for real-world data collection to achieve high success rates in manipulation tasks [5][7]
- The success rate of real-world data collection approaches 100%, while simulation methods face significant challenges, particularly in generating long-horizon data [8][9]

Model Training and Performance
- GEN-0's results suggest that larger models are needed to make effective use of very large datasets, since smaller models struggle to generalize when overloaded with data [11][12]
- Pre-training in GEN-0 focuses on learning to explore the action space rather than on generalization, indicating a shift in how models are trained to handle diverse tasks [12]
- The insights from GEN-0's pre-training process point to the need for a deeper understanding of data quality and diversity, both of which can significantly affect model performance [10][13]

Future Directions
- The findings from GEN-0 challenge existing paradigms in the field, suggesting that new engineering efforts and problem-solving approaches are required to advance embodied intelligence [15]
- The industry is expected to shift towards larger model infrastructure and to focus on co-training methodologies to enhance model capabilities [11][14]
- The ongoing development of data collection environments and pre-training methodologies will likely shape the future landscape of embodied intelligence research [15][16]
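
As a quick sanity check on the dataset figures quoted under Core Insights, the short sketch below converts 270,000 hours into years and estimates how long the stated collection rate of 10,000 hours per week would take to add the same amount again. The helper names are hypothetical, and the projection assumes the weekly rate stays constant, which the source does not state.

```python
# Back-of-the-envelope check of the dataset figures quoted in the summary.
# Assumption: "years" means continuous 24-hour days in a 365-day year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def hours_to_years(hours: float) -> float:
    """Convert a duration in hours to continuous years."""
    return hours / HOURS_PER_YEAR


def weeks_to_collect(hours: float, weekly_rate: float) -> float:
    """Weeks needed to collect `hours` of data at a constant weekly rate."""
    return hours / weekly_rate


dataset_hours = 270_000  # dataset size stated in the summary
weekly_rate = 10_000     # collection rate stated in the summary (hours/week)

years = hours_to_years(dataset_hours)
weeks_to_double = weeks_to_collect(dataset_hours, weekly_rate)

print(f"{dataset_hours:,} hours is about {years:.1f} years of continuous data")
print(f"Adding another {dataset_hours:,} hours takes about {weeks_to_double:.0f} weeks")
```

The result, about 30.8 years, is consistent with the "approximately 31 years" figure in the summary, and at a constant rate the dataset would double in roughly 27 weeks.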