MeanFlow scores another win: Peking University proposes MP1, a new paradigm for robot learning achieving SOTA in both speed and success rate
机器之心 · 2025-07-24 09:33
Core Viewpoint
- The article introduces MP1, a new robotic learning framework that markedly improves the efficiency and effectiveness of robotic manipulation by tackling two challenges: action generation and few-shot generalization [4][11].

Group 1: Action Generation Models
- In current VLA models, the quality and speed of action generation are set by the action generation model, which faces a fundamental trade-off between inference speed and task success rate [2].
- Diffusion models produce high-quality action sequences through multi-step iteration but are slow, while flow-based models offer faster inference at the cost of added complexity and a potential performance ceiling [2][3].

Group 2: MP1 Framework
- MP1 brings the MeanFlow paradigm from image generation into robotic learning, achieving millisecond-level inference speed and laying a foundation for VLA action generation models [4].
- Its core innovation is a shift in generation paradigm: the network learns an interval-averaged velocity field instead of an instantaneous velocity field, eliminating the need for time-consuming iterative solving [8].

Group 3: Few-Shot Generalization
- MP1 counters feature collapse in robotic learning by introducing Dispersive Loss, which regularizes the policy network's internal representation space and sharpens its ability to distinguish between different states [11][12].
- This loss function operates only during training and markedly improves the model's ability to learn from a minimal number of demonstrations, which is crucial when data collection is expensive [12].

Group 4: Performance Testing
- MP1 was validated in simulation across 37 complex tasks, showing higher task success rates and greater stability than existing models [15][16].
- Its average success rate reached 78.9%, outperforming FlowPolicy and DP3 by 7.3% and 10.2% respectively, with the largest gains on the more difficult tasks [17].

Group 5: Inference Efficiency
- MP1 averaged only 6.8 ms of inference time on an NVIDIA RTX 4090 GPU, nearly twice as fast as FlowPolicy and 19 times faster than DP3, meeting real-time control frequency requirements in robotics [18][19].

Group 6: Few-Shot Learning Validation
- Experiments confirmed that MP1 consistently outperformed FlowPolicy across all data scales, especially in extreme few-shot scenarios with only 2-5 demonstrations, validating the effectiveness of Dispersive Loss for generalization [21].

Group 7: Real-World Validation
- In real-world tests, MP1 achieved the highest success rates and shortest task completion times across five tasks, including a 90% success rate on the "Hummer" task, significantly surpassing FlowPolicy and DP3 [23].
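The one-step generation idea behind MP1's MeanFlow paradigm can be sketched as follows: a conventional flow model integrates an instantaneous velocity field over many solver steps, while a MeanFlow-style model predicts the average velocity over the whole interval and jumps from noise to action in a single network call. The two velocity fields below are toy closed-form stand-ins for illustration, not MP1's actual policy networks.

```python
import numpy as np

# Toy stand-ins for the learned velocity networks (illustrative only).
def inst_v(x, t):      # instantaneous velocity field (standard flow matching)
    return -x

def mean_v(x, r, t):   # average velocity over the interval [r, t] (MeanFlow)
    return -0.5 * x

def flow_sample(x0, steps=10):
    """Multi-step Euler integration of the instantaneous field (slow path)."""
    x, dt = x0.copy(), 1.0 / steps
    for i in range(steps):
        x = x + dt * inst_v(x, i * dt)
    return x

def meanflow_sample(x0):
    """One forward pass: x1 = x0 + (1 - 0) * u(x0, r=0, t=1)."""
    return x0 + mean_v(x0, 0.0, 1.0)

noise = np.random.randn(4, 8)     # noise sample -> action chunk
action = meanflow_sample(noise)   # a single network call, no ODE solver loop
```

The speed gap reported in Group 5 comes from exactly this difference: `flow_sample` costs one network evaluation per solver step, while `meanflow_sample` always costs one.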
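The Dispersive Loss used for few-shot generalization can also be illustrated with a small sketch. The article does not specify the exact variant MP1 adopts; below is an InfoNCE-style formulation common in the dispersive-representation literature, which pushes apart all pairs of intermediate features within a batch without requiring positive pairs. The function name and temperature value are assumptions for illustration.

```python
import numpy as np

def dispersive_loss(z, tau=0.5):
    """InfoNCE-style dispersive regularizer (sketch; exact variant assumed).

    Encourages the batch of intermediate features z, shape (batch, dim),
    to spread out, counteracting feature collapse. It is added only to the
    training objective, so inference cost is unchanged.
    """
    diff = z[:, None, :] - z[None, :, :]       # (batch, batch, dim) pairwise diffs
    sq_dist = (diff ** 2).sum(-1)              # pairwise squared distances
    return np.log(np.exp(-sq_dist / tau).mean())

collapsed = np.zeros((8, 4))     # identical features -> loss at its maximum (0)
spread = np.random.randn(8, 4)   # dispersed features -> strictly lower loss
```

Minimizing this term drives representations apart (more negative loss), which is how it counteracts the feature collapse described above; being training-only, it leaves the millisecond-level inference path untouched.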