NeurIPS 2025 | CMU, Tsinghua, and UT Austin Open-Source ReinFlow: Fine-Tuning Robot Flow-Matching Policies with Online RL
机器之心·2025-10-20 09:15

Core Insights - The article discusses ReinFlow, an online reinforcement learning framework for fine-tuning flow matching policies; the work has been accepted at NeurIPS 2025 and is open-sourced with comprehensive documentation [2][5][27].

Group 1: ReinFlow Overview
- ReinFlow is a general framework applicable to any policy defined by an ordinary differential equation, such as Rectified Flow and Shortcut Models, and supports inference with very few denoising steps [12].
- The framework reduces training time by over 60% compared to DPPO while maintaining comparable performance [14][16].

Group 2: Algorithm Characteristics
- ReinFlow applies the policy gradient theorem by converting the deterministic flow into a discrete-time Markov process, optimizing over the entire flow matching chain [5][7].
- The algorithm injects a small amount of learnable noise into the deterministic path of the flow policy, yielding a stochastic process that enhances exploration while bounding deviation from the pre-trained policy [8][10].

Group 3: Performance Metrics
- In D4RL locomotion tasks, ReinFlow-fine-tuned Rectified Flow policies achieved an average net performance gain of 135.36% while cutting fine-tuning wall-clock time by 82.63% [16].
- In long-horizon manipulation tasks, ReinFlow-fine-tuned Shortcut Model policies improved success rates by an average of 40.34% with fewer denoising steps, saving an average of 23.20% in training time [18].

Group 4: Experimental Validation
- Ablation studies assessed the impact of various factors on training outcomes, demonstrating that reinforcement learning fine-tuning can further enhance performance beyond mere data augmentation [24].
- The framework has been validated across multiple benchmark tasks, showing significant performance improvements over pre-trained models [14].
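The noise-injection mechanism described in Group 2 can be sketched as follows. This is a minimal illustration, not ReinFlow's actual implementation: `velocity` and `noise_std` are hypothetical stand-ins for the pre-trained velocity network and the learnable noise head, and the step count and noise bounds are made-up values. The point it shows is that adding Gaussian noise to each Euler step of the flow ODE turns the deterministic denoising path into a discrete-time Markov chain whose per-step log-probability is tractable, which is what a policy gradient needs.

```python
import numpy as np

def velocity(a, t):
    # Hypothetical stand-in for the pre-trained flow-matching
    # velocity field v_theta(a, t); a real policy is a neural network.
    return -a * (1.0 - t)

def noise_std(a, t, sigma_min=0.05, sigma_max=0.2):
    # Hypothetical stand-in for the learnable noise head sigma_theta(a, t).
    # Bounding the std keeps the stochastic chain close to the
    # deterministic pre-trained path, limiting policy deviation.
    return float(np.clip(0.1 * np.abs(a).mean() + sigma_min, sigma_min, sigma_max))

def sample_action(a0, num_steps=4, rng=None):
    """Integrate the flow ODE with injected Gaussian noise.

    Each Euler step becomes a Gaussian transition, so the whole denoising
    chain is a discrete-time Markov process with a computable
    log-probability, enabling policy-gradient fine-tuning over the chain.
    """
    rng = rng or np.random.default_rng()
    a, dt, logp = np.asarray(a0, dtype=float), 1.0 / num_steps, 0.0
    for k in range(num_steps):
        t = k * dt
        mean = a + velocity(a, t) * dt          # deterministic Euler step
        std = noise_std(a, t)                   # learnable noise scale
        a = mean + std * rng.standard_normal(a.shape)
        # Gaussian log-density of this transition, accumulated over steps.
        logp += -0.5 * np.sum(((a - mean) / std) ** 2 + np.log(2 * np.pi * std**2))
    return a, logp
```

With `num_steps=4` the sketch mirrors the few-step inference regime the article emphasizes: the summed `logp` is what a policy-gradient objective would weight by the task reward during fine-tuning.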
Group 5: Open Source and Future Directions
- ReinFlow's GitHub project is fully open-sourced and actively maintained, providing a complete codebase, model checkpoints, and detailed documentation for community engagement [27].
- Future updates will include support for various flow models, classic RL environments, and comprehensive guides for installation and usage [29].
