RLinf, approaching 2k stars, ships yet another update! Now with real-robot reinforcement learning: use your robot the way you use a GPU
具身智能之心· 2025-12-26 03:38
Core Insights
- The article covers advances in the RLinf framework, in particular the release of RLinf v0.2, which supports real-robot reinforcement learning and aims to strengthen embodied intelligence systems [3][5]

Group 1: RLinf v0.2 Features
- RLinf v0.2 lets users treat robots as flexible resources, much like GPUs: a worker can be deployed on a robot simply by specifying its IP and port [3][6]
- The framework supports heterogeneous software and hardware cluster configurations, accommodating the diverse requirements of real-robot reinforcement learning [8][10]
- RLinf v0.2 introduces a fully asynchronous off-policy algorithm design that decouples inference nodes from training nodes, significantly improving training efficiency [11][14]

Group 2: Experimental Results
- The initial v0.2 release was tested with a Franka robotic arm on two tasks, Charger and Peg Insertion, both converging within 1.5 hours [12][15]
- After training, Peg Insertion achieved over 100 consecutive successes and Charger over 50 [15][18]
- Training videos document two Franka arms operating simultaneously in different locations [16][23]

Group 3: Development Philosophy
- The RLinf team emphasizes the co-evolution of algorithms and infrastructure, aiming to build a new research ecosystem for embodied intelligence [20]
- The team draws members from several institutions, including Tsinghua University and Peking University, with backgrounds spanning infrastructure, algorithms, and robotics [20]
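The fully asynchronous off-policy design described above, where rollout (inference) and training are decoupled through a shared buffer, can be sketched generically in Python. Everything below (class names, the simulated rollout worker, the transition format) is an illustrative assumption, not RLinf's actual API.

```python
import random
import threading

# Generic sketch of an asynchronous off-policy setup: a rollout
# (inference) thread pushes transitions into a shared replay buffer
# while the trainer samples from it independently. All names here
# are illustrative assumptions, not RLinf's actual API.

class ReplayBuffer:
    """Thread-safe FIFO buffer shared by collection and training."""

    def __init__(self, capacity=1000):
        self._data = []
        self._capacity = capacity
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)
            if len(self._data) > self._capacity:
                self._data.pop(0)  # drop the oldest transition

    def sample(self, k):
        with self._lock:
            return random.sample(self._data, min(k, len(self._data)))

    def __len__(self):
        with self._lock:
            return len(self._data)

def rollout_worker(buf, n_steps):
    # Stands in for a policy driving a real robot and streaming
    # transitions back; here it just emits dummy transitions.
    for step in range(n_steps):
        buf.add({"obs": step, "action": step % 2, "reward": 1.0})

buf = ReplayBuffer()
collector = threading.Thread(target=rollout_worker, args=(buf, 50))
collector.start()          # collection runs independently of training
collector.join()
batch = buf.sample(8)      # trainer samples off-policy from the buffer
```

Because the trainer only ever touches the buffer, slow real-robot data collection never blocks a gradient step, which is the efficiency gain the decoupled design targets.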
Fully heterogeneous, fully asynchronous RLinf v0.2 preview released: supports real-robot reinforcement learning, use your robot like you use a GPU!
机器之心· 2025-12-26 03:06
Core Insights
- The article discusses the ongoing data debate in embodied intelligence, particularly between simulation data and real-robot data, and stresses the need for infrastructure that supports diverse technological explorations [2]

Summary by Sections

RLinf v0.2 Features
- RLinf v0.2 is designed for users pursuing the real-robot route, supporting real-robot reinforcement learning [2][4]
- Users can treat robots as flexible resources, much like GPUs, with integration and configuration handled through a YAML file, significantly lowering the cost of use [5][6]
- The system aims to enable large-scale distributed real-robot reinforcement learning, addressing challenges in stability, usability, and flexibility [9]

Heterogeneous Hardware Support
- RLinf supports flexible configurations of heterogeneous software and hardware clusters, improving system throughput and training efficiency [11][12]
- It can combine varied hardware: high-fidelity simulators on RTX 4090 GPUs, training on large-memory GPUs such as the A800, and robot controllers on CPU-only machines [13][14]

Asynchronous Off-Policy Algorithms
- RLinf v0.2 introduces a fully asynchronous design that decouples inference nodes from training nodes, significantly improving training efficiency [16]
- It incorporates typical off-policy reinforcement learning algorithms, improving data utilization by leveraging both online and offline data [16]

Experimental Results
- The initial release focuses on real-robot reinforcement learning with small models, using a Franka robotic arm on two quick validation tasks: Charger and Peg Insertion [19][21]
- Training includes human-in-the-loop interventions to improve efficiency, with successful results documented in training videos [21][22]

Community Engagement
- The RLinf team thanks its community of 2,000 users, whose feedback has driven continuous improvements and feature updates since the initial release [22]
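The YAML-driven heterogeneous cluster described above might look roughly like the following. Every key name, replica count, and address below is an invented placeholder for illustration; this does not reflect RLinf's actual configuration schema.

```yaml
# Hypothetical cluster layout: simulators on consumer GPUs, training
# on large-memory GPUs, robot controllers on CPU machines, and real
# robots addressed by IP and port like any other resource.
cluster:
  simulator:
    gpu: rtx4090
    replicas: 4
  trainer:
    gpu: a800
    replicas: 2
  robot_controller:
    device: cpu
    robots:
      - name: franka_left
        ip: 192.168.1.10
        port: 5555
      - name: franka_right
        ip: 10.0.0.7
        port: 5555
```

The point of such a layout is that each role runs on the hardware best suited to it, while robots are registered as addressable endpoints, which is what "use your robot like a GPU" amounts to in practice.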