RLinf - filings, earnings calls, financial reports, news - Reportify

RLinf

RLinf, at nearly 2k stars, ships yet another update! It now supports real-world reinforcement learning: use your robots the way you use GPUs

具身智能之心· 2025-12-26 03:38

Core Insights
- The article discusses advances in the RLinf framework, in particular the release of RLinf v0.2, which supports real-world reinforcement learning and aims to strengthen embodied intelligence systems [3][5].

Group 1: RLinf v0.2 Features
- RLinf v0.2 lets users treat robots as flexible resources, much like GPUs: a worker can be deployed on a robot by supplying only the robot's IP address and port [3][6].
- The framework supports heterogeneous software and hardware cluster configurations, accommodating the diverse requirements of real-world reinforcement learning [8][10].
- RLinf v0.2 introduces a fully asynchronous off-policy algorithm design that decouples inference and training nodes, significantly improving training efficiency [11][14].

Group 2: Experimental Results
- The initial version of RLinf v0.2 was tested with a Franka robotic arm on two tasks, Charger and Peg Insertion, and converged within 1.5 hours on both [12][15].
- After training, Peg Insertion achieved over 100 consecutive successes and Charger over 50 consecutive successes [15][18].
- The training process was recorded on video, showing two Franka robotic arms in different locations operating simultaneously [16][23].

Group 3: Development Philosophy
- The RLinf team emphasizes the co-evolution of algorithms and infrastructure, aiming to create a new research ecosystem for embodied intelligence [20].
- The team draws its members from several institutions, including Tsinghua University and Peking University, with backgrounds spanning infrastructure, algorithms, and robotics [20].
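The summary above describes a fully asynchronous off-policy design in which inference (rollout) and training are decoupled. A minimal sketch of that general pattern follows; it is not RLinf's implementation or API. The names `ReplayBuffer`, `rollout_worker`, `training_worker`, and `run_async` are hypothetical, threads stand in for separate nodes, and the "update" is a placeholder that only averages rewards.

```python
import random
import threading
import time
from collections import deque


class ReplayBuffer:
    """Thread-safe buffer shared by the rollout side and the training side."""

    def __init__(self, capacity):
        self._data = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)

    def sample(self, batch_size, rng):
        # Returns None until enough transitions have been collected.
        with self._lock:
            if len(self._data) < batch_size:
                return None
            return rng.sample(list(self._data), batch_size)

    def __len__(self):
        with self._lock:
            return len(self._data)


def rollout_worker(buffer, num_steps):
    """Hypothetical robot-side worker: collects data without waiting for training."""
    rng = random.Random(0)
    for step in range(num_steps):
        buffer.add((step, rng.random()))  # (step index, fake reward)


def training_worker(buffer, num_updates, batch_size, stats):
    """Hypothetical learner: trains on whatever is in the buffer (off-policy)."""
    rng = random.Random(1)
    while stats["updates"] < num_updates:
        batch = buffer.sample(batch_size, rng)
        if batch is None:
            time.sleep(0.001)  # buffer still warming up
            continue
        # Placeholder for a gradient step: record the batch's mean reward.
        stats["mean_reward"] = sum(r for _, r in batch) / batch_size
        stats["updates"] += 1


def run_async(num_steps=200, num_updates=50, batch_size=8, capacity=1000):
    if min(num_steps, capacity) < batch_size:
        raise ValueError("the learner would never receive a full batch")
    buffer = ReplayBuffer(capacity)
    stats = {"updates": 0, "mean_reward": None}
    threads = [
        threading.Thread(target=rollout_worker, args=(buffer, num_steps)),
        threading.Thread(target=training_worker,
                         args=(buffer, num_updates, batch_size, stats)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffer, stats
```

Because the learner samples from a buffer rather than waiting for a fresh rollout from the current policy, neither side blocks the other; that is the property the off-policy formulation makes possible.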

Real-world reinforcement learning

Large-scale distributed real-world reinforcement learning training paradigm

Macro-to-Micro Flow (M2Flow)

Franka robotic arm

This year has probably produced n VLA+RL papers, hasn't it?!

具身智能之心· 2025-12-22 10:23

Core Insights
- The article emphasizes integrating Reinforcement Learning (RL) with Vision-Language-Action (VLA) models to improve their generalization, particularly in out-of-distribution (OOD) scenarios, where performance improvements can reach up to 42.6% [2].

Group 1: Research Directions
- The article suggests that future research should focus on combining VLA and RL, and encourages readers to consult research assistants for guidance on starting projects in these areas [3].
- Several notable recent VLA+RL works are highlighted as significant advances in the field [5][10].

Group 2: Notable Papers and Projects
- A list of representative papers from the last two years is provided, including "NORA-1.5" and "Balancing Signal and Variance," which address various aspects of VLA and RL integration [5][10].
- Links to project homepages and paper PDFs are shared for further exploration of these works [6][9][12].

Group 3: Tools and Frameworks
- The article mentions tools such as RLinf, which supports a growing number of VLA+RL methods, indicating a trend toward more robust and versatile research tooling [2][11].

Reinforcement learning (RL)

Vision-language-action models (VLA)

VLA+RL keeps raising the ceiling of embodied manipulation!

具身智能之心· 2025-11-11 00:02

Core Insights
- The article discusses integrating Reinforcement Learning (RL) with Vision-Language-Action (VLA) models, highlighting how RL strengthens VLA by bridging the gap between pre-training and real-world tasks [1][4].

Group 1: Technical Developments
- RL training directly optimizes the "complete the task" objective, allowing models to handle unexpected situations absent from the training data and thus improving robustness [1].
- The reward mechanism enables VLA models to learn smoother trajectories and align more closely with the physical world [1].
- A recommended open-source repository for VLA+RL methods is provided, lowering the barrier to entry-level research [2].

Group 2: Evaluation Results
- Evaluation results on various LIBERO task groups are reported for different models, with the π0.5 model achieving an average accuracy of 96.9% across tasks [5].
- The Flow-SDE π0 model showed a 38.5% improvement in average accuracy when combined with RL [5].

Group 3: Community and Resources
- The community offers ongoing live sharing sessions, including roundtable forums and discussions on various topics within the embodied intelligence industry [7].
- A comprehensive technical roadmap is available for beginners, outlining essential technologies and learning paths [9].
- The community has established job referral mechanisms with several companies in the embodied intelligence sector, providing networking opportunities [13].

Group 4: Educational Materials
- The community has compiled over 40 open-source projects and nearly 60 datasets related to embodied intelligence, along with mainstream simulation platforms and various technical learning routes [15].
- Specific learning routes for different aspects of embodied intelligence, such as reinforcement learning and multi-modal large models, are detailed to assist learners at various levels [16][42].

Group 5: Industry Insights
- The community includes members from renowned universities and leading companies in the field, fostering academic and industrial exchange [14].
- Regular updates on academic progress and industrial applications in embodied intelligence are shared, keeping members informed of the latest developments [21][23].

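[Industrial Internet Weekly] "Statement of the Council of Heads of State of the Shanghai Cooperation Organisation Member States on Further Deepening International Cooperation on Artificial Intelligence" released; MIIT: software business revenue reached 8,324.6 billion yuan in the first seven months, a year-on-year increase of 12....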

Tai Mei Ti APP · 2025-09-08 02:52

Domestic News
- Meituan officially released and open-sourced LongCat-Flash-Chat, which uses an innovative Mixture-of-Experts architecture with 560 billion total parameters and between 18.6 billion and 31.3 billion activated parameters [2]
- Tsinghua University and other institutions open-sourced RLinf, a large-scale reinforcement learning framework for embodied intelligence, achieving over 120% system speedup compared to other frameworks [3]
- A batch of national standards covering AI-generated content identification and safety measures for electric bicycles will be implemented starting September 1, aimed at promoting the healthy development of emerging industries [4]
- Beijing Data Group is expected to be officially listed soon, with a registered capital of 3 billion yuan, focusing on big data services and AI public service platform technology consulting [5][6]
- Sanwei Xinan is actively laying out Web3.0 applications, focusing on stablecoins and RWA, and has become a vice-chairman unit of the Hong Kong Web3.0 Standardization Association [7]
- Alibaba launched the AgentScope 1.0 framework for multi-agent development, providing a comprehensive solution for the entire lifecycle of intelligent applications [8]
- Tencent announced the open-sourcing of the Youtu-Agent framework, which requires no additional model training and is built entirely on the open-source ecosystem [9]
- Tencent released the HunyuanWorld-Voyager model, the first to support native 3D reconstruction for virtual reality and gaming applications [10]
- Digital China is expanding into the robotics industry and has formed partnerships with leading companies in the field [11]
- ByteDance plans to issue stock options to its Seed department, which focuses on large model technology personnel [12]
- Douyin established a new company focused on AI applications in healthcare, with a registered capital of 100,000 yuan [13]

Financing and Mergers
- Obita completed an angel financing round of over 10 million USD, with the funds aimed at core system development and market expansion [25]
- New Unisplendour Group established a high-tech company with a registered capital of 10 million yuan, focusing on integrated circuit design and sales [26]
- Ant Group's subsidiary invested in Xinyuan Semiconductor, increasing its registered capital from approximately 46.5 million yuan to about 50.3 million yuan [27]
- AI company Anthropic raised 13 billion USD in a new funding round, increasing its valuation to 183 billion USD [28]
- OpenAI agreed to acquire product testing startup Statsig for 1.1 billion USD in stock, marking one of its largest acquisitions [29]

Policies and Trends
- The National Development and Reform Commission plans to issue "AI vouchers" to promote the use of intelligent terminals and reduce R&D costs [33]
- The Ministry of Industry and Information Technology announced plans to support the development of high-performance AI training and inference chips [44]
- The Ministry of Industry and Information Technology reported that software business revenue reached 8,324.6 billion yuan (83,246 亿元) in the first seven months, a year-on-year increase of 12.3% [41]
- The Ministry of Industry and Information Technology indicated that internet enterprises achieved a total profit of 93.88 billion yuan in the first seven months, a year-on-year decrease of 1.8% [42]
- Shanghai is organizing applications for the 2025 "AI+" action project, focusing on enhancing AI capabilities and promoting industry development [45]
- The National Standards Committee plans to revise over 4,000 national standards in fields such as AI and IoT [46]

Software and Services

LongCat-Flash-Chat

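RLinf, the first large-scale reinforcement learning framework for embodied intelligence, is open-sourced; built by Infinigence AI (无问芯穹) together with Tsinghua and other institutions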

Bei Jing Shang Bao· 2025-09-01 05:05

Core Insights
- The company announced the launch of RLinf, a large-scale reinforcement learning framework for embodied intelligence, developed in collaboration with Tsinghua University, Beijing Zhongguancun College, Peking University, and the University of California, Berkeley [1]

Group 1: Framework Details
- RLinf stands for "reinforcement learning infrastructure," and "inf" also signifies "infinite" scalability; the framework addresses limitations of current frameworks in supporting embodied intelligence [1]
- Embodied intelligence integrates "perception" and "action" and requires a balance between reasoning (the "brain") and execution (the "cerebellum"), placing higher demands on computing power, memory, and framework flexibility than pure inference models do [1]

Group 2: Technical Structure
- RLinf is designed with six layers (user, task, execution, scheduling, communication, and hardware) that specifically target technical challenges in the field [1]

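RLinf, the first large-scale reinforcement learning framework for embodied intelligence, is open-sourced; built by Infinigence AI (无问芯穹) together with Tsinghua and other institutions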

Bei Jing Shang Bao· 2025-09-01 04:49

Core Insights
- The company announced the launch of RLinf, a large-scale reinforcement learning framework for embodied intelligence, developed in collaboration with several prestigious institutions [1]

Group 1: Framework Overview
- RLinf stands for "reinforcement learning infrastructure," and "inf" also symbolizes "infinite" scalability; the framework addresses limitations of current frameworks in supporting embodied intelligence [1]
- Embodied intelligence integrates "perception" and "action" and requires a balance between reasoning (the "brain") and execution (the "cerebellum"), which places higher demands on computing power, memory, and framework flexibility than pure inference models do [1]

Group 2: Technical Design
- RLinf is designed with six layers (user, task, execution, scheduling, communication, and hardware) that specifically target technical challenges in the field [1]

Artificial Intelligence

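RLinf is open source! The first large-scale reinforcement learning framework for embodied intelligence with integrated rendering, training, and inference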

具身智能之心· 2025-09-01 04:02

Core Viewpoint
- The article discusses the launch of RLinf, a large-scale reinforcement learning framework for embodied intelligence, highlighting its design and its role in AI's transition from perception to action [2][5].

Group 1: Framework Overview
- RLinf is a flexible and scalable framework designed for embodied intelligence, integrating various components to optimize performance [5].
- The "inf" in the framework's name signifies both "infrastructure" and "infinite" scaling, emphasizing its adaptable system design [7].
- RLinf features a hybrid execution model that achieves over 120% system speedup compared to traditional frameworks, with VLA model performance improvements of 40%-60% [7][12].

Group 2: Execution Modes
- RLinf supports three execution modes (Collocated, Disaggregated, and Hybrid), allowing users to configure components based on their needs [17][15].
- The hybrid mode combines the advantages of shared and separated execution, minimizing system idle time and improving efficiency [12][15].

Group 3: Communication and Scheduling
- The framework includes an adaptive communication library designed for reinforcement learning, optimizing data exchange between components [19][22].
- RLinf features an automated scheduling module that minimizes resource idleness and dynamically adjusts to user training flows, achieving rapid scaling [23][24].

Group 4: Performance Metrics
- RLinf has demonstrated significant improvements in embodied intelligence tasks, achieving success rates of 80%-90% in specific scenarios, compared to 30%-50% for previous models [24][26].
- The framework has also achieved state-of-the-art (SOTA) performance in mathematical reasoning tasks across multiple datasets, showing its versatility [29][30].

Group 5: Documentation and Community Engagement
- Comprehensive documentation and API support are provided to improve user experience and aid understanding of the framework [32][34].
- The RLinf team encourages collaboration and invites users to explore the framework, noting ongoing recruitment for various research and engineering positions [33][34].
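The three execution modes named in the summary above can be illustrated with a small placement sketch. This is an illustration under stated assumptions, not RLinf's configuration format or API: the component names (`simulator`, `rollout`, `trainer`), the `Mode` enum, the `place` function, and in particular the choice of which components share GPUs in hybrid mode are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    COLLOCATED = "collocated"        # every component shares all GPUs
    DISAGGREGATED = "disaggregated"  # every component gets its own GPUs
    HYBRID = "hybrid"                # some components share, others are split


COMPONENTS = ("simulator", "rollout", "trainer")  # hypothetical component names


def place(mode, num_gpus):
    """Return a mapping from component name to the GPU ids it may use."""
    gpus = list(range(num_gpus))
    if mode is Mode.COLLOCATED:
        # All components take turns on the full set of GPUs.
        return {name: list(gpus) for name in COMPONENTS}
    if mode is Mode.DISAGGREGATED:
        if num_gpus < len(COMPONENTS):
            raise ValueError("need at least one GPU per component")
        share = num_gpus // len(COMPONENTS)
        return {
            "simulator": gpus[:share],
            "rollout": gpus[share:2 * share],
            "trainer": gpus[2 * share:],  # trainer absorbs any remainder
        }
    if mode is Mode.HYBRID:
        # Assumption for illustration: simulator and rollout run side by side
        # on disjoint halves, while the trainer is collocated with both.
        if num_gpus < 2:
            raise ValueError("hybrid placement needs at least two GPUs")
        half = num_gpus // 2
        return {
            "simulator": gpus[:half],
            "rollout": gpus[half:],
            "trainer": list(gpus),
        }
    raise ValueError(f"unknown mode: {mode}")
```

For example, `place(Mode.HYBRID, 8)` assigns GPUs 0-3 to the simulator, 4-7 to rollout, and all eight to the trainer, which shows how a hybrid layout can mix sharing and separation within one job.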

Integrated rendering, training, and inference

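RLinf, the first large-scale reinforcement learning framework built for embodied intelligence! Open-sourced by Tsinghua, Beijing Zhongguancun College, Infinigence AI (无问芯穹), and others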

机器之心· 2025-09-01 02:49

Core Viewpoint
- The article discusses the launch of RLinf, a large-scale reinforcement learning framework designed for embodied intelligence, emphasizing its flexible and scalable architecture that integrates training, rendering, and inference [5][7].

Group 1: Development of RL Framework
- The shift in artificial intelligence from "perception" to "action" highlights the importance of embodied intelligence, which is gaining attention in both academia and industry [2][4].
- RLinf is developed jointly by Tsinghua University, Beijing Zhongguancun College, and Infinigence AI (无问芯穹), aiming to address the limitations of existing frameworks in supporting embodied intelligence [5][7].

Group 2: Features of RLinf
- RLinf's architecture consists of six layers (user, task, execution, scheduling, communication, and hardware), allowing for a hybrid execution mode that achieves over 120% system speedup [7][12].
- The framework introduces a Macro-to-Micro Flow (M2Flow) mechanism, enabling flexible construction of training processes while keeping programming flexible and debugging easy [14][15].

Group 3: Execution Modes
- RLinf supports three execution modes (Collocated, Disaggregated, and Hybrid), allowing users to configure components for optimal resource utilization [19][20].
- The framework integrates low-intrusion multi-backend solutions to meet the diverse needs of researchers in the embodied intelligence field [16][20].

Group 4: Communication and Scheduling
- RLinf features an adaptive communication library designed for reinforcement learning, optimizing data exchange between components to improve system efficiency [22][28].
- An automated scheduling module minimizes resource idling by analyzing component performance and selecting the best execution mode, significantly improving training stability [24][25].

Group 5: Performance Metrics
- RLinf demonstrates superior performance in embodied intelligence tasks, achieving over 120% efficiency improvement compared to existing frameworks in specific tests [27][33].
- The framework has shown significant success rate improvements across tasks, with models reaching success rates of up to 97.3% in specific scenarios [31][35].

Group 6: Future Development and Community Engagement
- The RLinf team emphasizes open-source principles, providing comprehensive documentation and support to improve user experience and ease collaboration [40][41].
- The team is actively recruiting for various positions to further develop and maintain the RLinf framework, and invites community engagement and feedback [42][43].