RLinf
VLA+RL keeps raising the ceiling for embodied manipulation!
具身智能之心· 2025-11-11 00:02
Through standardized interfaces, RLinf supports mainstream VLA models and both CPU-based and GPU-based simulators, and it is the first to implement reinforcement learning fine-tuning for the π0 and π0.5 model series. Stars and follows are welcome.

Evaluation results on the four LIBERO task groups

| Model | Spatial | Object | Goal | Long | Avg. | Δ Avg. |
| --- | --- | --- | --- | --- | --- | --- |
| Full Dataset SFT | | | | | | |
| Octo | 78.9% | 85.7% | 84.6% | 51.1% | 75.1% | - |
| OpenVLA | 84.7% | 88.4% | 79.2% | 53.7% | 76.5% | - |
| π_fast | 96.4% | 96.8% | 88.6% | 60.2% | 85.5% | - |
| OpenVLA-OFT | ... | | | | | |
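The "Avg." column in the LIBERO table above is consistent with an unweighted mean over the four task groups (Spatial, Object, Goal, Long). The short check below is only a sketch under that assumption; the dictionary layout and the helper name `mean_success` are illustrative and do not come from the article or from RLinf.

```python
# Per-group LIBERO success rates (%) copied from two rows of the table above,
# in the order Spatial, Object, Goal, Long.
rows = {
    "Octo": [78.9, 85.7, 84.6, 51.1],
    "OpenVLA": [84.7, 88.4, 79.2, 53.7],
}
# The "Avg." values reported in the same rows.
reported_avg = {"Octo": 75.1, "OpenVLA": 76.5}


def mean_success(rates):
    """Unweighted mean over the task groups, rounded to one decimal place."""
    return round(sum(rates) / len(rates), 1)


for model, rates in rows.items():
    # Print the recomputed mean next to the reported value for comparison.
    print(model, mean_success(rates), reported_avg[model])
```

If the recomputed and reported values agree, the average treats every task group equally, so a weak score on one group (here, Long) pulls the average down as much as a strong score on another lifts it.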
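[Industrial Internet Weekly] "Statement of the Council of Heads of State of the Shanghai Cooperation Organization Member States on Further Deepening International Cooperation on Artificial Intelligence" released; MIIT: software business revenue for the first seven months reached 8,324.6 billion yuan, up 12.... year on year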
Tai Mei Ti APP· 2025-09-08 02:52
Domestic News
- Meituan officially released and open-sourced LongCat-Flash-Chat, which uses an innovative Mixture-of-Experts architecture with 560 billion total parameters, of which 18.6 billion to 31.3 billion are activated [2]
- Tsinghua University and other institutions open-sourced RLinf, a large-scale reinforcement learning framework for embodied intelligence, achieving over 120% system speedup compared to other frameworks [3]
- A batch of national standards covering AI-generated content identification and safety measures for electric bicycles takes effect on September 1, aimed at promoting the healthy development of emerging industries [4]
- Beijing Data Group is expected to be officially listed soon, with a registered capital of 3 billion yuan, focusing on big data services and AI public service platform technology consulting [5][6]
- Sanwei Xinan is actively building out Web3.0 applications, focusing on stablecoins and RWA, and has become a vice-chair member of the Hong Kong Web3.0 Standardization Association [7]
- Alibaba launched the AgentScope 1.0 framework for multi-agent development, providing a comprehensive solution for the entire lifecycle of intelligent applications [8]
- Tencent announced the open-sourcing of the Youtu-Agent framework, which requires no additional model training and is built entirely on the open-source ecosystem [9]
- Tencent released the HunyuanWorld-Voyager model, the first to support native 3D reconstruction for virtual reality and gaming applications [10]
- Digital China is expanding into the robotics industry and has formed partnerships with leading companies in the field [11]
- ByteDance plans to issue stock options to its Seed department, focusing on large model technology personnel [12]
- Douyin established a new company focused on AI applications in healthcare, with a registered capital of 100,000 yuan [13]

Financing and Mergers
- Obita completed an angel round of over 10 million USD, with the funds earmarked for core system development and market expansion [25]
- New Unisplendour Group established a high-tech company with a registered capital of 10 million yuan, focusing on integrated circuit design and sales [26]
- Ant Group's subsidiary invested in Xinyuan Semiconductor, whose registered capital rose from approximately 46.5 million yuan to about 50.3 million yuan [27]
- AI company Anthropic raised 13 billion USD in a new funding round, increasing its valuation to 183 billion USD [28]
- OpenAI agreed to acquire product testing startup Statsig for 1.1 billion USD in stock, one of its largest acquisitions [29]

Policies and Trends
- The National Development and Reform Commission plans to issue "AI vouchers" to promote the use of intelligent terminals and reduce R&D costs [33]
- The Ministry of Industry and Information Technology announced plans to support the development of high-performance AI training and inference chips [44]
- The Ministry of Industry and Information Technology reported that software business revenue reached 8,324.6 billion yuan (8.3246 trillion yuan) in the first seven months, a year-on-year increase of 12.3% [41]
- The Ministry of Industry and Information Technology indicated that internet enterprises achieved a total profit of 93.88 billion yuan in the first seven months, a year-on-year decrease of 1.8% [42]
- Shanghai is organizing applications for its 2025 "AI+" action projects, focusing on enhancing AI capabilities and promoting industry development [45]
- The National Standards Committee plans to revise over 4,000 national standards in fields such as AI and IoT [46]
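RLinf, the first large-scale reinforcement learning framework for embodied intelligence, is open-sourced; built by Infinigence AI (无问芯穹) together with Tsinghua and other institutions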
Bei Jing Shang Bao· 2025-09-01 05:05
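Beijing Business Today, September 1, 2025: Infinigence AI (无问芯穹) announced on its official WeChat account that, together with Tsinghua University and Beijing Zhongguancun Academy (北京中关村学院), and in cooperation with Peking University, the University of California, Berkeley, and other institutions, it has open-sourced RLinf, the first large-scale reinforcement learning framework for embodied intelligence that integrates rendering, training, and inference. The release is described as key technical support for AI's leap from "perception" to "action".

The "inf" in RLinf stands for "infrastructure" and also evokes "infinite" scaling. The framework's core aim is to address the limited support that current frameworks offer for embodied intelligence. Compared with pure reasoning large models, embodied intelligence has to handle both the "brain" (reasoning and planning) and the "cerebellum" (execution and manipulation), and it combines rendering, training, and inference in one workload, which places higher demands on compute, GPU memory, and framework flexibility. RLinf tackles these difficulties with a six-layer design (user, task, execution, scheduling, communication, and hardware layers).

(Source: Beijing Business Today) ...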
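RLinf, the first large-scale reinforcement learning framework for embodied intelligence, is open-sourced, built by Infinigence AI (无问芯穹) together with Tsinghua and other institutions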
Bei Jing Shang Bao· 2025-09-01 04:49
Core Insights
- Infinigence AI (无问芯穹) announced the open-source release of RLinf, a large-scale reinforcement learning framework for embodied intelligence, developed in collaboration with Tsinghua University and other institutions [1]

Group 1: Framework Overview
- The "inf" in RLinf stands for "infrastructure" and also evokes "infinite" scalability; the framework addresses the limited support that current frameworks offer for embodied intelligence [1]
- Embodied intelligence has to balance reasoning (the "brain") with execution (the "cerebellum"), which places higher demands on computing power, GPU memory, and framework flexibility than pure reasoning models do [1]

Group 2: Technical Design
- RLinf is designed with six layers (user, task, execution, scheduling, communication, and hardware), targeting the technical challenges specific to the field [1]
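RLinf is open-sourced! The first large-scale reinforcement learning framework for embodied intelligence with integrated rendering, training, and inference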
具身智能之心· 2025-09-01 04:02
Core Viewpoint
- The article covers the launch of RLinf, a large-scale reinforcement learning framework for embodied intelligence, highlighting its design and its role in moving AI from perception to action [2][5].

Group 1: Framework Overview
- RLinf is a flexible and scalable framework designed for embodied intelligence, integrating various components to optimize performance [5].
- The "inf" in the name signifies both "infrastructure" and "infinite" scaling, emphasizing its adaptable system design [7].
- RLinf's hybrid execution mode achieves over 120% system speedup compared to traditional frameworks, with VLA model performance improvements of 40%-60% [7][12].

Group 2: Execution Modes
- RLinf supports three execution modes, Collocated, Disaggregated, and Hybrid, so users can configure how components are placed according to their needs [17][15].
- The hybrid mode combines the advantages of shared (collocated) and separated (disaggregated) execution, minimizing system idle time and improving efficiency [12][15].

Group 3: Communication and Scheduling
- The framework includes an adaptive communication library designed for reinforcement learning, optimizing data exchange between components [19][22].
- An automated scheduling module minimizes resource idleness, adapts dynamically to the user's training flow, and supports rapid scaling [23][24].

Group 4: Performance Metrics
- RLinf has demonstrated significant improvements on embodied intelligence tasks, reaching success rates of 80%-90% in specific scenarios, compared to 30%-50% for previous models [24][26].
- The framework has also achieved state-of-the-art (SOTA) performance on mathematical reasoning tasks across multiple datasets, showing its versatility [29][30].

Group 5: Documentation and Community Engagement
- Comprehensive documentation and API support are provided to improve the user experience and make the framework easier to understand [32][34].
- The RLinf team encourages collaboration and invites users to explore the framework; it is also recruiting for various research and engineering positions [33][34].
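The three execution modes named in the summary above (Collocated, Disaggregated, Hybrid) can be read as three ways of mapping components onto GPUs. The sketch below is only an illustration of that reading, not RLinf's API: the component names (`simulator`, `rollout`, `trainer`), the `Mode` enum, and `classify_placement` are all hypothetical, and it assumes a placement is fully described by the set of GPU ids each component uses.

```python
from enum import Enum
from itertools import combinations


class Mode(Enum):
    COLLOCATED = "collocated"
    DISAGGREGATED = "disaggregated"
    HYBRID = "hybrid"


def classify_placement(placement: dict[str, set[int]]) -> Mode:
    """Classify a component -> GPU-id mapping by how components share devices."""
    if not placement:
        raise ValueError("placement must contain at least one component")
    gpu_sets = list(placement.values())
    if all(s == gpu_sets[0] for s in gpu_sets):
        # Every component runs on the same GPUs and they take turns.
        return Mode.COLLOCATED
    if all(a.isdisjoint(b) for a, b in combinations(gpu_sets, 2)):
        # No two components share a GPU, so they can run concurrently.
        return Mode.DISAGGREGATED
    # Some components share devices while others are kept separate.
    return Mode.HYBRID


all_gpus = set(range(8))
shared = {"simulator": all_gpus, "rollout": all_gpus, "trainer": all_gpus}
split = {"simulator": {0, 1}, "rollout": {2, 3}, "trainer": {4, 5, 6, 7}}
mixed = {"simulator": {0, 1, 2, 3}, "rollout": {4, 5, 6, 7}, "trainer": all_gpus}

for name, placement in [("shared", shared), ("split", split), ("mixed", mixed)]:
    print(name, classify_placement(placement).value)
```

Under this reading, the `mixed` placement keeps the simulator and rollout on separate GPUs while the trainer reuses all of them, which is one plausible shape for a mode that combines shared and separated execution.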
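RLinf, the first large-scale reinforcement learning framework built for embodied intelligence! Open-sourced by Tsinghua, Beijing Zhongguancun Academy (北京中关村学院), Infinigence AI (无问芯穹), and others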
机器之心· 2025-09-01 02:49
Core Viewpoint
- The article covers the launch of RLinf, a large-scale reinforcement learning framework designed for embodied intelligence, emphasizing a flexible and scalable architecture that integrates training, rendering, and inference [5][7].

Group 1: Development of RL Framework
- As artificial intelligence moves from "perception" to "action", embodied intelligence is drawing growing attention in both academia and industry [2][4].
- RLinf was developed jointly by Tsinghua University, Beijing Zhongguancun Academy (北京中关村学院), and Infinigence AI (无问芯穹) to address the limitations of existing frameworks in supporting embodied intelligence [5][7].

Group 2: Features of RLinf
- RLinf's architecture consists of six layers: user, task, execution, scheduling, communication, and hardware. Its hybrid execution mode achieves over 120% system speedup [7][12].
- The framework introduces a Macro-to-Micro Flow (M2Flow) mechanism that allows training processes to be constructed flexibly while keeping programming flexible and debugging easy [14][15].

Group 3: Execution Modes
- RLinf supports three execution modes, Collocated Mode, Disaggregated Mode, and Hybrid Mode, so users can configure components for optimal resource utilization [19][20].
- The framework integrates multiple backends in a low-intrusion way to meet the diverse needs of researchers in embodied intelligence [16][20].

Group 4: Communication and Scheduling
- RLinf features an adaptive communication library designed for reinforcement learning, optimizing data exchange between components to improve system efficiency [22][28].
- An automated scheduling module minimizes resource idling by analyzing component performance and selecting the best execution mode, significantly improving training stability [24][25].

Group 5: Performance Metrics
- RLinf performs strongly on embodied intelligence tasks, achieving over 120% efficiency improvement compared to existing frameworks in specific tests [27][33].
- The framework has shown significant success-rate improvements across various tasks, with models reaching success rates of up to 97.3% in specific scenarios [31][35].

Group 6: Future Development and Community Engagement
- The RLinf team emphasizes open-source principles, providing comprehensive documentation and support to improve the user experience and ease collaboration [40][41].
- The team is actively recruiting for various positions to further develop and maintain RLinf, and invites community engagement and feedback [42][43].
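The scheduling idea in Group 4 above (profile the components, then pick the execution mode that wastes the least time) can be illustrated with a toy cost model. This is a sketch under strong assumptions, not RLinf's scheduler: it considers only two components and two modes, assumes ideal linear scaling with GPU count, and every name in it (`Profile`, `collocated_time`, `disaggregated_time`, `choose_plan`, `switch_overhead`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical profiled seconds per iteration when a component uses ALL GPUs."""
    rollout: float
    train: float


def collocated_time(p: Profile, switch_overhead: float) -> float:
    # Components take turns on every GPU, so their times add up,
    # plus the cost of swapping them in and out of GPU memory.
    return p.rollout + p.train + switch_overhead


def disaggregated_time(p: Profile, total_gpus: int, rollout_gpus: int) -> float:
    # Each component owns a fixed GPU subset and the two stages are pipelined;
    # under ideal linear scaling the slower stage sets the steady-state pace.
    train_gpus = total_gpus - rollout_gpus
    rollout = p.rollout * total_gpus / rollout_gpus
    train = p.train * total_gpus / train_gpus
    return max(rollout, train)


def choose_plan(p: Profile, total_gpus: int, switch_overhead: float):
    """Return (mode, rollout_gpus, estimated_seconds) with the lowest estimate."""
    best = ("collocated", total_gpus, collocated_time(p, switch_overhead))
    for k in range(1, total_gpus):  # try every rollout/train GPU split
        t = disaggregated_time(p, total_gpus, k)
        if t < best[2]:
            best = ("disaggregated", k, t)
    return best


profile = Profile(rollout=6.0, train=2.0)
print(choose_plan(profile, total_gpus=8, switch_overhead=1.0))
print(choose_plan(profile, total_gpus=8, switch_overhead=0.0))
```

In this model a balanced split can at best match the collocated sum, so the disaggregated plan only wins when switching components on shared GPUs has a real cost; a production scheduler would also have to account for non-linear scaling, communication, and hybrid placements, which this sketch leaves out.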