具身智能之心
Why is there still so much RL work to be done on humanoids, quadrupeds, robotic arms, and other embodiments?
具身智能之心· 2025-10-28 04:00
Core Insights
- Reinforcement learning (RL) remains an active field, with growing applications in robotics, including humanoid and quadruped robots, as well as in product optimization across industries [1][2][3]
- RL's complexity makes it hard for newcomers to produce publishable research without a structured course of study [5][9]
- To address this, a specialized 1v6 (one mentor, six students) RL mentoring course has been launched to help students produce quality research papers [6][9]
Group 1: Importance of Reinforcement Learning
- RL is central to tasks such as gait control in embodied robots, which is essential for achieving general-purpose capabilities [2]
- Companies such as Unitree (宇树) and AgiBot (智元) use RL so that humanoid robots can perform complex actions like climbing stairs, running, and dancing, improving their adaptability across scenarios [2][8]
- Combining RL with Vision-Language-Action (VLA) models on robotic arms is gaining popularity in academia, leading to more efficient and smoother robot operation [3][8]
Group 2: Challenges in Learning and Research
- RL is broad and intricate, so beginners struggle to find a clear entry point and often give up in frustration [5][9]
- A paper that passes peer review requires command of methodology, experimental results, and writing style, which can overwhelm newcomers [5][9]
Group 3: Course Offerings and Structure
- The 1v6 mentoring course is aimed at graduate students and others seeking guidance on research papers, with small class sizes and weekly live sessions [7][9]
- The course consists of 14 weeks of intensive online training followed by 8 weeks of maintenance support, covering RL and its applications in robotics [9][15]
- Participants receive guidance on paper ideas, project implementation, experiments, and writing refinement, with the goal of producing a draft suitable for submission to top conferences [7][9][15]
Group 4: Course Content and Deliverables
- The curriculum covers RL fundamentals, simulation environments, and specific applications in quadruped, humanoid, and robotic-arm training [17][19]
- Students work on hands-on projects that culminate in a paper draft meeting the requirements of venues such as RA-L, ICRA, IROS, and CoRL [23][24]
- The course takes a structured approach to research, covering the whole process from methodology to writing and submission [30]
SFT or RL: how should VLA models actually be trained?
具身智能之心· 2025-10-28 00:02
Core Insights
- The articles cover advances in reinforcement learning (RL) applied to Vision-Language-Action (VLA) models, highlighting improvements in generalization and training efficiency.
Group 1: Research Findings
- The first study investigates how RL improves the generalization of VLA models, addressing the error accumulation and distribution shift that arise with supervised fine-tuning (SFT). It establishes a new benchmark covering visual, semantic, and execution dimensions and shows that RL fine-tuning with Proximal Policy Optimization (PPO) significantly improves semantic understanding and execution robustness while keeping visual generalization comparable to SFT [2].
- The second study introduces RLinf-VLA, a framework for large-scale RL training of VLA models. It proposes a solution to the challenges of integrating RL and VLA training and achieves up to a 2.27x speedup over baseline methods. The framework supports multiple VLA architectures and RL algorithms and reaches a 98.11% success rate across 130 LIBERO tasks [3].
Group 2: Practical Applications
- RLinf-VLA summarizes best practices for applying RL in VLA training and provides a unified interface across multiple VLA architectures and simulators, lowering the barrier to using RL in large-scale VLA applications [3].
- The research stresses the role of RL in improving VLA performance and suggests a shift toward more efficient training methods that play to RL's strengths [15].
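The summary above names PPO as the RL fine-tuning algorithm but does not show its objective. As a rough illustration only, the standard clipped surrogate loss from the PPO literature can be sketched in plain Python. The function name `ppo_clip_loss`, the `clip_eps` value of 0.2, and the toy batch are assumptions made for this sketch; they are not taken from either study, and a real VLA fine-tuning run would compute these quantities over model log-probabilities with an autodiff framework.

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over samples.

    logp_new / logp_old: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: clipping range (0.2 is a common default, assumed here).
    """
    terms = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum gives a pessimistic bound on the improvement.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return -sum(terms) / len(terms)

# Toy batch (hypothetical numbers): the new policy raised the probability of
# a good action and lowered the probability of a bad one.
logp_old = [math.log(0.5), math.log(0.5)]
logp_new = [math.log(0.75), math.log(0.25)]
advantages = [1.0, -1.0]
loss = ppo_clip_loss(logp_new, logp_old, advantages)
```

In this toy batch both probability ratios (1.5 and 0.5) fall outside the clipping range, so the objective is computed from the clipped values 1.2 and 0.8. That is the mechanism that keeps each update close to the data-collecting policy.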
Can drones play volleyball too? A Tsinghua team explores the question with reinforcement learning
具身智能之心· 2025-10-28 00:02
Core Insights
- The article discusses a new embodied AI task proposed by Tsinghua University, "multi-drone volleyball," which aims to advance drone capabilities in three-dimensional space through teamwork and strategy [1][2].
Group 1: Task Overview
- The task requires drones to show high maneuverability and precise control while cooperating as a team to hit a ball over a net and compete against an opposing team [2].
- The Tsinghua team developed the VolleyBots testing platform to simulate the way humans learn volleyball, with a range of tasks for single and multiple drones [2][6].
Group 2: Algorithm Development
- The Hierarchical Co-Self-Play (HCSP) algorithm was designed so that drones learn cooperation, role division, and offense/defense transitions through hierarchical strategy learning and self-play [2][12].
- The research incorporated various reinforcement learning and game-theoretic algorithms; HCSP achieved an average win rate of 82.9% against multiple baseline algorithms [15].
Group 3: Training Phases
- Training consists of three phases: low-level skill learning, high-level strategy play, and collaborative self-play, which let the drones evolve their strategies and skills in a competitive environment [14].
- The drones formed clear roles during matches, such as defense, passing, and offense, and even developed new tactics such as a "setter's lob" during training [15].
Group 4: Real-World Application
- The JuggleRL system enables drones to juggle continuously in the real world, reaching a record of 462 consecutive juggles without any fine-tuning on real data [16][18].
- The result marks a significant step for embodied reinforcement learning, moving from virtual environments to real physical interaction [18][19].
Community members are landing offers one after another......
具身智能之心· 2025-10-28 00:02
Core Insights
- The article highlights community members' successful job placements at leading companies and stresses the value of choosing top-tier firms or distinctive tech unicorns for career advancement [1]
- The community aims to develop talent in embodied intelligence through technical sharing, job referrals, and industry engagement [1][2][5]
Group 1: Community Initiatives
- Regular live sessions discuss the latest developments and open problems in the embodied intelligence industry [2]
- A comprehensive technical roadmap for beginners lays out the knowledge and skills needed to enter the field [3]
- Industry frameworks and project proposals are offered to those already doing related research [5]
Group 2: Job Referrals and Networking
- The community has set up a job referral mechanism with multiple embodied intelligence companies, connecting job seekers directly with employers [7]
- Members can access open-source projects, datasets, and simulation platforms to build their learning and practical skills [9][25][33]
Group 3: Educational Resources
- The community maintains a list of well-known domestic and international embodied intelligence labs to help members with their academic plans [12]
- A collection of research reports on large models and humanoid robots keeps members informed about industry trends and applications [18]
- Members can access books and technical documents that support foundational learning in robotics [20][21]
Group 4: Specialized Learning Paths
- Detailed learning paths for embodied perception and interaction are laid out, covering a range of tasks and methods [38][40]
- The community covers cutting-edge topics such as multi-modal large models and reinforcement learning, keeping members up to date with the latest advances [46][53]
Efficiency Law: a new learning paradigm for embodied intelligence, driven by a world-model engine
具身智能之心· 2025-10-28 00:02
Core Insights
- The article stresses the importance of data generation in embodied intelligence, arguing that previously overlooked data problems are fundamental to deploying the technology successfully [2][5].
Group 1: Efficiency Law and Scaling Law
- The article introduces the "Efficiency Law," motivated by the limits of the "Scaling Law" in embodied intelligence. It holds that the performance of embodied models depends strongly on the rate of high-quality data generation (r_D) within a limited time frame [5][6].
- A higher data generation rate (r_D) improves learning efficiency, while a lower rate puts the model in a "data scarcity zone" that holds back performance [6][20].
Group 2: World Models and Physical Accuracy
- World models need absolute physical accuracy, because embodied intelligence relies on understanding real-world physics to act effectively. Models must obey physical laws for learning and decision-making to be reliable [9][12].
- Current video-based world models are criticized for lacking physical correctness, since they focus on visual realism rather than accurate simulation of physical dynamics [8][12].
Group 3: GS-World and Its Applications
- GS-World is presented as a new approach that integrates generative models with physics simulation engines to generate physically accurate environments and interactions, addressing the shortcomings of video-based models [11][13].
- GS-World is positioned as an engine for embodied intelligence that autonomously generates training data and supports high-fidelity policy validation in simulated environments [15][20].
Group 4: Engine-Driven Learning Paradigm
- The article describes a shift from data-driven to engine-driven learning in embodied intelligence, in which the GS-World engine supports continuous interaction and feedback and so fosters a self-evolving learning system [24][25].
- The new paradigm centers on generating and simulating physical worlds, so that agents learn and adapt through real-time interaction rather than relying solely on historical data [24][28].
Group 5: Robustness and Generalization
- Embodied intelligence systems need product-level success rates and robustness to environmental disturbances. The engine-driven paradigm is considered essential for building reliable, trustworthy intelligent products [27][29].
- GS-World is described as a key platform for evolving robot skills, letting skills emerge naturally through interaction in a physically accurate simulated environment [31][32].
征和工业: the "Achilles' heel" of dexterous hands | How micro-chain technology resolves the "impossible triangle" of transmission systems
具身智能之心· 2025-10-27 04:00
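Transmission systems · Micro-chain advantages
Introduction
The rise of humanoid robotics represents one of the most promising frontiers in automation, with applications spanning manufacturing, healthcare, services, and human assistance.
As companies and industries look to integrate more advanced robotic solutions, the transmission systems that drive robot limbs have become a key differentiator.
Among the available actuation technologies, micro-chain systems are emerging as an important solution to the fundamental challenges facing modern dexterous robot hands, arms, and legs.
[Figure: a humanoid robot fitted with a micro-chain drive system, showing its precision mechanical structure (concept rendering of the robot structure)]
01 REPORT: Converging trade-offs and market pain points
Shortcomings of conventional drive systems
Companies today want dexterous manipulation while also maintaining high reliability, cost-effectiveness, and operational efficiency: the "reliability-performance-cost" impossible triangle. Conventional dexterous-hand drive systems face several key limitations that affect their practical performance and commercial viability.
Reliability ("one hour of downtime costs 100,000 yuan"):
On a manufacturing line, a dexterous-hand failure halts the entire line. A supervisor at an auto-parts plant notes: "The line produces 120,000 yuan of output per hour. Two hours of dexterous-hand downtime is a direct loss of 240,000 yuan, not counting the late-delivery penalties that follow." What companies really need is a cycle life of more than one million cycles.
Consistency (a prerequisite for volume production and AI training):
The commercial value of a dexterous hand lies in generalization, but in practice performance varies considerably within a single production batch, causing large-scale deployment and AI model training ...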
From BAAI (智源), the University of Sydney, and others! RoboGhost: text-to-motion control that drives humanoid robots invisibly, like a ghost
具身智能之心· 2025-10-27 00:02
Core Insights
- The article discusses RoboGhost, a humanoid control system that removes the need for motion retargeting and generates actions directly from language input [6][8][14].
Group 1: Research Pain Points
- Moving from 3D digital humans to humanoid robots is difficult because language-driven motion generation relies on cumbersome, unreliable multi-stage pipelines [6][7].
- Existing methods suffer from cumulative errors, high latency, and weak coupling between semantics and control, so a more direct path from language to action is needed [7].
Group 2: Technical Breakthrough
- RoboGhost proposes a retargeting-free approach that builds the humanoid robot policy directly on language-driven motion latent representations, treating the task as generative rather than as a simple mapping [8][10].
- The system uses a continuous autoregressive motion generator to keep long-horizon motion consistent while balancing stability and diversity in the generated motions [8][14].
Group 3: Methodology
- Training has two phases, motion generation and policy training. The former uses a continuous autoregressive architecture and the latter a mixture-of-experts (MoE) framework to improve generalization [11][13].
- Policy training incorporates a diffusion model conditioned on the motion latent representations, which guide the denoising process so that executable actions are generated directly [11][14].
Group 4: Experimental Results
- Comprehensive experiments showed that RoboGhost significantly improves motion generation quality, success rate, deployment time, and tracking error compared with baseline methods [14][15].
- The diffusion-based policy outperforms conventional multilayer perceptron (MLP) policies in tracking performance and robustness, even on unseen motion subsets [18][19].
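The summary mentions a mixture-of-experts (MoE) framework without describing it. As a generic illustration of the idea only, and not of RoboGhost's actual architecture, the sketch below mixes the outputs of several expert functions using softmax gate weights computed from the input. All names (`softmax`, `moe_forward`, the two toy experts, the zero-valued gate matrix) are hypothetical and chosen for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights):
    """Dense mixture-of-experts forward pass.

    x: input vector (list of floats).
    experts: list of callables, each mapping x to an output vector.
    gate_weights: one weight row per expert; the gate score of an expert is
    the dot product of its row with x (a linear gate, assumed here).
    Returns the gate-weighted sum of expert outputs and the gate probabilities.
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    mixed = [sum(p * out[d] for p, out in zip(probs, outputs)) for d in range(dim)]
    return mixed, probs

# Two toy experts (hypothetical): one doubles the input, one negates it.
experts = [lambda x: [2.0 * v for v in x], lambda x: [-v for v in x]]
# A zero gate matrix gives equal scores, so each expert gets weight 0.5.
gate_weights = [[0.0, 0.0], [0.0, 0.0]]
mixed, probs = moe_forward([1.0, 2.0], experts, gate_weights)
```

With equal gate weights the result is the average of the two expert outputs. In a trained model the gate would learn to favor different experts for different inputs, which is the usual argument for MoE helping generalization across varied motions.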
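The embodied research platform many beginners have been waiting for is here: built for the embodied field, with high cost-performance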
具身智能之心· 2025-10-27 00:02
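A lightweight, cost-effective robotic arm built for embodied-intelligence research
Still struggling to choose hardware for embodied intelligence? Are the expensive arms out of reach, and the cheap ones hard to use and hard to learn?
Don't worry, Imeta-Y1 is here: a lightweight, cost-effective robotic arm designed for newcomers and beginning researchers.
Whether you are a student, an educator, or a developer just entering robotics, Imeta-Y1 helps you complete algorithm validation and project development at low cost and high efficiency.
Especially beginner-friendly:
✅ A full-pipeline open-source toolchain with code examples, covering everything from data collection to model deployment;
✅ Python and C++ interfaces, so you can get started quickly in whichever language you prefer;
✅ Compatible with ROS1 and ROS2, with a URDF model provided for seamless switching between simulation and the real arm;
✅ After-sales response within 24 hours, so problems don't leave you stuck while learning!
The arm combines high-precision motion control, a low-power design, and an open hardware and software architecture. It supports seamless joint debugging from simulation to the real arm and ships with a full-pipeline open-source SDK and toolchain, helping users quickly carry out algorithm validation, data collection, model training, and deployment.
Its compact structure and modular interfaces make it especially suited to developing and promoting embedded-AI and robot-learning platforms.
| Arm weight | 4.2 kg | Rated payload | 3 kg | Degrees of freedom | 6 |
| --- | --- | ...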
HuggingFace and the University of Oxford release a new tutorial and open-source a SOTA resource library!
具身智能之心· 2025-10-27 00:02
Core Viewpoint
- The article highlights major advances in robotics, especially robot learning, driven by large models and multi-modal AI, which have shifted traditional robotics toward a learning-based paradigm [3][4].
Group 1: Introduction to Robot Learning
- The article introduces a comprehensive tutorial on modern robot learning that covers the foundations of reinforcement learning and imitation learning and builds up to general-purpose, language-conditioned models [4][12].
- Researchers from HuggingFace and the University of Oxford have produced an accessible guide to robot learning for newcomers to the field [3][4].
Group 2: Classic Robotics
- Classic robotics relies on explicit modeling through kinematics, planning, and control, while learning-based methods model implicitly using deep reinforcement learning and expert demonstrations [15].
- Traditional robotic systems follow a modular pipeline of perception, state estimation, planning, and control [16].
Group 3: Learning-Based Robotics
- Learning-based robotics couples perception and control more tightly, adapts across tasks and embodiments, and reduces the need for expert modeling [26].
- The tutorial discusses the safety and efficiency challenges of real-world training, particularly in its early phases, and covers techniques such as training in simulation and domain randomization to reduce risk [34][35].
Group 4: Reinforcement Learning
- Reinforcement learning lets robots learn optimal behavior policies autonomously through trial and error and shows strong potential across scenarios [28].
- The tutorial discusses the complexity of integrating multiple system components and the limits of traditional physics-based models, which often oversimplify real-world phenomena [30].
Group 5: Imitation Learning
- Imitation learning gives robots a more direct learning path: behavior cloning replicates expert actions and avoids complex reward-function design [41].
- The tutorial addresses challenges such as compounding errors and multi-modal behavior in expert demonstrations [41][42].
Group 6: Advanced Techniques in Imitation Learning
- The article introduces imitation learning methods based on generative models, such as Action Chunking with Transformers (ACT) and Diffusion Policy, which model multi-modal data effectively [43][45].
- Diffusion Policy performs well on a range of tasks with little demonstration data, needing only 50-150 demonstrations for training [45].
Group 7: General Robot Policies
- The tutorial looks ahead to general robot policies that work across tasks and devices, building on large-scale open robot datasets and powerful vision-language models [52][53].
- Two recent vision-language-action (VLA) models, π₀ and SmolVLA, are highlighted for their ability to interpret visual and language instructions and generate precise control commands [53][56].
Group 8: Model Efficiency
- SmolVLA reflects a trend toward smaller, open-source models, achieving high performance with far fewer parameters and much lower memory use than π₀ [56][58].
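Behavior cloning, as described in the imitation learning part of the summary, treats policy learning as supervised regression from states to expert actions. The sketch below is a minimal illustration under strong assumptions: a one-dimensional state, a linear policy, and a made-up expert (`expert_action`, a proportional controller). None of these come from the tutorial, which works with neural-network policies on real robot data.

```python
def fit_linear_policy(states, actions):
    """Least-squares fit of action = w * state + b to demonstration pairs.

    This is behavior cloning with a mean-squared-error loss, solved in
    closed form because the policy is linear and the state is one-dimensional.
    """
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    var_s = sum((s - mean_s) ** 2 for s in states)
    cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    w = cov_sa / var_s
    b = mean_a - w * mean_s
    return w, b

def expert_action(state):
    """Hypothetical expert: a proportional controller pushing the state to 0."""
    return -0.5 * state

# Demonstrations: states visited by the expert and the actions it took.
demo_states = [-2.0, -1.0, 0.0, 1.0, 2.0]
demo_actions = [expert_action(s) for s in demo_states]

w, b = fit_linear_policy(demo_states, demo_actions)

def cloned_policy(state):
    """The learned policy; it only ever saw states in [-2, 2]."""
    return w * state + b
```

Here the cloned policy recovers the expert exactly because the expert is itself linear. With a richer expert, small prediction errors would push the robot into states absent from the demonstrations, where errors grow further. This is the compounding-error problem the summary mentions.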
The blind see again! A Neuralink co-founder who worked with Musk reaches an artificial-vision milestone
具身智能之心· 2025-10-27 00:02
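The blind see again. Truly remarkable.
Edited by 量子位
This may be the most low-key yet most dazzling technology advance of 2025.
Nature has just carried a report on new research progress: artificial-vision technology has helped a 70-year-old grandmother regain her sight.
"Before I lost my sight, I was an avid bookworm, and I want that back."
The greatest wish of 70-year-old Sheila Irvine was to be able to read again, and recently that wish came true.
The reason is PRIMA, a world-first artificial-vision study.
The team behind it is led by a co-founder who started Neuralink together with Musk. He now runs his own company, still working on retinal implants.
The implant is only about as thick as a human hair, yet it significantly improved vision in 80% of patients, who were able to read letters, numbers, and words.
Frank Holz, the paper's lead author, said:
"This study demonstrates for the first time that artificial vision can restore functional central vision in patients, bringing hope to people who are blind."
For the patient and her family, this may be a precious chance, late in life, to see one another again:
Blind for 15 years, she has finally regained her sight ...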