Meet Us in Hangzhou! 具身智能之心 Sponsors IROS for the First Time and Presents Awards On-Site
具身智能之心· 2025-10-21 01:30
Core Viewpoint
- The RoboSense Challenge 2025 aims to systematically evaluate robots' perception and understanding capabilities in real-world scenarios, addressing the challenges that complex environments pose for traditional perception algorithms [1]

Group 1: Event Overview
- The challenge is organized by multiple prestigious institutions, including the National University of Singapore, Nanyang Technological University, and the University of Michigan, among others [4][5]
- It is an officially recognized competition at the IROS 2025 conference, which will take place in Hangzhou, China [5]

Group 2: Challenge Objectives
- The primary goal is to develop socially intelligent autonomous navigation robots that can navigate safely and efficiently in dynamic indoor environments without disrupting human activities [8][10]
- The challenge focuses on building a perception and navigation system from RGBD vision and odometry alone, requiring robots to operate without maps or privileged information [9]

Group 3: Challenge Difficulties
- Key challenges include dynamic behavior modeling, social rule encoding, and uncertainty handling in unpredictable environments [12]
- Evaluation metrics consider not only success rates and path efficiency but also social compliance indicators and collision statistics [12]

Group 4: Recommended Directions
- Suggested approaches include transformer-based social trajectory prediction modules, behavior classifiers for risk assessment, and graph neural networks for multi-target structural modeling; a minimal sketch of the first idea appears below [15]
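The recommended directions are stated at a high level; as a concrete illustration of the first one, here is a minimal sketch of a transformer-based social trajectory predictor. Module sizes, horizon lengths, and the single-pedestrian encoding are all illustrative assumptions, not reference code from the challenge.

```python
import torch
import torch.nn as nn

class SocialTrajectoryPredictor(nn.Module):
    """Minimal transformer encoder that maps an observed (x, y) track
    to predicted future positions. Illustrative sketch only."""

    def __init__(self, obs_len=8, pred_len=12, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        self.embed = nn.Linear(2, d_model)               # (x, y) -> feature
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(d_model, pred_len * 2)     # decode future (x, y)
        self.pred_len = pred_len

    def forward(self, tracks):
        # tracks: (batch, obs_len, 2) observed positions, one row per pedestrian
        h = self.encoder(self.embed(tracks))             # temporal self-attention
        out = self.head(h[:, -1])                        # decode from last step
        return out.view(-1, self.pred_len, 2)

# Example: predict 12 future steps for 16 pedestrians from 8 observed steps.
model = SocialTrajectoryPredictor()
future = model(torch.randn(16, 8, 2))                    # -> (16, 12, 2)
```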
The Value of Open Source for Robots Far Exceeds Imagination | Tang Wenbin in a Deep Conversation with a Hugging Face Co-Founder
具身智能之心· 2025-10-21 00:03
Core Insights
- The article discusses the challenges in robotics, particularly the gap between simulation and real-world application, and introduces RoboChallenge.ai as a standardized evaluation platform for embodied intelligence [2][42][51]

Group 1: Current Challenges in Robotics
- Many models perform well in simulation but fail in real-world scenarios, a significant pain point in robotics research [2][42]
- A unified, open, and reproducible evaluation system is needed, as current benchmarks are primarily simulation-based [50][44]

Group 2: Introduction of RoboChallenge.ai
- RoboChallenge.ai is launched as an open, standardized platform for evaluating robotic models in real-world environments, allowing researchers to remotely test their models on physical robots [6][51]
- The platform lets users drive the remote robots from locally hosted models through an API, so remote testing requires no model upload; a hypothetical sketch of such a loop appears after this summary [8][53]

Group 3: Importance of Open Source in Robotics
- Open source is identified as a crucial driver for advances in AI and robotics, enabling collaboration and innovation across global teams [10][19]
- The article argues that open source may matter even more in robotics than in large language models (LLMs), because applying a model requires access to hardware [20][22]

Group 4: Future Directions and Community Involvement
- The article anticipates that embodied intelligence research will evolve significantly over the next three to five years, with robots able to execute longer and more complex tasks [82]
- Community participation is encouraged, with the expectation that diverse contributions will improve data availability and model robustness [66][68]
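The summary does not document RoboChallenge.ai's actual API, so the sketch below only illustrates the general pattern such a remote-evaluation loop implies: observations come down from the physical robot, the model stays local, and only actions go back up. Every endpoint path and JSON field here is hypothetical.

```python
import requests

BASE = "https://robochallenge.example/api"   # hypothetical base URL

def my_policy(observation):
    """Stub for a locally hosted model: observation dict -> action list."""
    return [0.0] * 7

def run_episode(task_id, max_steps=200):
    # Open a session on a remote physical robot (hypothetical route).
    sid = requests.post(f"{BASE}/sessions", json={"task": task_id}).json()["id"]
    for _ in range(max_steps):
        obs = requests.get(f"{BASE}/sessions/{sid}/observation").json()
        if obs.get("done"):
            break
        # The model never leaves the local machine; only actions are uploaded.
        requests.post(f"{BASE}/sessions/{sid}/action",
                      json={"action": my_policy(obs)})
    return requests.get(f"{BASE}/sessions/{sid}/result").json()
```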
No Retraining Required! HKU Team Proposes the GPC Framework for Robot "Policy Composition"
具身智能之心· 2025-10-21 00:03
Editor: 机器之心

First author Cao Jiahang is a PhD student at the University of Hong Kong and a former intern at the Beijing Humanoid Robot Innovation Center; co-first author Huang Yize is an undergraduate at Shanghai Jiao Tong University; the corresponding advisor is Andrew F. Luo, Assistant Professor at the University of Hong Kong.

In robot learning, improving the performance of generative-model-based control policies usually requires heavy investment in additional data collection and model training, which greatly limits how quickly robot capabilities can be iterated and upgraded. Facing this performance bottleneck, how can the potential of existing policies be further exploited and enhanced without adding any training burden?

The University of Hong Kong team proposes GPC (General Policy Composition), a framework that offers a novel training-free solution to this challenge. By composing multiple pre-trained models at test time (a simplified sketch of this idea follows at the end of this entry), GPC creates a "composite policy" whose performance surpasses that of any single parent policy. As a plug-and-play general framework, GPC can flexibly fuse models of different architectures (e.g., Diffusion-base ...
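To make the test-time composition idea concrete, here is a heavily simplified sketch: each parent policy exposes a per-step prediction (a denoising or velocity estimate), and the composite policy integrates a weighted mixture of those predictions. The fixed-weight Euler scheme below is an illustrative assumption, not the paper's exact composition rule.

```python
import torch

def composed_step(policies, weights, obs, action, t):
    """Weighted mixture of the parent policies' per-step predictions.
    Each policy is a callable (obs, action, t) -> prediction tensor."""
    preds = [w * p(obs, action, t) for p, w in zip(policies, weights)]
    return torch.stack(preds).sum(dim=0)

def sample_composed_action(policies, weights, obs, steps=10, act_dim=7):
    """Generate one action by integrating the composite prediction,
    starting from Gaussian noise (simple Euler integration)."""
    action = torch.randn(act_dim)
    dt = 1.0 / steps
    for k in range(steps):
        action = action + dt * composed_step(policies, weights, obs, action, k * dt)
    return action
```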
Farewell to "Expert Monopoly"! AdaMoE Cracks the Efficiency-vs-Accuracy Dilemma of VLA Models
具身智能之心· 2025-10-21 00:03
Core Viewpoint
- The article discusses the AdaMoE architecture, which improves the performance of Vision-Language-Action (VLA) models in robotic control by decoupling expert selection from weight distribution, raising success rates in both simulation and real-world tasks [1][24]

Summary by Sections

Research Background: The Three Dilemmas of VLA Models
- Traditional VLA models face three main dilemmas:
  1. Performance is hard to improve because training is costly: collecting precise robotic data is resource-intensive [2]
  2. Real-time control is difficult: dense models must activate all parameters, slowing response times [3]
  3. Naive Mixture of Experts (MoE) is inefficient: conflicts among experts hinder effective task execution [5]

Core Design: The Decoupling Magic of AdaMoE
- AdaMoE's innovation lies in separating expert selection from performance weighting, letting each component focus on its strength rather than trying to solve both problems simultaneously; see the toy sketch after this summary [6]

Key Designs of AdaMoE
- **Design 1**: Reuses pre-trained weights to significantly reduce training costs, fine-tuning specialized skills instead of relearning basic actions [8]
- **Design 2**: Combines "sparse activation" with dual-module decoupling to balance capacity and efficiency while preventing conflicts among experts [9][10]

Key Findings: Advantages of Decoupling
- Extensive experiments yield four key conclusions highlighting AdaMoE's advantages:
  1. Experts specialize effectively in their tasks without interference, improving performance [13]
  2. Decoupling responsibilities outperforms traditional coupled routing [15]
  3. Fewer, more specialized experts yield better results than a larger pool of overlapping experts [19]
  4. Real-world scenarios benefit more from decoupling than simulated environments, with significant improvements in task success rates [22]

Experimental Results: Validation of AdaMoE
- AdaMoE demonstrated superior performance across various benchmarks, achieving an average success rate of 96.0% and outperforming traditional models and other architectures [23]

Conclusion: The Breakthrough Significance of AdaMoE
- AdaMoE not only improves performance but also offers VLA models a path to operate effectively without excessive resource demands, underscoring the importance of clear task specialization for robots as for humans [24][26]
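A toy sketch of the decoupling described above: one head decides which experts fire (selection) while a separate head decides how much each selected expert contributes (weighting). Layer shapes and the gating math here are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DecoupledMoE(nn.Module):
    """Toy MoE layer where expert *selection* (router top-k) is decoupled
    from expert *weighting* (a separate scale head)."""

    def __init__(self, dim=256, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)   # decides WHO is active
        self.scaler = nn.Linear(dim, n_experts)   # decides HOW MUCH each counts
        self.k = k

    def forward(self, x):
        # x: (batch, dim)
        topk = self.router(x).topk(self.k, dim=-1).indices     # chosen experts
        # Weights come from a separate head, normalized over the chosen set.
        w = torch.softmax(self.scaler(x).gather(-1, topk), dim=-1)
        out = torch.zeros_like(x)
        for b in range(x.size(0)):
            for j, e in enumerate(topk[b]):
                out[b] += w[b, j] * self.experts[int(e)](x[b])
        return out
```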
Only 1 Spot Left! Applications of Reinforcement Learning to Humanoids, Quadrupeds, Robotic Arms, and More
具身智能之心· 2025-10-21 00:03
Core Insights
- Reinforcement learning (RL) remains a significant field, with increasing applications in robotics, including humanoid and quadrupedal robots, as well as in product optimization across various industries [1][2][3]
- The complexity of RL poses challenges for newcomers, making it difficult to produce publishable research papers without a structured learning system [5][6][9]

Group 1: Importance of Reinforcement Learning
- RL is crucial for tasks such as gait control in embodied intelligent robots, which is essential for achieving general-purpose capabilities [2]
- Companies like Yushun and Zhiyuan utilize RL for humanoid robots to perform complex actions like climbing stairs, running, and dancing, enabling applications in rescue and hazardous environments [2][8]

Group 2: Challenges in Learning and Research
- The extensive and intricate nature of RL makes it hard for beginners to enter the field, often leading to frustration and abandonment of learning [5][9]
- Producing a paper that meets the standards of peer review requires proficiency in methodology, experimental results, and writing, with any misstep potentially resulting in low scores from reviewers [5][6]

Group 3: Educational Initiatives
- To address the entry barriers in RL research, a specialized 1v6 mentoring course has been launched, targeting graduate students and others needing guidance in paper writing [6][7]
- The course includes weekly live sessions, project implementation, experimental guidance, and writing refinement, aiming to help participants produce a draft suitable for submission to top conferences and journals [7][9][15]

Group 4: Course Structure and Content
- The course spans 14 weeks of intensive online training followed by 8 weeks of maintenance support, focusing on various aspects of RL and robotics [9][15]
- Key topics include foundational RL concepts, simulation environments, sim2real techniques, and writing guidance, with a structured approach to ensure participants achieve measurable milestones [15][19][20]
原力灵机 Proposes ManiAgent! It Can "Act", "Think", and Even "Collect Data"!
具身智能之心· 2025-10-20 10:00
In robotic manipulation, Vision-Language-Action (VLA) models have shown technical promise, but their performance in complex reasoning and long-horizon task planning remains limited by two core problems: data scarcity and model capacity. To address this, we propose ManiAgent, an agentic architecture for general robotic manipulation tasks that delivers end-to-end output from task description and environment input to robot manipulation actions.

Within the ManiAgent framework, multiple agents collaborate, handling environment perception, subtask decomposition, and action generation respectively, enabling it to cope with complex manipulation scenarios efficiently (a hypothetical skeleton of this flow is sketched below). Experimental evaluations show that ManiAgent achieves an 86.8% task success rate on the SimplerEnv benchmark and a 95.8% success rate on real-world pick-and-place tasks. Notably, thanks to its high success rate, ManiAgent can also serve as an efficient data collection tool: VLA models trained on data gathered with it perform comparably to VLA models trained on manually annotated datasets, providing important support for technical optimization and deployment in robotic manipulation.

Figure 1: Example of ManiAgent's overall workflow

Paper title: ManiAgent: ...
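Based on the description above, the pipeline can be pictured as a simple agent chain. The skeleton below is a hypothetical sketch of that flow; all function names and interfaces are illustrative stubs, not the paper's code.

```python
def perception_agent(image, task):
    """Describe the scene for the given task: objects, poses, relations (stub)."""
    return {"task": task, "objects": ["cup", "tray"]}

def planner_agent(scene):
    """Decompose the task into an ordered list of subtasks (stub)."""
    return [("pick", "cup"), ("place", "tray")]

def action_agent(subtask, scene):
    """Turn one subtask into a low-level robot command (stub)."""
    verb, target = subtask
    return {"verb": verb, "target": target}

def maniagent_episode(image, task):
    scene = perception_agent(image, task)            # environment perception
    trajectory = [action_agent(s, scene)             # action generation
                  for s in planner_agent(scene)]     # subtask decomposition
    return trajectory   # successful episodes double as VLA training data
```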
The 具身智能之心 Exchange Groups Are Now Live! Covering VLA, RL, Navigation, Data Collection, and Many Other Directions
具身智能之心· 2025-10-20 10:00
Group 1
- The establishment of technical exchange groups focused on embodied intelligence has been announced, inviting participation from various stakeholders in the field [1]
- The groups encompass nearly 20 sub-directions, indicating a broad scope of interest and expertise within the embodied intelligence domain [1]
- Participants are encouraged to discuss humanoid robots, quadrupeds, robotic arms, and advanced technologies such as VLA, large models, VLN, reinforcement learning, mobile manipulation, multi-modal perception, simulation, and data collection [1]
Our Embodied Intelligence Community Has Recently Added Many New Modules
具身智能之心· 2025-10-20 03:29
With many new modules added, our embodied intelligence community has become even more complete! Through September and October we have kept expanding the community sections, focusing on adding tasks such as VLA, real2sim2real, mobile manipulation, world models, and domain adaptation, along with many high-quality live streams.

Beyond that, we are currently adding a set of open-source solutions and hardware for members; going forward, we hope to build shared projects on top of them so that every member can complete a project of their own. Over nearly a year of building, the community has established sections for technical roadmap sharing, live streams, Q&A, job hunting, and competitions, closing the loop across industry, academia, job seeking, and discussion.

1) Ongoing live streams: The community hosts many roundtable forums and live streams, covering everything from robot bodies and data to algorithms, gradually unpacking what is actually happening in the embodied AI industry and which problems remain unsolved.

2) Complete technical roadmaps: For beginners, we have compiled many entry-level technology stacks and learning paths.

3) Industry and project solutions: For those already doing related research, we also provide many valuable industry frameworks and project solutions.

4) Referrals and job hunting: The community has established referral channels with many embodied AI companies; feel free to @ us at any time, and we will get your resume into the hands of your target company right away.

What's more: whether you are looking for benchmarks, surveys, or beginner learning paths, the community can greatly cut down your search time. ...
The MuJoCo Tutorial Is Here! From Zero to Reinforcement Learning, and On to Sim2Real
具身智能之心· 2025-10-20 00:03
Core Insights
- The article emphasizes that AI is at a pivotal moment, transitioning from early symbolic reasoning to deep learning breakthroughs and now to the rise of embodied intelligence, which is redefining human-machine relationships [1][3]

Group 1: Embodied Intelligence
- Embodied intelligence is characterized by machines that can understand language commands, navigate complex environments, and make intelligent decisions in real time, moving beyond virtual space [1]
- Major tech companies like Tesla, Boston Dynamics, OpenAI, and Google are actively developing technologies in this disruptive field, indicating a competitive landscape [1][3]
- The potential impact of embodied intelligence spans industries including manufacturing, healthcare, and space exploration, suggesting a transformative effect on the economy and society [1]

Group 2: Technical Challenges and Solutions
- Achieving true embodied intelligence presents unprecedented technical challenges, requiring advances in algorithms, physical simulation, robot control, and perception fusion [3]
- MuJoCo (Multi-Joint dynamics with Contact) is highlighted as a critical technology for embodied intelligence, serving as a high-fidelity simulation engine that connects virtual and real-world environments [4][6]
- MuJoCo lets researchers run millions of trials in simulation, significantly accelerating learning while minimizing risks to physical hardware [6][8]

Group 3: MuJoCo's Advantages
- MuJoCo's advanced contact dynamics algorithms enable precise simulation of complex interactions between robots and their environments, making it a standard tool in both academia and industry [4][8]
- The engine supports high parallelization, allowing thousands of simulations to run simultaneously and boosting training efficiency for AI systems [4][6]
- Its stability and numerical accuracy ensure reliable long-term simulations, making it a preferred choice for leading tech companies [4][6]

Group 4: Educational Initiatives
- A comprehensive MuJoCo development tutorial has been created, focusing on practical applications and theoretical foundations within the context of embodied intelligence [9][11]
- The course is structured into six modules, each with specific learning objectives and practical projects, ensuring a thorough understanding of the technology stack; a minimal MuJoCo example follows this summary [15][17]
- Participants will engage in hands-on projects ranging from basic robotic arm control to complex multi-agent systems, building both theoretical knowledge and practical skills [19][29]

Group 5: Target Audience and Outcomes
- The course is designed for individuals with programming or algorithm backgrounds looking to enter embodied robotics, as well as students and professionals seeking stronger practical capabilities [32][33]
- Upon completion, participants will possess a complete embodied-intelligence skill set, including proficiency in MuJoCo, reinforcement learning, and real-world application of simulation techniques [32][33]
- The program aims to cultivate a combination of technical, engineering, and innovative skills, preparing participants to tackle complex problems in the field [33]
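As a small taste of the stack the tutorial covers, here is a minimal, self-contained example using the official `mujoco` Python bindings; the inline pendulum MJCF is written just for illustration, and any model file works the same way.

```python
import mujoco

# A one-joint pendulum defined inline (illustrative MJCF).
XML = """
<mujoco>
  <option timestep="0.002"/>
  <worldbody>
    <body name="pole" pos="0 0 1">
      <joint name="hinge" type="hinge" axis="0 1 0"/>
      <geom type="capsule" fromto="0 0 0 0 0 -0.5" size="0.02" mass="1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

data.qpos[0] = 0.5                 # tilt the pendulum off vertical
for _ in range(1000):              # 2 simulated seconds at 2 ms per step
    mujoco.mj_step(model, data)    # advance the contact dynamics

print("final hinge angle (rad):", float(data.qpos[0]))
```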
Stable Training and Data Efficiency: Tsinghua University Proposes SAC Flow, a New Reinforcement Learning Method for "Flow Policies"
具身智能之心· 2025-10-20 00:03
Core Viewpoint
- The article introduces SAC Flow, a highly data-efficient reinforcement learning approach that trains flow-based policies end-to-end without surrogate objectives or policy distillation, achieving high data efficiency and state-of-the-art performance on various benchmarks [1][4][20]

Group 1: Research Background
- Flow-based policies are gaining popularity in robot learning because they model multi-modal action distributions and are simpler than diffusion policies; they are widely used in advanced VLA models [4]
- Previous attempts to train flow policies with off-policy reinforcement learning (RL) often suffered gradient explosion caused by the multi-step sampling process inherent to flow policies [4][5]

Group 2: Methodology
- SAC Flow treats flow policies as sequential models, allowing modern recurrent structures such as GRU and Transformer to stabilize training and optimize flow policies directly within an off-policy framework [7][10]
- SAC Flow injects Gaussian noise with drift correction into each rollout step so that the final action distribution is unchanged, which lets the actor/critic loss be expressed through the log-likelihood of the flow policy's multi-step sampling chain; a simplified sketch of this rollout follows below [14]

Group 3: Training Paradigms
- Two training paradigms are supported:
  - From-scratch training for dense-reward tasks, where SAC Flow can be trained directly [18]
  - Offline-to-online training for sparse-reward tasks, where pre-training on a dataset is followed by online fine-tuning [18][20]

Group 4: Experimental Results
- SAC Flow-T and SAC Flow-G converged stably and faster in environments such as Hopper, Walker2D, and Ant, achieving state-of-the-art performance [20][21]
- In the offline-to-online setting, SAC Flow maintained stable gradients, prevented gradient explosion, and outperformed naive SAC training [24][26]

Group 5: Comparison with Similar Works
- SAC Flow outperforms existing methods such as FlowRL and diffusion-based approaches in convergence speed and efficiency, particularly on challenging sparse-reward tasks [30][31]
- The method retains the modeling capacity of flow policies without distilling them into single-step models, a common workaround in other methods [31]

Group 6: Key Takeaways
- SAC Flow's key attributes are serialization, stable training, and data efficiency, enabling off-policy RL algorithms to train flow policies directly and effectively [32]
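To make the "flow policy as a sequential model" idea concrete, here is a heavily simplified sketch: a K-step Euler rollout of a velocity network with Gaussian noise injected at each step, accumulating a per-step Gaussian log-likelihood so the whole sampling chain can be plugged into an SAC-style actor loss. The network, the constant noise scale, and the log-prob bookkeeping are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class NoisyFlowPolicy(nn.Module):
    """Flow policy unrolled as a sequential model: K Euler steps of a
    velocity net, each perturbed by Gaussian noise whose per-step
    log-density is accumulated (illustrative sketch)."""

    def __init__(self, obs_dim, act_dim, K=8, sigma=0.1, hidden=256):
        super().__init__()
        self.vel = nn.Sequential(
            nn.Linear(obs_dim + act_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, act_dim),
        )
        self.act_dim, self.K, self.sigma = act_dim, K, sigma

    def forward(self, obs):
        B, dt = obs.size(0), 1.0 / self.K
        a = torch.randn(B, self.act_dim, device=obs.device)   # start from noise
        logp = torch.zeros(B, device=obs.device)
        for k in range(self.K):
            t = torch.full((B, 1), k * dt, device=obs.device)
            drift = self.vel(torch.cat([obs, a, t], dim=-1))
            step = torch.distributions.Normal(a + dt * drift, self.sigma * dt**0.5)
            a = step.rsample()                      # reparameterized: gradients flow
            logp = logp + step.log_prob(a).sum(-1)  # chain log-likelihood
        return a, logp   # logp stands in for log pi(a|s) in the SAC actor loss
```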