具身智能之心
Stable training and high data efficiency: Tsinghua University proposes SAC Flow, a new reinforcement-learning method for flow-based policies
具身智能之心· 2025-10-20 00:03
Core Viewpoint
- The article introduces SAC Flow, a data-efficient off-policy reinforcement learning method that trains flow-based policies end-to-end, without surrogate objectives or policy distillation, and reaches state-of-the-art performance on various benchmarks [1][4][20].

Group 1: Research Background
- Flow-based policies are gaining popularity in robot learning because they model multi-modal action distributions and are simpler than diffusion policies; they are widely used in advanced VLA models [4].
- Previous attempts to train flow policies with off-policy reinforcement learning (RL) often suffered from gradient explosion caused by the multi-step sampling process inherent in flow policies [4][5].

Group 2: Methodology
- SAC Flow treats the flow policy as a sequential model, so modern recurrent structures such as GRU and Transformer can be used to stabilize training, and the flow policy can be optimized directly within an off-policy framework [7][10].
- SAC Flow injects Gaussian noise and a drift correction at each rollout step so that the final action distribution is unchanged, which lets the actor/critic losses be written in terms of the log-likelihood of the flow policy's multi-step sampling (a minimal code sketch follows this summary) [14].

Group 3: Training Paradigms
- Two training paradigms are supported:
  - From-scratch training for dense-reward tasks, where SAC Flow is trained directly [18].
  - Offline-to-online training for sparse-reward tasks, where pre-training on a dataset is followed by online fine-tuning [18][20].

Group 4: Experimental Results
- SAC Flow-T and SAC Flow-G converged stably and faster in environments such as Hopper, Walker2D, and Ant, achieving state-of-the-art performance [20][21].
- Offline-to-online results showed that SAC Flow keeps gradients stable and prevents gradient explosion, leading to superior performance compared with naive SAC training [24][26].

Group 5: Comparison with Similar Works
- SAC Flow outperforms existing methods such as FlowRL and diffusion-policy baselines in convergence speed and efficiency, particularly on challenging sparse-reward tasks [30][31].
- The method retains the modeling capacity of flow policies without distilling them into single-step models, which is a common workaround in other methods [31].

Group 6: Key Takeaways
- The key attributes of SAC Flow are serialization, stable training, and data efficiency, enabling off-policy RL algorithms to train flow policies directly and effectively [32].
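The methodology above lends itself to a short illustration. Below is a minimal sketch, not the authors' code, of how a noisy K-step flow rollout can be read as a sequential model whose per-step Gaussian transitions give SAC the log-likelihood it needs; the network sizes, noise scale, and the `FlowPolicy`/`actor_loss` names are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn


class FlowPolicy(nn.Module):
    """Flow policy whose K-step rollout is treated like a sequential model."""

    def __init__(self, obs_dim, act_dim, hidden=256, num_steps=8, sigma=0.1):
        super().__init__()
        self.act_dim = act_dim
        self.num_steps = num_steps
        self.sigma = sigma  # std of the Gaussian noise injected at every step
        # velocity field v_theta(a_k, s, t_k); a GRU or Transformer cell could
        # replace this MLP, loosely mirroring the Flow-G / Flow-T variants
        self.velocity_net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def sample(self, obs):
        """Roll the flow out step by step and accumulate a log-likelihood."""
        batch = obs.shape[0]
        a = torch.randn(batch, self.act_dim, device=obs.device)  # a_0 ~ N(0, I)
        log_prob = torch.zeros(batch, device=obs.device)
        dt = 1.0 / self.num_steps
        for k in range(self.num_steps):
            t = torch.full((batch, 1), k * dt, device=obs.device)
            drift = self.velocity_net(torch.cat([a, obs, t], dim=-1))
            mean = a + dt * drift                     # deterministic Euler step
            noise = self.sigma * torch.randn_like(a)  # injected Gaussian noise
            a = mean + noise
            # per-step Gaussian transition log-prob, summed over the rollout
            step_lp = (-0.5 * (noise / self.sigma) ** 2
                       - math.log(self.sigma) - 0.5 * math.log(2 * math.pi))
            log_prob = log_prob + step_lp.sum(dim=-1)
        return a, log_prob


def actor_loss(policy, critic, obs, alpha=0.2):
    """SAC-style actor loss using the multi-step rollout log-likelihood."""
    action, log_prob = policy.sample(obs)
    q_value = critic(obs, action)
    return (alpha * log_prob - q_value).mean()
```

The point of the sketch is only that summing per-step Gaussian log-probabilities yields a tractable entropy term, so the standard SAC actor/critic updates can be applied to the flow policy without distillation.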
Open-source hardware and solutions for mobile manipulation & dual-arm manipulation
具身智能之心· 2025-10-20 00:03
Core Viewpoint
- The article emphasizes the importance of open-source projects in advancing mobile and dual-arm robotic operations, highlighting their role in breaking down technical barriers and accelerating innovation in applications ranging from household robots to industrial automation [3].

Group 1: Open-Source Projects Overview
- XLeRobot, developed by Nanyang Technological University, focuses on flexible movement and precise operation in complex environments, providing a reference framework for mobile and dual-arm control [4].
- AhaRobot, from Tianjin University, emphasizes autonomy and environmental adaptability in dual-arm operations, integrating perception, planning, and control modules for service robots [6].
- ManiGaussian++, released by Tsinghua University, optimizes dual-arm operation accuracy using Gaussian models, particularly for 3D environment perception and motion planning [8].
- H-RDT, a collaboration between Tsinghua University and Horizon Robotics, targets efficient decision-making and real-time operation for mobile robots in various settings [11].
- RoboTwin 2.0, developed by Shanghai Jiao Tong University and the University of Hong Kong, integrates simulation and physical platforms for mobile and dual-arm operations [14].
- Open X-Embodiment, from Arizona State University, focuses on a generalized learning framework for robotic operations, supporting cross-scenario skill transfer [16].
- 3D FlowMatch Actor, a joint project by Carnegie Mellon University and NVIDIA, enhances dynamic adaptability in 3D space for mobile and dual-arm operations [19].
- OmniH2O, developed by Carnegie Mellon University, focuses on human-to-robot action mapping and humanoid operation, facilitating remote control and action teaching [24].
- TidyBot++, a collaboration between Princeton University and Stanford University, targets household organization tasks, integrating object recognition and dual-arm collaboration algorithms [27].
- robosuite, from the University of California, Berkeley, is a mature simulation platform for robotic operations, providing standardized tasks and evaluation tools [29].
- SO-ARM100 is a standardized dual-arm hardware-and-software solution that aims to lower development barriers for education and research [32].
- GOAT, developed by UIUC and CMU, focuses on goal-directed movement and manipulation, emphasizing robustness and versatility [34].
- Mobile ALOHA, from Stanford University, combines a mobile base with dual-arm operation for low-cost, easily deployable service robots [35].
Handling diverse objects flexibly with only a few demonstrations! The 阿米奥 team led by 冯骞 presents a low-cost, precise dexterous-manipulation solution at IROS!
具身智能之心· 2025-10-20 00:03
The first author of this work is 冯骞, co-founder and technical lead of 阿米奥. He completed both his master's and doctoral studies at the Technical University of Munich under robotics luminary Alois Knoll, and was an early employee and research scientist at 思灵机器人. At IROS 2025 he will present this work at the Deep Learning in Grasping and Manipulation workshop.

Research progress in robotic dexterous manipulation

Pain points in the field: robotic dexterous manipulation (e.g., multi-fingered grasping) is key to realizing "human-like robots", but existing solutions suffer from three core problems:

When a robot faces an unfamiliar object, how can it grasp it precisely from only a few demonstrations and a single-view observation? LensDFF, the answer from 阿米奥机器人, breaks with the traditional approach of relying on multi-view data and training an extra alignment network: it uses language features directly as "semantic anchors", aligning the 2D visual features extracted by CLIP into 3D space through a dynamic projection formula. This resolves cross-view feature inconsistency at its root, with no fine-tuning needed anywhere in the pipeline. More importantly, it folds five grasping primitives (pinch / hook / tripod, etc.) into few-shot demonstrations, paired with normal-vector-guided initialization and low-dimensional eigengrasp optimization, enabling the DLR-HIT dexterous hand to ...
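A minimal sketch of the 2D-to-3D feature lifting described above, under simplified assumptions: per-pixel 2D features (e.g., from CLIP) are back-projected to 3D with a depth map and camera intrinsics, and each 3D point is scored against a language embedding used as a semantic anchor. The function names, array shapes, and the plain single-view back-projection are illustrative; this is not the LensDFF implementation or its dynamic projection formula.

```python
import numpy as np


def backproject(depth, K):
    """Back-project a depth map (H, W) to 3D points (H*W, 3) using intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    x = (u.reshape(-1) - K[0, 2]) * z / K[0, 0]
    y = (v.reshape(-1) - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=-1)


def lift_features_to_3d(pixel_feats, depth, K):
    """Attach per-pixel features (H, W, C) to their back-projected 3D points."""
    points = backproject(depth, K)                           # (H*W, 3)
    feats = pixel_feats.reshape(-1, pixel_feats.shape[-1])   # (H*W, C)
    valid = depth.reshape(-1) > 0                            # drop pixels with no depth
    return points[valid], feats[valid]


def language_anchor_scores(point_feats, text_feat):
    """Cosine similarity of every lifted 3D feature with the text embedding."""
    pf = point_feats / np.linalg.norm(point_feats, axis=-1, keepdims=True)
    tf = text_feat / np.linalg.norm(text_feat)
    return pf @ tf  # higher score = more relevant to the language query
```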
An end-to-end foundation model! VCoT-Grasp: a large robotic grasp-detection model enhanced with visual chain-of-thought
具身智能之心· 2025-10-19 13:50
Chain-of-Thought (CoT) enhances the reasoning ability of large language models through intermediate thinking steps. Visual Chain-of-Thought (VCoT) extends CoT from the text modality to the image modality, using images as the intermediate thinking steps to improve the reasoning ability of multimodal large models.

[Figure: (a) multimodal-fusion methods; (b) modular methods guided by an LLM/VLM; (c) end-to-end multimodal large-model methods with language reasoning; (d) our method, which introduces visual reasoning and uses the target's bounding-box image as the thinking step.]

VCoT-Grasp builds an end-to-end foundation model and introduces visual chain-of-thought to strengthen visual understanding. At run time, the model uses the bounding-box image of the target object as an intermediate thinking step: it first predicts the target's bounding box as a coarse location, then the target region is cropped and fed back into the model to provide fine-grained ...
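The run-time loop described above (predict a coarse bounding box, crop it, then reason over the crop) can be sketched as follows. The model interface (`predict_bbox`, `predict_grasp`) and the grasp output format are hypothetical placeholders, not the VCoT-Grasp API.

```python
from PIL import Image


def vcot_grasp_inference(model, image: Image.Image, instruction: str):
    # Step 1: coarse localization -- the model predicts the target's bounding
    # box from the full image and the language instruction.
    x0, y0, x1, y1 = model.predict_bbox(image, instruction)

    # Step 2: the bounding-box region is cropped and fed back to the model as
    # the intermediate "visual thought", providing fine-grained context.
    crop = image.crop((x0, y0, x1, y1))

    # Step 3: final grasp detection conditioned on both the full image and
    # the cropped target region.
    grasp = model.predict_grasp(image, crop, instruction)
    return (x0, y0, x1, y1), grasp
```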
ROSCon China 2025 conference agenda announced!
具身智能之心· 2025-10-18 16:03
Source: 古月居, a professional ROS robotics knowledge community and industry service platform.

The robotics "technology feast" of 2025 has its date: ROSCon China 2025 opens on October 31 to November 1 at the 上海虹桥新华联索菲特大酒店, and the long-awaited full agenda has now been released.

Whether you want to follow the ROS technology frontier or solve engineering deployment problems, this conference has you covered: over two days, core developers, industry leaders, and senior engineering teams will gather on site, delivering content that spans technical depth to practical deployment.

Who should attend? These groups should reserve a seat:
• Robot developers: get the latest ROS technology updates and resolve bottleneck problems in development;
• Corporate technical leads: connect with industrial deployment cases and find technology solutions that fit your business;
• University researchers: link up with industry resources so research results reach real applications faster;
• Robotics enthusiasts: get close to cutting-edge technology and broaden your view of the industry.

The agenda is now public and seats are limited ...
HKUST(GZ) & Tsinghua jointly propose Spatial Forcing: implicit spatial alignment that outperforms mainstream 2D/3D VLA models
具身智能之心· 2025-10-18 16:03
Core Insights
- The article discusses the limitations of current Vision-Language-Action (VLA) models, which rely primarily on 2D visual data and lack a deep understanding of real 3D space, hampering their ability to perform tasks in the physical world [2][4].
- The proposed method, Spatial Forcing (SF), allows VLA models to develop spatial understanding without explicit 3D input by aligning their visual features with a powerful 3D geometric representation generated by an external model [2][10].

Methodology
- The SF method employs an implicit spatial alignment strategy, enabling the model to acquire spatial understanding autonomously during training without additional 3D sensors [2][13].
- A depth-probing experiment tested whether 3D information is present in the original VLA's visual features, revealing that without 3D input the model cannot form accurate spatial perception [11][13].
- The training process aligns the VLA model's visual tokens with pixel-level spatial representations extracted from a pre-trained 3D model, optimizing both a spatial alignment loss and an action generation loss (a minimal sketch of this objective follows below) [16].

Performance Results
- The SF method significantly outperforms existing 2D and 3D VLA models across tasks, improving training efficiency by up to 3.8x and data-utilization efficiency by up to 5.9x [14].
- In experiments, the Spatial Forcing model achieved success rates of 99.4% on spatial tasks, 99.6% on object tasks, and 98.8% on goal tasks, demonstrating superior performance compared with other models [18].
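A minimal sketch of the training objective summarized above, under stated assumptions: a frozen pre-trained 3D geometry model provides pixel-level spatial representations, a small projection head maps the VLA's visual tokens into that feature space, and a cosine alignment loss is added to the action-generation loss. The module names, the cosine/MSE choices, and the weighting `lam` are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def spatial_forcing_loss(vla_model, geo_model, proj_head, images, instructions,
                         expert_actions, lam=0.5):
    # Visual tokens from the VLA backbone together with its predicted actions.
    visual_tokens, pred_actions = vla_model(images, instructions)

    # Pixel-level spatial representations from the frozen 3D model (no grads).
    with torch.no_grad():
        spatial_repr = geo_model(images)              # (B, N, D_geo)

    # Project the VLA tokens into the geometry feature space and align them.
    aligned = proj_head(visual_tokens)                # (B, N, D_geo)
    align_loss = 1 - F.cosine_similarity(aligned, spatial_repr, dim=-1).mean()

    # Standard action-generation loss (here: simple regression to expert actions).
    action_loss = F.mse_loss(pred_actions, expert_actions)

    # Total objective: action loss plus the weighted implicit spatial alignment.
    return action_loss + lam * align_loss
```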
Judging from more than 300 works, is VLA the necessary path toward general embodied intelligence?
具身智能之心· 2025-10-17 16:02
Core Insights
- The emergence of Vision-Language-Action (VLA) models signals a shift from traditional strategy-based control to a paradigm of general robotic technology, transforming vision-language models (VLM) from passive sequence generators into active agents capable of manipulation and decision-making in complex, dynamic environments [2].

Group 1: VLA Overview
- The article discusses a comprehensive survey of advanced VLA methods, providing a clear taxonomy and systematic review of existing research [2].
- VLA methods are categorized into several main paradigms: autoregressive, diffusion-based, reinforcement-based, hybrid, and specialized methods, with detailed examination of their motivations, core strategies, and implementations [2].
- The survey integrates insights from over 300 recent studies, outlining the opportunities and challenges that will shape the development of scalable, general VLA methods [2].

Group 2: Future Directions and Challenges
- The review addresses key challenges and future development directions for advancing VLA models and generalizable robotic technologies [2].
- The live discussion will explore the origins of VLA, its research subfields, and the hot topics and future trends in VLA [5].

Group 3: Event Details
- The live event is scheduled for October 18, from 19:30 to 20:30, focusing on VLA as a prominent research direction in artificial intelligence [5].
- Key highlights of the event include the classification of VLA research fields, the integration of VLA with reinforcement learning, and the Sim2Real concept [6].
Noematrix (穹彻智能) receives Alibaba investment, accelerating full-chain breakthroughs in embodied intelligence
具身智能之心· 2025-10-17 08:12
Core Viewpoint
- Noematrix (穹彻智能), led by Professor Lu Cewu, a leading figure in embodied intelligence, combines academic excellence with industry experience and possesses full-stack capabilities from technology R&D to commercial delivery [1].

Group 1: Company Overview
- Noematrix focuses on a force-centric embodied-intelligence "brain" technology, breaking through traditional trajectory-control frameworks [1].
- The company has developed a comprehensive autonomous decision-making system covering perception, cognition, planning, and execution [1].
- It leverages multimodal large models and a rich accumulation of force-perception data to achieve high-dimensional understanding and flexible manipulation of the physical world [1].

Group 2: Recent Developments
- Noematrix recently announced the completion of a new financing round, with Alibaba Group as the investor and several existing shareholders participating [1].
- The new funding will be used to accelerate technology and product development, land embodied applications, and expand the industry ecosystem [1].
Exclusive | Noematrix (穹彻智能) secures a new Alibaba-led funding round; led by SJTU Professor Lu Cewu, it breaks through data collection without a robot body and completes the full embodied-intelligence pipeline
具身智能之心· 2025-10-17 07:46
Core Insights
- Noematrix (穹彻智能) recently completed a new round of financing led by Alibaba Group, with multiple existing shareholders participating. The funds will be used to accelerate technology and product development, land embodied applications, and expand the industry ecosystem [2][4].

Group 1: Company Overview
- Noematrix was founded at the end of 2023 and has previously closed several financing rounds, including Pre-A++ and Pre-A+++ rounds totaling hundreds of millions [4].
- The company focuses on embodied-intelligence technology, rapidly iterating its self-developed large models for the physical world, and launched the upgraded product Noematrix Brain 2.0 this year [4][8].

Group 2: Technological Advancements
- Noematrix has made breakthroughs in key technology areas, including a data-collection scheme that requires no robot body (无本体数据采集), a general end-to-end model scheme, and a large-scale deployment system for human-machine collaboration [4].
- The company aims to streamline the entire pipeline from data collection to deployment, covering the complete technical chain from data acquisition and model pre-training to post-training [4].

Group 3: Market Position and Collaborations
- Noematrix has established partnerships with several leading companies in the retail and home sectors to promote the mass delivery of integrated hardware-and-software embodied-intelligence solutions [6].
- The company plans to leverage its advanced large-model products and its data-to-model closed-loop capability to keep providing innovative, practical embodied-intelligence solutions to clients and partners [6].

Group 4: Leadership and Vision
- Noematrix is led by Professor Lu Cewu, a prominent figure in embodied intelligence with both academic depth and industry experience, giving the company full-stack capabilities from technology R&D to commercial delivery [8].
- The company's core technology is force-driven embodied intelligence, breaking through traditional trajectory-control frameworks to build a comprehensive autonomous decision-making system that covers perception, cognition, planning, and execution [8].
VLA can bring smarter application scenarios to reinforcement learning...
具身智能之心· 2025-10-17 04:01
Core Insights
- The article discusses the importance of reinforcement learning (RL) in developing embodied intelligent robots, highlighting its applications in complex tasks such as stair climbing, running, and dancing [3][9].
- It emphasizes the challenges newcomers face in reinforcement learning, particularly in producing quality research papers, given the complexity and breadth of the subject [6][10].
- To address these challenges, a specialized 1v6 mentoring course in reinforcement learning has been introduced, aimed at helping students produce publishable research papers [7][10].

Group 1: Reinforcement Learning Applications
- Reinforcement learning is crucial for gait control in humanoid and quadruped robots, enabling them to perform tasks in challenging environments [3][9].
- The VLA+RL approach for robotic arms is gaining popularity in academia, enhancing the efficiency and smoothness of robotic operations [4][9].

Group 2: Course Structure and Objectives
- The 1v6 mentoring course is designed for graduate students and others needing guidance on research papers, featuring weekly live sessions and dedicated teaching assistants [8][10].
- The course spans 14 weeks of intensive online training followed by 8 weeks of follow-up support, covering idea confirmation, project implementation, and writing refinement [10][18].

Group 3: Course Content and Deliverables
- The curriculum includes reinforcement learning fundamentals, simulation environments, and writing guidance, with a focus on producing a research paper suitable for top conferences and journals [10][19].
- Students receive structured templates and support for the writing and submission process, ensuring they meet the standards of leading academic publications [10][29].

Group 4: Instructor and Support
- The course is led by experienced instructors with backgrounds in embodied intelligence and robotics, providing both theoretical knowledge and practical insights [27].
- Continuous support is offered through a dedicated WeChat group for real-time Q&A, enhancing the learning experience [18][27].