Workflow
Diffusion Policy
icon
Search documents
具身智能之心技术交流群成立了!
具身智能之心· 2025-11-26 10:00
具身智能之心技术交流群成立了!主要关注VLA、VLN、遥操作、Diffusion Policy、强化学习、VLA+RL、 sim2real、多模态大模型、仿真、运动控制、目标导航、建图定位、导航等方向。 感兴趣的同学可以添加小助理微信AIDriver005,邀请加入我们的社群。 注意哦, 备注:机构/学校+姓名+研究方向 ,能够快速入群! ...
从纯小白到具身算法工程师的打怪之路
具身智能之心· 2025-11-20 04:02
今天有个老学员,拿到了某头部的offer,自笑到从纯小白到算法工程师的打怪之路着实不简单,但真的有 门路。从自己购买so-100折腾,到后面跟着系统的路线一起学习,不仅节省了很多时间,也避免陷入了较 多的坑里。 这里也为大家推荐几个具身方向的研究路线:涉及vla、vln、diffusion policy、强化学习等。也欢迎扫码直 接学习: vla方向 VLA构成的机器人系统主要包括:视觉的感知处理模块,语言指令的理解以及生成机器人可执行动作的策 略网络。根据不同的需求,目前的VLA主要分为三类范式:显示端到到VLA,隐式端到端VLA以及分层端 到端VLA。 显示端到到VLA,是最常见最经典的范式。通常是将视觉语言信息压缩成联合的表征,然后再基于这个表 征去重新映射到动作空间,生成对应的动作。这类端到端的范式依赖于先前广泛的研究先验,通过不同架 构(diffusion/ transformer/dit),不同的模型大小,不同的应用场景(2d/3d),不同的任务需求(从头训/下 游微调),产生了各类不同的方案,取得了不错的性能。 隐式端到端VLA,则不同于前者,更加关注工作的可解释性,旨在利用当前的video d ...
从世界模型到VLA再到强化,具身大小脑算法原来是这样的!
具身智能之心· 2025-10-26 04:02
Core Insights - The article discusses the evolution and current state of embodied intelligence, focusing on the roles of the brain and cerebellum in robotics, where the brain handles perception and planning, while the cerebellum is responsible for execution [3][10]. Technical Evolution - The development of embodied intelligence has progressed through several stages, starting from grasp pose detection, moving to behavior cloning, and now advancing to diffusion policy and VLA models [7][10]. - The first stage focused on static object grasping with limited decision-making capabilities [7]. - The second stage introduced behavior cloning, allowing robots to learn from expert demonstrations but faced challenges in generalization and error accumulation [8]. - The third stage, marked by the introduction of diffusion policy, improved stability and generalization by modeling action sequences [8]. - The fourth stage, emerging in 2025, explores the integration of VLA models with reinforcement learning and world models to enhance robots' predictive and interactive capabilities [9][10]. Current Trends and Applications - The integration of VLA with reinforcement learning enhances robots' trial-and-error learning and self-improvement abilities, while the combination with world models allows for future prediction and better planning [10]. - The article highlights the growing demand for embodied intelligence applications across various sectors, including industrial, home, restaurant, and medical rehabilitation, leading to increased job opportunities and research interest in the field [10]. Educational Initiatives - The article outlines a structured learning program aimed at equipping individuals with comprehensive knowledge of embodied intelligence algorithms, including practical applications and real-world projects [11][14]. - The course targets individuals with a foundational understanding of embodied intelligence and aims to bridge the gap between theoretical knowledge and practical deployment [18][24].
我们正在找具身领域的合伙人......
具身智能之心· 2025-10-08 02:49
Core Viewpoint - The company is seeking collaboration with global practitioners in the embodied intelligence field to enhance capabilities in various areas such as technical services, training, course development, and research guidance [1]. Group 1: Collaboration Opportunities - There is an increasing demand from partners and small companies for the company to empower them through solutions, data collection, technology upgrades, and corporate training [1]. - The company is inviting outstanding partners to join in driving significant industry progress [1]. Group 2: Compensation and Resources - The company will offer high compensation and abundant industry resources to collaborators [2]. Group 3: Focus Areas - Key focus areas for collaboration include but are not limited to: VLA, VLN, Diffusion Policy, Reinforcement Learning, VLA+RL, remote operation, motion capture, sim2real, multimodal large models, simulation, motion control, end-to-end systems, and 3D perception [3]. Group 4: Job Description - The positions are primarily aimed at embodied course development, solution research and development, hardware development, and training collaboration, targeting both B-end (enterprises, universities, research institutes) and C-end (students, job seekers) [4]. Group 5: Contact Information - Interested parties can add WeChat oooops-life for further inquiries [5].
具身的这几个方向,组成了所谓的大小脑算法
具身智能之心· 2025-09-19 00:03
Core Viewpoint - The article discusses the evolution and current trends in embodied intelligence technology, emphasizing the integration of various models and techniques to enhance robotic capabilities in real-world environments [3][10]. Group 1: Technology Development Stages - The development of embodied intelligence has progressed through several stages, starting from grasp pose detection to behavior cloning, and now to diffusion policy and VLA models [7][10]. - The first stage focused on static object grasping with limited decision-making capabilities [7]. - The second stage introduced behavior cloning, allowing robots to learn from expert demonstrations but faced challenges in generalization and error accumulation [7]. - The third stage, marked by the introduction of diffusion policy methods, improved stability and generalization by modeling action sequences [8]. - The fourth stage, beginning in 2025, explores the integration of VLA models with reinforcement learning and world models to enhance predictive capabilities and multi-modal perception [9][10]. Group 2: Key Technologies and Techniques - Key technologies in embodied intelligence include VLA, diffusion policy, and reinforcement learning, which collectively enhance robots' task execution and adaptability [5][10]. - VLA models combine visual perception, language understanding, and action generation, enabling robots to interpret human commands and perform complex tasks [8]. - The integration of tactile sensing with VLA models expands the sensory capabilities of robots, allowing for more precise operations in unstructured environments [10]. Group 3: Industry Implications and Opportunities - The advancements in embodied intelligence are leading to increased demand for engineering and system capabilities, transitioning from theoretical research to practical deployment [10][14]. - There is a growing interest in training and deploying various models, including diffusion policy and VLA, on platforms like Mujoco and IsaacGym [14]. - The industry is witnessing a surge in job opportunities and research interest, prompting many professionals to shift focus towards embodied intelligence [10].
具身智能之心技术交流群成立了!
具身智能之心· 2025-08-28 08:36
Group 1 - The establishment of the Embodied Intelligence Heart Technology Exchange Group focuses on various advanced technologies including VLA, VLN, remote operation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1] - Interested individuals can add the assistant's WeChat AIDriver005 to join the community [2] - To expedite the group entry process, it is advised to include a note with the institution/school, name, and research direction [3]
具身智能之心B端和C端培训老师招募来啦~
具身智能之心· 2025-08-28 01:20
Group 1 - The article announces the recruitment of teachers for embodied intelligence training, targeting both B-end (business) and C-end (consumer) training services, with compensation above industry standards [1] - The training covers various advanced topics including VLA, VLN, remote operation, Diffusion Policy, reinforcement learning, sim2real, multimodal large models, simulation, motion control, and target navigation [2] - B-end training is aimed at enterprises, universities, and research institutions, while C-end training focuses on students and job seekers, with responsibilities including curriculum design and material preparation [3] Group 2 - Candidates are required to have a doctoral degree or higher (including those currently enrolled), with a preference for those who have published two papers in A-level or Q1 journals/conferences, or have two years of industry experience [3] - Interested individuals can add a specified WeChat contact for further inquiries [4]
从方法范式和应用场景上看强化与VLA/Flow Matching/机器人控制算法
具身智能之心· 2025-08-19 01:54
Core Viewpoint - The article discusses recent advancements in reinforcement learning (RL) and its applications in robotics, particularly focusing on the VLA (Vision-Language Action) models and diffusion policies, highlighting their potential to handle complex tasks that traditional RL struggles with [2][4][35]. Method Paradigms - Traditional RL and imitation learning combined with Sim2Real techniques are foundational approaches in robotics [3]. - VLA models differ fundamentally from traditional RL by using training data distributions to describe task processes and goals, allowing for the execution of more complex tasks [4][35]. - Diffusion Policy is a novel approach that utilizes diffusion models to generate continuous action sequences, demonstrating superior capabilities in complex task execution compared to traditional RL methods [4][5]. Application Scenarios - The article categorizes applications into two main types: basic motion control for humanoid and quadruped robots, and complex/long-range operational tasks [22][23]. - Basic motion control primarily relies on RL and Sim2Real, with current implementations still facing challenges in achieving fluid motion akin to human or animal movements [22]. - For complex tasks, architectures typically involve a pre-trained Vision Transformer (ViT) encoder and a large language model (LLM), utilizing diffusion or flow matching for action output [23][25]. Challenges and Future Directions - The article identifies key challenges in the field, including the need for better simulation environments, effective domain randomization, and the integration of external goal conditions [35]. - It emphasizes the importance of human intention in task definition and the limitations of current models in learning complex tasks without extensive human demonstration data [35][40]. - Future advancements may involve multi-modal input predictions for task goals and the potential integration of brain-machine interfaces to enhance human-robot interaction [35].
具身智能之心技术交流群成立了!
具身智能之心· 2025-08-11 06:01
Group 1 - The establishment of a technical exchange group focused on embodied intelligence technologies, including VLA, VLN, remote operation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1] - Interested individuals can add the assistant's WeChat AIDriver005 to join the community [2] - To expedite the joining process, it is recommended to include the organization/school, name, and research direction in the remarks [3]
具身智能之心技术交流群成立了!
具身智能之心· 2025-08-07 02:38
Group 1 - The establishment of the Embodied Intelligence Heart Technology Exchange Group focuses on various advanced technologies including VLA, VLN, remote operation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1] - Interested individuals can add the assistant's WeChat AIDriver005 to join the community [2] - To expedite the joining process, it is recommended to include a note with the institution/school, name, and research direction [3]