具身智能之心
56× Faster Generative Policies: EfficientFlow, Toward Efficient Embodied Intelligence
具身智能之心· 2025-12-17 00:05
This paper's co-first authors are Chang Jianlei, a master's student, and Mei Ruofeng, a PhD student, both at Xi'an Jiaotong University; Ke Wei is an associate professor at Xi'an Jiaotong University. The corresponding author is Xu Xiangyu, a professor at Xi'an Jiaotong University whose research spans 3D vision, generative AI, and embodied intelligence (homepage: https://xuxy09.github.io/).

Generative models are becoming an important paradigm in robotics and embodied intelligence: they can generate complex, flexible action policies directly from high-dimensional visual observations, and they perform impressively on manipulation, grasping, and similar tasks. In real systems, however, these methods still face two major weaknesses: training depends heavily on large-scale demonstration data, and inference requires many iterations, making action generation too slow for real-time control.

Targeting this core bottleneck, a research team at Xi'an Jiaotong University proposes EfficientFlow, a new generative policy-learning method. By deeply integrating equivariant modeling with efficient flow matching, it significantly improves data efficiency while sharply cutting the number of inference iterations, achieving SOTA performance on multiple robot-manipulation benchmarks and accelerating inference by more than an order of magnitude. ...
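To make the few-step generation idea concrete, here is a minimal sketch of how a flow-matching policy can produce an action by integrating a learned velocity field with a handful of Euler steps. This is a generic illustration under our own assumptions (the `velocity_net` callable, its signature, and the step count are all hypothetical), not EfficientFlow's implementation, and the paper's equivariant architecture is not reproduced here.

```python
import torch

def sample_action(velocity_net, obs_emb, action_dim, num_steps=4):
    """Minimal flow-matching sampler: integrate a learned velocity field
    from noise (t=0) to an action (t=1) in a few Euler steps.
    velocity_net(x, t, obs_emb) is a hypothetical conditional network."""
    x = torch.randn(1, action_dim)       # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((1, 1), i * dt)   # current flow time in [0, 1)
        v = velocity_net(x, t, obs_emb)  # predicted velocity dx/dt
        x = x + dt * v                   # Euler integration step
    return x                             # generated action
```

The appeal of flow matching in this setting is that the learned noise-to-action trajectory is often smooth enough that a few coarse integration steps already yield usable actions, compared with the many denoising iterations a diffusion policy typically needs.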
We've recently received many questions from students about choosing a direction in embodied AI...
具身智能之心· 2025-12-17 00:05
[Paper tutoring from 具身智能之心 is now live! 1-on-1 customized tutoring for top-conference directions including multimodal large models / VLA / reinforcement learning / VLN / teleoperation / data collection / robot simulation / real2sim2real / end-to-end / diffusion, covering CCF-A through CCF-C.]

First, consider some embodied-AI directions: VLN, VLA, reinforcement learning, and real2sim2real. Many beginners don't know where to start. Reinforcement learning or VLA? Traditional SLAM or VLN? Which directions demand heavy compute and which don't? Beyond that, what kind of robot body suits your research, what do you do if the budget is tight, and is simulation enough?

For students already working on SLAM, both VLN and VLA are good entry points. If you have a robot arm, pursuing VLA is a solid choice. Students without hardware can run experiments in simulation or on low-cost hardware such as the SO-100, and there are many low-cost research platforms, such as mobile manipulation platforms. Quadrupeds and humanoids are better suited to reinforcement learning; VLA is too difficult there.

What remains are questions of methodology, where a good idea is essential. For many new researchers, a good idea comes only after falling into many pitfalls. If you are new and unsure how to get started, take a look at our paper-tutoring program.

Paper tutoring is now live. We have recently received many inquiries, including from students working on large models, traditional robotics, and mechanical engineering. ✅ Top conferences/journals ...
Half of embodied AI now belongs to VLA...
具身智能之心· 2025-12-16 09:25
Core Viewpoint
- The article emphasizes the increasing industry demand for VLA (Vision-Language-Action) models, highlighting the challenges of data collection and model training and the need for practical learning resources in this field [1][2][3].

Group 1: VLA Demand and Challenges
- There is significant demand for VLA algorithms in job postings, indicating growing interest in this technology [1].
- Many practitioners express frustration with the difficulty of tuning VLA algorithms and the complexity of data collection [2].
- The reliance on real-robot data for effective VLA model training poses challenges, as many companies struggle with the quality of the data they collect [3].

Group 2: VLA Implementation Modules
- Implementing VLA involves several key modules, including data-collection methods based on imitation learning and reinforcement learning [8].
- Training VLA models typically requires simulation debugging, especially when real-robot data is insufficient, making simulation frameworks such as MuJoCo and Isaac Gym crucial [9].
- After training, VLA models often require optimization techniques such as quantization and distillation to reduce model size while maintaining performance (see the sketch below) [10].

Group 3: Educational Resources and Courses
- The article introduces a hands-on course aimed at helping individuals learn VLA effectively, addressing rapid technology updates and the challenges learners face [11].
- The course covers a comprehensive curriculum, including robot-arm hardware, data collection, VLA algorithms, evaluation, simulation, and deployment [16][17].
- Participants gain hands-on experience with real hardware, strengthening their practical skills in the VLA domain [28].
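As a concrete illustration of the quantization step mentioned in Group 2, here is a minimal post-training dynamic-quantization sketch using PyTorch's standard `torch.quantization.quantize_dynamic` API. The tiny `model` below merely stands in for a real VLA policy head; distillation and the course's actual deployment pipeline are not shown.

```python
import torch
import torch.nn as nn

# A stand-in policy head; a real VLA backbone would be far larger.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 7))

# Post-training dynamic quantization: Linear weights are stored as int8
# and dequantized on the fly, shrinking the model for inference-only
# deployment with typically small accuracy loss.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # torch.Size([1, 7])
```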
An NBA Star Becomes an NVIDIA Vice President
具身智能之心· 2025-12-16 00:02
Editor: 新智元

[Introduction] A company with the world's top market capitalization of 5 trillion US dollars, whose CEO personally manages 36 executives while scheduling no fixed one-on-ones: few people dare to run a company this way. An internal NVIDIA list shows that Jensen Huang's direct-report team has shrunk from 55 last year to 36, a trade-off between direct information flow and the limits of efficiency. This article uses an "organizational lens" to examine the roles of these 36 people, Huang's management logic, and what it suggests for companies in the AI era.

When Howard Wright, a nearly two-meter-tall former NBA player, pushes open an NVIDIA conference-room door, he is no longer a rim protector but the head of Inception, Huang's program supporting 19,000 startups worldwide; colleagues jokingly call him "the strongest investor."

From the basketball court to Qualcomm, Intel, AWS, and finally NVIDIA, his cross-domain trajectory epitomizes the company's executive profile: people of varied backgrounds pulled onto the same information highway, connected directly to the CEO. At NVIDIA that highway has a radical setting: Huang directly manages 36 executives in a flat structure, at its peak as many as 55, far beyond Silicon Valley norms. Huang firmly believes that "information is power"; every ...
UniBYD: A Unified Cross-Embodiment Robot Manipulation Learning Framework Beyond Human-Demonstration Imitation
具身智能之心· 2025-12-16 00:02
Author: Tingyu Yuan et al. Editor: 具身智能之心. This article is shared for academic purposes only; if there is any infringement, contact us for removal.

Research background and core problem. In embodied intelligence, learning robot manipulation from human demonstrations is the mainstream paradigm, but the morphological gap between the human hand and robot hands of different forms (e.g., 2-finger, 3-finger, 5-finger) is the core obstacle to deployment. UniBYD's goal is a learning paradigm that goes beyond pure imitation of human motion, letting robots autonomously discover manipulation strategies matched to their own physical characteristics and generalize efficiently across hand morphologies.

Core innovation: the UniBYD framework. UniBYD is a unified reinforcement-learning framework whose three core components, a unified morphology representation, a dynamic reinforcement-learning mechanism, and fine-grained imitation guidance, enable a smooth transition from imitation to exploration, ultimately learning manipulation policies adapted to the robot's morphology (Figure 2).

Unified Morphology Representation (UMR): the foundation of cross-morphology modeling. To resolve modeling differences across robot-hand morphologies (degrees of freedom, finger count, rigid-body count), UMR unifies dynamic states and static attributes into fixed-dimension representations (see the sketch below). Dynamic state handling: the wrist state is fixed as ...
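Below is a hypothetical sketch of the fixed-dimension idea behind UMR: pad per-finger joint states up to a maximum finger count, with a validity mask, so that hands with different finger counts share one state-vector layout. All names, dimensions, and the mask scheme are our assumptions, not the paper's specification.

```python
import numpy as np

MAX_FINGERS = 5      # assumed cap across embodiments
PER_FINGER_DIM = 4   # assumed per-finger joint-state size
WRIST_DIM = 7        # e.g., 3D position + quaternion

def unified_state(wrist, finger_states):
    """Pack a variable-morphology hand into one fixed-size vector:
    the wrist block is always present; finger blocks are zero-padded
    (with a validity flag) up to MAX_FINGERS, so 2-, 3-, and 5-finger
    hands share a single representation."""
    out = np.zeros(WRIST_DIM + MAX_FINGERS * (PER_FINGER_DIM + 1))
    out[:WRIST_DIM] = wrist
    for i, fs in enumerate(finger_states[:MAX_FINGERS]):
        base = WRIST_DIM + i * (PER_FINGER_DIM + 1)
        out[base] = 1.0                          # flag: finger exists
        out[base + 1 : base + 1 + len(fs)] = fs  # joint state
    return out

# A two-finger gripper and a five-finger hand map to the same size.
two = unified_state(np.zeros(7), [np.ones(4)] * 2)
five = unified_state(np.zeros(7), [np.ones(4)] * 5)
assert two.shape == five.shape
```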
Xu Huazhe: Hurry Up and Wait Patiently for the Embodied Future...
具身智能之心· 2025-12-16 00:02
Author: Xu Huazhe. Editor: 具身智能之心. This article is republished with Dr. Xu Huazhe's authorization; further reproduction without permission is prohibited.

Yesterday we saw Xu Huazhe's social-media post about data, mass production, robot bodies, and application scenarios. He voiced similar views at this year's IROS roundtable, where, starting from first principles of intelligence, he divided the future of embodied AI into three modules: desire, priors, and experience.

Desire. When building agents, whether physical or virtual, it always feels as if today's machine learning has no desire of its own to learn. Could we give a robot a desire of its own?

Experience. Experience is a means of closing the final loop with the world. One day at home I watched a repairman fixing our gas stove: standing on a ladder, tightening something, his whole body contorted, yet he kept perfect control of his center of gravity while performing very fine manipulation with his hands. ★ This way of thinking also runs through subsequent R&D and academic exploration.

A few years ago we were still debating when robots would walk on all terrains; later the topic turned into "parkour," "dancing," and "basketball." That pace of change tells me the problem is essentially solved; I wouldn't be surprised if they can rock-climb next year. Yet this breakneck pace also feels oddly out of balance, because nowhere have I seen humanoid robots truly serving people ...
A First from the NUS Team! What Happens When a VLA Model Gains 4D Perception?
具身智能之心· 2025-12-15 03:17
Vision-Language-Action (VLA) models show strong potential for general robot tasks, but they still struggle with spatiotemporally consistent robot manipulation, which demands fine-grained representations. Existing methods typically embed 3D position information into visual representations to improve the spatial precision of actions, yet they find it hard to enforce temporal consistency during action execution.

VLA-4D is a general-purpose VLA model with 4D perception, designed for spatiotemporally consistent robot manipulation. Its design centers on two key modules. First, a 4D-aware visual representation: visual features are extracted, 1D time information is embedded into 3D position information to form 4D embedded features, and a cross-attention mechanism fuses them into a unified visual representation (a sketch follows below). Second, a spatiotemporal action representation: VLA-4D extends the conventional spatial action representation with a temporal dimension, enabling action planning at the spatiotemporal level, and aligns the multimodal representation with a large language model (LLM) for spatiotemporal action prediction.

Within this unified framework, the specially designed visual and action representations work together so that robot manipulation is both spatially smooth and temporally consistent. The work also augments existing VLA datasets with temporal action annotations for fine-tuning. Paper title: VLA- ...
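To illustrate the first module, here is a minimal sketch of embedding (x, y, z) plus a scalar timestep into a single 4D embedding and fusing it into visual tokens with cross-attention. The class name, the linear embedding, the head count, and the residual fusion are all our assumptions; the paper's actual design may differ.

```python
import torch
import torch.nn as nn

class FourDFusion(nn.Module):
    """Sketch of a 4D-aware fusion block: embed (x, y, z, t) into one
    vector per token, then inject it into visual tokens via
    cross-attention with a residual connection."""
    def __init__(self, dim=256):
        super().__init__()
        self.pos_mlp = nn.Linear(4, dim)  # (x, y, z, t) -> dim
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, vis_tokens, xyz, t):
        # vis_tokens: (B, N, dim); xyz: (B, N, 3); t: (B, N, 1)
        four_d = self.pos_mlp(torch.cat([xyz, t], dim=-1))
        fused, _ = self.attn(query=vis_tokens, key=four_d, value=four_d)
        return vis_tokens + fused  # residual fusion

m = FourDFusion()
out = m(torch.randn(2, 64, 256), torch.randn(2, 64, 3), torch.rand(2, 64, 1))
print(out.shape)  # torch.Size([2, 64, 256])
```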
Execute After Watching Once! Zero-Shot Learning from a Single Video Demonstration & Cross-Modal Action-Knowledge Transfer
具身智能之心· 2025-12-15 01:04
Core Insights
- The article discusses the ViVLA framework, which enables robots to learn new skills from single video demonstrations, addressing the limitations of existing Vision-Language-Action (VLA) models in generalizing to tasks outside their training distribution [1][2][25].

Group 1: Challenges in Robot Skill Generalization
- Four core challenges hinder the generalization of robot skills: insufficient fine-grained action recognition, differences in action representation and modalities, inherent flaws in autoregressive modeling, and a lack of diverse expert-agent pairing data [4][5][7].

Group 2: ViVLA's Technical Framework
- ViVLA employs a three-layer technical system: unified action-space construction, parallel decoding optimization, and large-scale data generation, facilitating efficient learning from single expert demonstration videos [1][8].
- The first layer focuses on latent action learning through an Action-Centric Cycle-Consistency (A3C) framework to bridge the gap between different expert and agent action spaces (see the sketch after this list) [10].
- The second layer enhances model-training efficiency with parallel decoding and spatiotemporal masking strategies, improving video understanding and action prediction [11][12].

Group 3: Data Generation and Validation
- ViVLA's data-generation pipeline converts human videos into high-quality paired data, yielding a dataset of over 892,911 expert-agent training samples [13][17].
- The framework's effectiveness is validated through a three-tier performance-verification system, demonstrating significant improvements in unseen-task success rates compared to baseline models [14][16].

Group 4: Performance Metrics
- In the LIBERO benchmark, ViVLA achieved more than a 30% performance increase on unseen tasks compared to baseline models, with a 74% success rate in real-world manipulation tasks, significantly outperforming other models [14][16][18].
- The model maintained a success rate of over 70% under varying environmental conditions, showcasing its robustness [20].

Group 5: Future Directions and Limitations
- While ViVLA represents a breakthrough in single-sample video imitation learning, there is room for optimization, including enhancing error-recovery capabilities and expanding data diversity through automated filtering of human videos [25][27].
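A minimal sketch of the cycle-consistency objective described in Group 2, under our own reading of the A3C idea (all function names are hypothetical): an expert video clip is encoded into a latent action, decoded into the agent's action space, and re-encoded, and the round trip is penalized for drifting from where it started.

```python
import torch.nn.functional as F

def cycle_consistency_loss(enc_expert, dec_agent, enc_agent, expert_clip):
    """Expert video -> latent action -> agent action -> latent again;
    the loss asks the round trip to return to the starting latent."""
    z = enc_expert(expert_clip)       # latent action from expert video
    agent_action = dec_agent(z)       # decode into the agent's space
    z_back = enc_agent(agent_action)  # re-encode from the agent side
    return F.mse_loss(z_back, z)
```

In practice such a term would be combined with reconstruction or policy losses; this sketch shows only the consistency component.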
Toward "Aerospace Embodied Intelligence": Beihang Team Proposes a New Constellation-Planning Benchmark | NeurIPS'25
具身智能之心· 2025-12-15 01:04
Editor: 量子位

These satellite constellations, orbiting hundreds of kilometers above the Earth, quietly underpin key industries such as remote sensing, communications, navigation, and weather forecasting. But behind every stably operating constellation lies a high-dimensional, dynamic, tightly constrained planning problem: within observation windows of just a few minutes, how do you schedule dozens of satellites into a coordinated observation network, execute hundreds of tasks, and still respond to emergencies such as earthquake relief, maritime search and rescue, and forest fires?

AI is becoming the key to this problem. Professor Liu Si's team at Beihang University proposes AEOS-Bench, the first large-scale realistic constellation-scheduling benchmark, and innovatively fuses the generalization ability of Transformer models with the engineering demands of spaceflight, training AEOS-Former, a scheduling model with time constraints built in (a toy illustration of such window constraints follows below). Together they set a new technical baseline for future "AI constellation planning." The work has been published at NeurIPS 2025.

We all know that launching a satellite constellation into orbit is hard, but efficiently planning and scheduling an in-orbit constellation's tasks is not easy either. As deployed constellations grow larger, manual mission planning can no longer keep up with the satellites' execution rate, so researchers ...
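As a toy illustration of the time-window constraints that make constellation scheduling hard, here is a simple greedy scheduler for a single satellite. It is our simplification for intuition only; AEOS-Former itself is a Transformer with the constraints embedded in the model, not a greedy heuristic.

```python
def greedy_schedule(tasks):
    """Each task has an observation window (start, end) and a duration;
    the satellite can execute non-overlapping tasks only inside their
    windows. Greedy by earliest window end."""
    tasks = sorted(tasks, key=lambda t: t["end"])
    schedule, cursor = [], 0.0
    for t in tasks:
        begin = max(cursor, t["start"])
        if begin + t["duration"] <= t["end"]:   # fits inside its window
            schedule.append((t["name"], begin))
            cursor = begin + t["duration"]
    return schedule

print(greedy_schedule([
    {"name": "wildfire", "start": 0, "end": 5, "duration": 3},
    {"name": "rescue",   "start": 2, "end": 9, "duration": 4},
]))  # [('wildfire', 0.0), ('rescue', 3.0)]
```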
Embodied AI Companies That Raised Over 100 Million Yuan in Q4...
具身智能之心· 2025-12-15 01:04
Core Insights
- The article provides an overview of the financing situation for embodied-robotics companies, highlighting investments over 100 million yuan across funding rounds from angel to Series C [1].

Company Summaries
- **AI² Robotics**: Secured hundreds of millions in funding, focusing on AGI-native general intelligent robots, with applications in semiconductors, automotive, electronics, biotechnology, and public services [4].
- **Self-Variable Robotics**: Raised 1 billion yuan, specializing in AI and robotics technology innovation, building general intelligent agents based on large robot models [5].
- **Xingyuan Intelligent Robotics**: Received 300 million yuan, developing general embodied-brain technology aimed at creating a universal brain for physical-world interaction [6].
- **Micro Differential Intelligence**: Funded 100 million yuan, focusing on aerial robotics and intelligent systems for industrial and urban applications [7].
- **Dyna Robotics**: Raised 120 million yuan, dedicated to AI-driven robotics for various tasks, emphasizing cost-effective learning in real production scenarios [8].
- **Motorevo**: Secured 100 million yuan, specializing in robotic joints and power units for various robotic applications [9][18].
- **Lexiang Technology**: Funded 200 million yuan, focusing on general household robotics in the AI era [10].
- **Qianjue Robotics**: Raised 100 million yuan, developing high-dimensional multi-modal tactile-perception technology for robotics [11].
- **Leju Robotics**: Secured 1.5 billion yuan, focusing on humanoid-robot commercialization and technology accumulation [12].
- **Lingxin Qiaoshou**: Received hundreds of millions, developing a platform centered on dexterous hands and cloud intelligence [13].
- **Songyan Power**: Funded 300 million yuan, focusing on humanoid-robot development and manufacturing [14].
- **Wubai Intelligent**: Raised 500 million yuan, a state-owned enterprise focusing on bionic intelligence and robotics [15].
- **Shengshi Weisheng**: Secured 100 million yuan, developing intelligent robots for manufacturing automation [16].
- **Zhongke Optoelectronics**: Funded 215 million yuan, focusing on high-end intelligent robot products for the military and manufacturing sectors [17].
- **Deepwood Intelligent**: Raised 200 million yuan, specializing in general embodied intelligent robotics [19].
- **Wujie Power**: Secured 300 million yuan, focusing on building a "universal brain" for robotics [20].
- **Yuanli Lingji**: Funded hundreds of millions, focusing on industrial and logistics automation solutions [21].
- **Accelerated Evolution**: Raised 100 million yuan, developing humanoid robots with advanced motion capabilities [22].
- **Stardust Intelligent**: Secured hundreds of millions, focusing on commercial humanoid robots with strong operational performance [23].
- **Guanglun Intelligent**: Developing solutions for robotics using high-quality simulation and physical-AI technology [24].
- **New Era Intelligent**: Raised 100 million yuan, focusing on commercial cleaning robots [25].
- **Star Motion Era**: Secured over 1 billion yuan, focusing on general humanoid-robotics technology [26].
- **Aoyi Technology**: Funded 160 million yuan, specializing in non-invasive brain-machine interfaces and rehabilitation robotics [27].
- **Daimeng Robotics**: Raised 100 million yuan, focusing on multi-modal tactile perception and wearable remote-operation systems [28].
- **Luming Robotics**: Secured hundreds of millions, focusing on family-oriented intelligent robotics [29].
- **UniX AI**: Funded 300 million yuan, specializing in AI and humanoid-robotics technology [30].
- **Ling Sheng Technology**: Raised 100 million yuan, focusing on integrated systems for humanoid and embodied intelligent robotics [31].
- **Cloud Deep Technology**: Funded 500 million yuan, specializing in quadruped-robot development [32].