A hands-on tutorial built for real robots: VLA algorithm deployment + quantization + world models
具身智能之心· 2025-12-05 00:02
Core Viewpoint
- The article discusses the challenges and advances in the VLA (Vision-Language-Action) field, emphasizing the importance of real-robot data for effective model training and deployment, as well as the need for practical learning resources in this rapidly evolving area [2][4][14]
Group 1: Data Collection
- Data collection methods in VLA primarily include imitation learning and reinforcement learning, with teleoperation, VR, and full-body motion capture being key techniques [8]
- Ensuring high-quality data collection is crucial, and methods like real2sim2real are highlighted as important for effective data utilization [8]
Group 2: VLA Training
- Before deploying models on real robots, simulation debugging is essential, especially when real-robot data is insufficient, using frameworks like MuJoCo and Isaac Gym [10]
- Training technique matters: fine-tuning models and achieving good results with limited data are common difficulties for learners [10][11]
- Some algorithms, such as ACT, are relatively easy to train, while others like π0 and π0.5 require more intricate techniques and experience [11]
Group 3: VLA Deployment
- After training, models often require optimization to reduce their size, as VLA models typically have large parameter counts, posing challenges for deployment on edge devices [13]
- Techniques such as quantization and distillation are necessary to shrink the model while maintaining performance, as sketched below [13]
Group 4: Educational Resources
- The article introduces a practical course aimed at helping learners navigate the complexities of VLA, covering hardware, data collection, algorithms, and deployment [14][16]
- The course is designed for various audiences, including those seeking to enter the field, advance their skills, or transition from related areas like traditional computer vision or robotics [24]
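The quantization step called out above can be illustrated with a minimal post-training dynamic-quantization sketch in PyTorch. The tiny action head below is a placeholder standing in for a VLA model's action decoder, not the course's actual pipeline.

```python
# Minimal post-training dynamic quantization sketch (PyTorch).
# TinyActionHead is a stand-in for a VLA action head, not a real model.
import torch
import torch.nn as nn

class TinyActionHead(nn.Module):
    """Placeholder MLP standing in for the action-decoding head of a VLA model."""
    def __init__(self, feat_dim=512, action_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, x):
        return self.net(x)

fp32_head = TinyActionHead().eval()

# Dynamic quantization: weights of nn.Linear layers are stored in int8 and
# dequantized on the fly, shrinking the checkpoint without any retraining.
int8_head = torch.ao.quantization.quantize_dynamic(
    fp32_head, {nn.Linear}, dtype=torch.qint8
)

features = torch.randn(1, 512)       # stand-in for fused vision-language features
print(int8_head(features).shape)     # torch.Size([1, 7])
```

Edge deployment usually goes further than this (static quantization, distillation, kernel-level tooling), but the weight-only int8 step already shows why parameter count, rather than compute, is the first bottleneck on edge devices.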
A new breakthrough for humanoid robots: agility and stability without compromise
具身智能之心· 2025-12-05 00:02
Editor丨量子位
Ip Man squats, dancing, running: one policy handles them all!
Recently, a joint research team from the University of Hong Kong, NVIDIA, and Tsinghua University proposed AMS (Agility Meets Stability), a unified whole-body control framework for humanoid robots that, for the first time, combines dynamic motion tracking and extreme balance control within a single policy.
Core idea: AMS unifies dynamic motion and balance control from three key angles (a minimal sketch of the third point follows this excerpt):
1. Heterogeneous data sources: scalable balance data is generated by sampling directly from the robot's action space, breaking free of the limits of human data and easing the long-tail distribution problem.
2. Hybrid reward mechanism: balance-prior rewards are applied selectively, giving precise balance guidance without sacrificing agility and resolving the conflict between optimization objectives.
3. Adaptive learning strategy: sampling probabilities are adjusted dynamically and each motion is "taught according to its aptitude", enabling efficient adaptive learning.
The details follow below.
The humanoid robot's dilemma
To perform a wide range of tasks in human environments, a humanoid robot needs two seemingly contradictory abilities: agile dynamic motion and precise balance control. Humans, by contrast, can effortlessly ...
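The "adaptive learning strategy" above amounts to re-weighting which reference motions get sampled during training. The sketch below is a hypothetical illustration of that general idea, not the AMS update rule: the class name, the softmax-over-error weighting, and the moving-average update are all chosen purely for illustration.

```python
# Hypothetical sketch of adaptive motion sampling: harder-to-track motions
# get a higher probability of being drawn for the next training rollout.
# This illustrates the general idea only; it is not the AMS algorithm.
import numpy as np

class AdaptiveMotionSampler:
    def __init__(self, motion_names, temperature=1.0):
        self.motions = motion_names
        self.temperature = temperature
        # Running estimate of tracking error per motion (higher = harder).
        self.errors = np.ones(len(motion_names))

    def probabilities(self):
        logits = self.errors / self.temperature
        p = np.exp(logits - logits.max())
        return p / p.sum()

    def sample(self, rng):
        return rng.choice(len(self.motions), p=self.probabilities())

    def update(self, idx, tracking_error, momentum=0.9):
        # Exponential moving average keeps the estimate stable across rollouts.
        self.errors[idx] = momentum * self.errors[idx] + (1 - momentum) * tracking_error

rng = np.random.default_rng(0)
sampler = AdaptiveMotionSampler(["yewen_squat", "dance", "run"])
idx = sampler.sample(rng)
sampler.update(idx, tracking_error=2.5)   # pretend this rollout tracked poorly
print(sampler.probabilities())            # that motion is now sampled more often
```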
Some students are already folding towels, others are still debugging hardware......
具身智能之心· 2025-12-04 09:53
pi0 and pi0.5 deployment is now supported~ We recently got pi0 and pi0.5 tasks running end to end, and the code will be officially open-sourced, which we hope will help more students doing embodied AI research.
We kept wondering: how can we let everyone feel, as directly as possible, that this robot arm is both "usable" and "improving"?
As it happens, we recently used it for something very everyday: folding towels. It started out only able to complete a single fold, and eventually could fold multiple towels continuously and smoothly. Along the way the algorithm kept iterating and the arm's execution kept getting more stable.
The video is at the top of the article. It is not a show of tricks, but it may show something else: a system that can genuinely keep working and keep learning in the real world.
If you also need a robot arm that can turn algorithms into something that runs, and grow along with your project, it may be worth spending a few minutes on Imeta-Y1.
A lightweight, cost-effective robot arm built for embodied AI research
Still struggling with hardware choices in embodied AI? Arms that are too expensive to afford, or too cheap to be usable or learnable?
Don't worry, Imeta-Y1 is here: a lightweight, cost-effective robot arm designed for beginners and early-stage researchers. Whether you are a student, an educator, or a developer just entering robotics, Imeta-Y1 can help you validate algorithms and build projects at low cost and high efficiency.
Especially beginner-friendly (a hypothetical data-collection sketch follows this excerpt):
✅ A full open-source toolchain plus code examples, covering everything from data collection to model deployment;
✅ Supports Python ...
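The "data collection to model deployment" toolchain mentioned above is not shown in the excerpt, so the sketch below is purely hypothetical: it illustrates what a minimal teleoperated episode recorder for an arm like this could look like. Every class and method name (MockArmClient, read_joint_positions, read_wrist_camera) is invented for illustration and is not part of the Imeta-Y1 SDK.

```python
# Hypothetical episode recorder for imitation-learning data collection.
# MockArmClient and its methods are invented placeholders, not the Imeta-Y1 API.
import time
import numpy as np

class MockArmClient:
    """Stand-in for a real robot-arm SDK; returns synthetic data for illustration."""
    def read_joint_positions(self) -> np.ndarray:
        return np.zeros(6)                         # pretend 6-DoF joint angles
    def read_wrist_camera(self) -> np.ndarray:
        return np.zeros((480, 640, 3), np.uint8)   # pretend wrist-camera frame

def record_episode(arm, seconds: float = 5.0, hz: float = 30.0) -> dict:
    """Log synchronized camera frames and joint states while a human teleoperates."""
    frames, joints, stamps = [], [], []
    t_end = time.time() + seconds
    while time.time() < t_end:
        stamps.append(time.time())
        frames.append(arm.read_wrist_camera())
        joints.append(arm.read_joint_positions())
        time.sleep(1.0 / hz)
    return {
        "timestamps": np.array(stamps),
        "images": np.stack(frames),
        "joint_positions": np.stack(joints),
    }

episode = record_episode(MockArmClient(), seconds=1.0)
print(episode["images"].shape, episode["joint_positions"].shape)
```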
VLA models generalize better than you might think: switch to a new camera and viewpoint at inference time and they can still cope!
具身智能之心· 2025-12-04 03:10
Authors丨Weiqi Li et al.  Editor丨具身智能之心
VLA models perform well on in-distribution tasks but degrade sharply under new camera viewpoints and visual perturbations. The study shows this brittleness stems mainly from alignment bias in spatial modeling rather than from physical modeling.
To address this, researchers from Sun Yat-sen University and other institutions propose a one-shot adaptation framework that recalibrates visual representations with lightweight, learnable parameter updates. The first method, Feature Token Modulation (FTM), applies a global affine transformation to the visual tokens and lifts viewpoint accuracy on the Libero benchmark from 48.5% to 87.1% with only 4K parameters. Building on it, Feature Linear Adaptation (FLA) adds low-rank updates to the ViT encoder and reaches a 90.8% success rate with 4.7M parameters, matching LoRA-scale fine-tuning at a far lower cost. These results suggest that pretrained VLA models hold substantial untapped robustness, and that targeted, minimal visual adaptation is enough to restore viewpoint generalization (a minimal FTM-style sketch follows this excerpt).
The generalization of VLA models ...
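A global affine transformation over visual tokens is essentially a learned per-channel scale and shift shared across all tokens. The sketch below is an assumption-laden reconstruction of that idea rather than the paper's released code: with an assumed hidden size of 2048, the two vectors come to about 4K parameters, consistent with the number quoted above.

```python
# Minimal sketch of feature-token modulation: one learned scale and shift
# applied to every visual token. Shapes and hidden size are assumptions,
# not taken from the paper's implementation.
import torch
import torch.nn as nn

class FeatureTokenModulation(nn.Module):
    def __init__(self, hidden_dim: int = 2048):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(hidden_dim))   # gamma
        self.shift = nn.Parameter(torch.zeros(hidden_dim))  # beta

    def forward(self, visual_tokens: torch.Tensor) -> torch.Tensor:
        # visual_tokens: (batch, num_tokens, hidden_dim)
        return visual_tokens * self.scale + self.shift

ftm = FeatureTokenModulation(hidden_dim=2048)
tokens = torch.randn(1, 256, 2048)               # stand-in for ViT patch tokens
adapted = ftm(tokens)
print(sum(p.numel() for p in ftm.parameters()))  # 4096 learnable parameters
```

In a setup like this only the few thousand modulation parameters would be updated during one-shot adaptation while the backbone stays frozen, which is what keeps the adaptation cost far below full fine-tuning.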
具身智能之心 is recruiting partners~
具身智能之心· 2025-12-04 03:10
Group 1
- The article emphasizes the importance of community support in operating a platform that brings continuous value to the industry [1]
- The company invites influential figures in the field to collaborate on initiatives including course development, paper guidance, consulting services, corporate training, discipline co-construction, and hardware development [1]
Group 2
- The company aims to develop courses that benefit beginners and promote industry advancement, targeting both C-end learners and corporate training, as well as higher-education curriculum development [3]
- The goal is to create an affordable and user-friendly research platform for developers and beginners in the field [5]
Group 3
- The company offers consulting and training services for both B-end and C-end clients in areas such as embodied data, robot bodies, algorithms, and deployment, supporting industry upgrades and talent development [7]
- The company ensures the protection of personal privacy for collaborators currently employed in the industry [7]
Group 4
- The company provides competitive compensation within the industry and access to its resources for collaborators [8]
LatBot: a Chinese Academy of Sciences team proposes latent action distillation to boost few-shot transfer efficiency for robot VLA models
具身智能之心· 2025-12-04 00:04
Authors丨Zuolei Li et al.  Editor丨具身智能之心
1. Research background and challenges
Latent action learning is an important research direction for vision-language-action (VLA) models. The core idea is to extract compressed motion semantics from consecutive frames, forming a general representation that is independent of any particular robot embodiment, so that large-scale human video can expand the training data and break through the diversity and generalization limits of conventional robot datasets.
Existing latent action models (LAMs) have three key problems: first, they lack task-instruction guidance and cannot capture task-relevant changes; second, they underuse multi-frame information, so the latent action representation is imprecise and struggles to capture motion dynamics; third, they focus too heavily on visual appearance changes and lack physical awareness, leaving a semantic gap between latent action representations and actually executable actions, which severely hurts transfer to downstream tasks.
2. Core method design
2.1 Decoupled latent action representation
The latent action is decomposed into two complementary learnable tokens, explicitly separating the robot's active motion from passive changes in the environment (a hypothetical sketch of this decoupling follows the excerpt). By introducing a pretrained vision-language model (VLM) and combining the task instruction with multi-frame inputs, the two learnable tokens ([CP ...
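The decoupling above can be pictured as two learnable query tokens that cross-attend to instruction-conditioned multi-frame features. The sketch below is a hypothetical reconstruction of that structure, not LatBot's architecture: the dimensions, the single cross-attention layer, and the token names are all assumptions.

```python
# Hypothetical sketch of two decoupled latent-action queries (robot motion vs.
# environment change) attending to multi-frame, instruction-conditioned features.
# Dimensions and structure are illustrative assumptions, not LatBot's design.
import torch
import torch.nn as nn

class DecoupledLatentActionHead(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.robot_token = nn.Parameter(torch.randn(1, 1, dim))  # active robot motion
        self.env_token = nn.Parameter(torch.randn(1, 1, dim))    # passive env change
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, frame_features: torch.Tensor):
        # frame_features: (batch, num_frames * num_patches, dim),
        # assumed to already be fused with the task instruction by a VLM.
        b = frame_features.shape[0]
        queries = torch.cat(
            [self.robot_token.expand(b, -1, -1), self.env_token.expand(b, -1, -1)],
            dim=1,
        )
        latents, _ = self.cross_attn(queries, frame_features, frame_features)
        robot_latent, env_latent = latents[:, 0], latents[:, 1]
        return robot_latent, env_latent

head = DecoupledLatentActionHead()
feats = torch.randn(2, 4 * 196, 512)      # e.g. 4 frames of 196 patch tokens each
robot_z, env_z = head(feats)
print(robot_z.shape, env_z.shape)         # torch.Size([2, 512]) twice
```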
China Mobile makes a RMB 100-million-level strategic investment to stake out the "must-win ground" of tactile sensing in embodied AI
具身智能之心· 2025-12-04 00:04
This article originally appeared on 高工人形机器人 (author: Levi). 高工人形机器人 is the platform under 高工机器人 focused on embodied intelligence and humanoid robotics, covering the core supply chain of compute, perception, control, actuation, and robot bodies, and offering brand communication planning and industry research and consulting, tracking industry shifts and scouting future stars.
Imagine holding a glass: you instinctively adjust your grip so it neither slips nor gets crushed. Picking up a potato chip, you are even gentler and more restrained so it does not crumble. This fine-grained control of force is effortless for humans but a thorny technical problem for robots. The key to making it precise and controllable is touch.
This is 戴盟机器人's fourth funding round in two years, a pace that is not especially fast. The previous three rounds were:
In August this year, 戴盟机器人 announced the completion of an angel++ round at the RMB 100-million level, led by 招商局创投 with 东方嘉富 and 架桥资本 following on. The round will help 戴盟 accelerate the real-world deployment of its globally leading visuo-tactile perception and dexterous manipulation technology and continue to lead the industrialization of embodied intelligence.
In November 2024, 戴盟机器人 closed two consecutive angel+ rounds at the RMB 100-million level, jointly invested by 金鼎资本, 国中资本, 联想创投, and 招银国际. The funds are mainly used for R&D of optical visuo-tactile sensors, tactile dexterous hands, and touch-enabled multimodal perception-and-manipulation models.
In September 2023, 戴盟机器人 completed a multi-ten-million-RMB angel round ...
Why does adding an expensive tactile sensor to a robot make it dumber?
具身智能之心· 2025-12-04 00:04
Editor丨机器之心
This work is a collaboration between the University of Illinois Urbana-Champaign (UIUC), Harvard University, Columbia University, and MIT.
Paper title: Multi-Modal Manipulation via Policy Consensus
Paper link: https://arxiv.org/pdf/2509.23468
Project page: https://policyconsensus.github.io/
Why does feature concatenation break down in robot perception and decision-making?
Imagine fishing your keys out of a pitch-dark backpack. Your eyes are useless; you rely entirely on your fingertips. That is trivial for you, yet in robotics it is a genuinely hard problem.
The harsh truth: the dominant multi-sensor fusion recipe in robot learning today, feature concatenation, fails completely on this kind of task (a minimal sketch of that baseline follows the excerpt). Our experiments show that when you add tactile data to make the robot smarter, its grasping ...
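The feature-concatenation baseline being criticized here is simple enough to sketch: each modality is encoded separately, the embeddings are concatenated, and a single policy head maps the joint vector to actions. The code below is a generic illustration of that baseline under assumed shapes; it is not the paper's policy-consensus method or its released code.

```python
# Generic feature-concatenation baseline for multi-modal policies: encode each
# modality, concatenate, and regress actions. This is the fusion recipe the
# article argues breaks down, sketched with assumed shapes.
import torch
import torch.nn as nn

class ConcatFusionPolicy(nn.Module):
    def __init__(self, vision_dim=512, touch_dim=128, action_dim=7):
        super().__init__()
        self.vision_enc = nn.Sequential(nn.Linear(vision_dim, 256), nn.ReLU())
        self.touch_enc = nn.Sequential(nn.Linear(touch_dim, 256), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(256 + 256, 256), nn.ReLU(), nn.Linear(256, action_dim)
        )

    def forward(self, vision_feat, touch_feat):
        fused = torch.cat(
            [self.vision_enc(vision_feat), self.touch_enc(touch_feat)], dim=-1
        )
        return self.head(fused)

policy = ConcatFusionPolicy()
vision = torch.randn(1, 512)         # stand-in for a camera embedding
touch = torch.randn(1, 128)          # stand-in for a tactile embedding
print(policy(vision, touch).shape)   # torch.Size([1, 7])
```

Because the head is trained on the joint vector, it has no mechanism for deciding which modality to trust when one of them (here, vision inside the dark backpack) is uninformative; that is the failure mode the article attributes to this recipe.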
A Zhejiang University-rooted embodied AI company takes another run at the HKEX: focused on industrial scenarios, bringing in RMB 1,000,000 a day
具身智能之心· 2025-12-04 00:04
Core Viewpoint
- The article discusses the recent developments of XianGong Intelligent, a company focused on robotic control systems, as it prepares for its IPO on the Hong Kong Stock Exchange. Despite growing revenues, the company has faced continuous losses and cash-flow pressure, which may affect its market position and growth potential [2][4][8][66]
Revenue Growth
- XianGong Intelligent has shown consistent revenue growth over the past three years, with revenues of 184 million RMB in 2022, 249 million RMB in 2023, and a projected 339 million RMB in 2024, a compound annual growth rate (CAGR) of 35.7% [5][40]
- The company generates nearly 1 million RMB in revenue daily [6]
Financial Performance
- Despite revenue growth, XianGong Intelligent has not reached profitability, accumulating losses of 122 million RMB over three years: 32.26 million RMB in 2022, 47.70 million RMB in 2023, and 42.31 million RMB in 2024 [8][53]
- Gross margins have remained relatively stable at 46.8%, 49.2%, and 45.9% from 2022 to 2024 [45]
Product Offering
- The company focuses on solutions for industrial applications rather than consumer-facing robots, with a product matrix spanning controllers, software, robots, and accessories [9][12][30]
- The in-house SRC series controllers serve as the "brain" of the robots, enabling them to operate autonomously [15][16]
Market Position
- XianGong Intelligent serves over 1,600 integrators and end customers across more than 35 countries, including notable clients such as Philips and Schneider Electric [34][36]
- The company holds a leading position in the global market for robot controllers, with a 23.6% market share in 2024 [37]
Challenges
- The company faces cash-flow pressure, with the accounts receivable turnover period lengthening from 48 days in 2022 to 116 days in 2025 [66]
- High research and development costs, 39.3 million RMB in 2022 and a projected 71.3 million RMB in 2024, contribute to ongoing losses [57]
Management and Team
- The founding team, made up of experienced professionals from Zhejiang University, has been instrumental in the company's technological advances and strategic direction [76][78][84]
Everyone is talking about VLA, yet many students can't even get a demo running properly......
具身智能之心· 2025-12-03 10:00
Core Viewpoint - The article discusses the challenges and advancements in the field of VLA (Vision-Language Alignment) models, emphasizing the importance of real machine data and practical applications in robotics and embodied intelligence. Group 1: Challenges in VLA Implementation - Many students struggle with the transition from theoretical knowledge to practical application, often finding it difficult to achieve satisfactory results without hands-on experience [2][6] - The reliance on real machine data for effective training and deployment of VLA models is highlighted, with a focus on the limitations of simulation data [2][8] Group 2: Data Collection and Training - Data collection methods for VLA include imitation learning and reinforcement learning, with a particular emphasis on remote operation and VR techniques [8] - The training of VLA models requires careful tuning and optimization, with specific challenges noted for models like π0 and π0.5, which demand a high level of expertise [10][12] Group 3: Deployment and Optimization - Post-training, VLA models often require optimization techniques such as quantization and distillation to reduce parameter size while maintaining performance [12] - The deployment of VLA models on edge devices presents significant challenges due to their typically large parameter sizes [12] Group 4: Educational Initiatives - The article introduces a practical course aimed at helping individuals learn about VLA, covering various aspects such as hardware, data collection, algorithm implementation, and real-world applications [14][30] - The course is designed for a diverse audience, including students and professionals looking to transition into the field of embodied intelligence [27][30]