具身智能之心
VLA Models Generalize Better Than You Think: A New Camera and Viewpoint at Inference Time Is No Problem!
具身智能之心· 2025-12-04 03:10
Author: Weiqi Li et al. Editor: 具身智能之心

VLA models perform well on in-distribution tasks, but their performance drops sharply under new camera viewpoints and visual perturbations. The study finds that this brittleness stems mainly from misalignment in spatial modeling rather than from deficiencies in physical modeling. To address it, researchers from Sun Yat-sen University and other institutions propose a one-shot adaptation framework that recalibrates visual representations through lightweight, learnable parameter updates. The first method, Feature Token Modulation (FTM), applies a global affine transform to the visual tokens and raises viewpoint accuracy on the Libero benchmark from 48.5% to 87.1% with only 4K parameters. Building on it, Feature Linear Adaptation (FLA) further introduces low-rank updates into the ViT encoder and reaches a 90.8% success rate with 4.7M parameters, matching LoRA-scale fine-tuning at far lower cost. These results suggest that pretrained VLA models hold substantial untapped robustness, and that targeted, minimal visual adaptation is enough to restore viewpoint generalization. VLA model generalization ...
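To give a sense of how small these adapters are, here is a minimal PyTorch sketch of the two ideas as described above; the class names, shapes, and rank are assumptions for illustration, not the paper's code:

```python
import torch
import torch.nn as nn

class FeatureTokenModulation(nn.Module):
    """Sketch of FTM: one learnable global affine transform (per-channel
    scale and shift) applied to every visual token from a frozen encoder.
    With embed_dim = 2048 this is ~4K parameters, matching the scale
    quoted above."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(embed_dim))
        self.shift = nn.Parameter(torch.zeros(embed_dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, embed_dim)
        return tokens * self.scale + self.shift

class LowRankAdapter(nn.Module):
    """Sketch of an FLA-style low-rank update added to a frozen linear
    layer inside the ViT encoder, analogous in spirit to LoRA."""
    def __init__(self, base: nn.Linear, rank: int = 16):
        super().__init__()
        self.base = base.requires_grad_(False)
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)  # start as an identity update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.up(self.down(x))
```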
具身智能之心 Is Recruiting Partners~
具身智能之心· 2025-12-04 03:10
Group 1
- The article emphasizes the importance of community support in operating a platform that brings continuous value to the industry [1]
- The company invites influential figures in the field to collaborate on initiatives including course development, paper guidance, consulting services, corporate training, discipline co-construction, and hardware development [1]

Group 2
- The company aims to develop courses that benefit beginners and promote industry advancement, targeting both C-end and corporate training as well as higher-education curriculum development [3]
- The goal is to create an affordable, user-friendly research platform for developers and beginners in the field [5]

Group 3
- The company offers consulting and training services for B-end and C-end clients in areas such as embodied data, robot bodies, algorithms, and deployment, supporting industry upgrades and talent development [7]
- The company protects the personal privacy of collaborators who are currently employed in the industry [7]

Group 4
- The company provides competitive compensation within the industry and gives collaborators access to its resources [8]
LatBot: Chinese Academy of Sciences Team Proposes Latent Action Distillation to Improve Few-Shot Transfer Efficiency of Robot VLA Models
具身智能之心· 2025-12-04 00:04
Author: Zuolei Li et al. Editor: 具身智能之心

1. Background and challenges

Latent action learning is an important research direction for vision-language-action (VLA) models. Its core is extracting compressed motion semantics from consecutive frames to form general representations that are independent of the robot embodiment, so that large-scale human video can expand training data and break through the diversity and generalization limits of traditional robot datasets.

Existing latent action models (LAMs) have three key problems: first, they lack task-instruction guidance and cannot capture task-relevant changes; second, they underuse multi-frame information, so the latent action representation is imprecise and misses motion dynamics; third, they focus too heavily on visual appearance changes and lack physical awareness, leaving a semantic gap between latent action representations and actually executable actions, which severely hurts transfer to downstream tasks.

2. Core method design

2.1 Decoupled latent action representation

The latent action is decomposed into two complementary learnable tokens, explicitly separating the robot's active motion from passive changes in the environment. By introducing a pretrained vision-language model (VLM) and combining task instructions with multi-frame inputs, the two learnable tokens ([CP ...
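To make the decoupling idea concrete, here is a toy PyTorch sketch in which two learnable query tokens cross-attend to multi-frame VLM features, one for robot motion and one for environment change; names, dimensions, and the attention layout are assumptions, not LatBot's implementation:

```python
import torch
import torch.nn as nn

class DecoupledLatentActionHead(nn.Module):
    """Two learnable tokens query multi-frame VLM features:
    one captures the robot's active motion, the other passive
    environment change (a sketch of the decoupling idea)."""
    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.robot_token = nn.Parameter(torch.randn(1, 1, dim))
        self.env_token = nn.Parameter(torch.randn(1, 1, dim))
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vlm_feats: torch.Tensor):
        # vlm_feats: (batch, num_frames * tokens_per_frame, dim),
        # VLM features of consecutive frames plus the task instruction.
        b = vlm_feats.shape[0]
        queries = torch.cat(
            [self.robot_token.expand(b, -1, -1),
             self.env_token.expand(b, -1, -1)], dim=1)
        latents, _ = self.cross_attn(queries, vlm_feats, vlm_feats)
        robot_latent, env_latent = latents[:, 0], latents[:, 1]
        return robot_latent, env_latent
```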
China Mobile Makes a Hundred-Million-Yuan Strategic Investment to Seize the "Must-Win Ground" of Tactile Sensing in Embodied Intelligence
具身智能之心· 2025-12-04 00:04
The following article is from 高工人形机器人 (by Levi), a GGII (高工机器人) platform focused on embodied intelligence and the humanoid robot track, covering the core supply chain of computing, perception, control, actuation, and robot bodies, and offering brand communication planning and industry research and consulting.

Imagine holding a glass: you instinctively adjust your grip so it neither slips nor shatters. Picking up a potato chip, you are gentler still, so it does not crumble. This fine-grained control of force is effortless for humans but a thorny technical problem for robots. The key that makes it precise and controllable is touch.

This is 戴盟机器人 (Daimon Robotics)'s fourth financing round in two years, a pace that is not especially fast. The previous three rounds were:

In August this year, Daimon Robotics announced an angel++ round at the hundred-million-yuan level, led by 招商局创投 with 东方嘉富 and 架桥资本 following. The round will help Daimon accelerate real-world deployment of its globally leading visuo-tactile perception and dexterous manipulation technology and continue to lead the industrialization of embodied intelligence.

In November 2024, Daimon Robotics closed two consecutive angel+ rounds, each at the hundred-million-yuan level, jointly invested by 金鼎资本, 国中资本, 联想创投 (Lenovo Capital), and 招银国际 (CMB International). The funds were mainly used for R&D on optical visuo-tactile sensors, tactile dexterous hands, and multimodal perception-and-manipulation models that incorporate touch.

In September 2023, Daimon Robotics completed an angel round in the tens of millions of yuan ...
Why Does Fitting a Robot With Expensive Tactile Sensors Actually Make It Dumber?
具身智能之心· 2025-12-04 00:04
Editor: 机器之心

This work is a collaboration among the University of Illinois Urbana-Champaign (UIUC), Harvard University, Columbia University, and MIT.

Paper title: Multi-Modal Manipulation via Policy Consensus
Paper link: https://arxiv.org/pdf/2509.23468
Project page: https://policyconsensus.github.io/

Why does feature concatenation fail in robot perception and decision-making? Imagine fishing for keys in a pitch-black backpack. Your eyes are useless; you rely entirely on your fingertips. That is trivial for you, yet a very hard problem for robots.

The harsh truth: the mainstream multi-sensor fusion approach in robot learning, feature concatenation, fails completely on such tasks. Our experimental data show that when you add tactile data in an attempt to make the robot smarter, its grasp ...
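For reference, here is a minimal sketch of the feature-concatenation baseline the article critiques, with made-up encoder shapes; it is illustrative only, not the paper's code:

```python
import torch
import torch.nn as nn

class ConcatFusionPolicy(nn.Module):
    """Feature concatenation: encode each modality, concatenate,
    and regress an action with a single shared head."""
    def __init__(self, vision_dim=512, touch_dim=64, action_dim=7):
        super().__init__()
        self.vision_enc = nn.Linear(vision_dim, 128)
        self.touch_enc = nn.Linear(touch_dim, 128)
        # The fused head always sees both modalities at once. If training
        # data is dominated by vision, the head can learn to ignore touch,
        # and at test time it cannot reweight modalities when one becomes
        # useless (e.g., vision inside a dark backpack).
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, action_dim))

    def forward(self, vision: torch.Tensor, touch: torch.Tensor):
        fused = torch.cat(
            [self.vision_enc(vision), self.touch_enc(touch)], dim=-1)
        return self.head(fused)
```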
Zhejiang University-Rooted Embodied Intelligence Firm Takes Another Run at the HKEX: Focused on Industrial Scenarios, Taking In 1,000,000 Yuan a Day
具身智能之心· 2025-12-04 00:04
Core Viewpoint
- The article discusses the recent developments of XianGong Intelligent, a company focused on robotic control systems, as it prepares for its IPO on the Hong Kong Stock Exchange. Despite rising revenues, the company has faced continuous losses and cash flow pressures that may affect its market position and growth potential [2][4][8][66]

Revenue Growth
- XianGong Intelligent has shown consistent revenue growth over the past three years: 184 million RMB in 2022, 249 million RMB in 2023, and a projected 339 million RMB in 2024, a compound annual growth rate (CAGR) of 35.7% [5][40]
- The company takes in nearly 1 million RMB in revenue daily [6]

Financial Performance
- Despite revenue growth, the company has yet to turn a profit, accumulating 122 million RMB of losses over three years: 32.26 million RMB in 2022, 47.70 million RMB in 2023, and 42.31 million RMB in 2024 [8][53]
- Gross margins have stayed relatively stable at 46.8%, 49.2%, and 45.9% from 2022 to 2024 [45]

Product Offering
- The company provides solutions for industrial applications rather than consumer-facing robots, with a product matrix spanning controllers, software, robots, and accessories [9][12][30]
- The in-house SRC series controllers serve as the robots' "brain", enabling autonomous operation [15][16]

Market Position
- The company serves over 1,600 integrators and end customers across more than 35 countries, including notable clients such as Philips and Schneider Electric [34][36]
- It holds a leading position in the global robot-controller market, with a 23.6% share in 2024 [37]

Challenges
- Cash flow is under pressure: the accounts receivable turnover period has stretched from 48 days in 2022 to 116 days in 2025 [66]
- High R&D spending, 39.3 million RMB in 2022 rising to a projected 71.3 million RMB in 2024, contributes to the ongoing losses [57]

Management and Team
- The founding team of experienced professionals from Zhejiang University has driven the company's technological advances and strategic direction [76][78][84]
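As a sanity check on the growth figure above, the stated CAGR follows directly from the two endpoint revenues over the two-year 2022-2024 span:

```latex
\text{CAGR} = \left(\frac{339}{184}\right)^{1/2} - 1 \approx 1.357 - 1 \approx 35.7\%
```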
Everyone Is Talking About VLA, Yet Many Students Can't Even Get a Demo Running Properly...
具身智能之心· 2025-12-03 10:00
Core Viewpoint
- The article discusses the challenges and advances around VLA (Vision-Language-Action) models, emphasizing the importance of real-robot data and hands-on practice in robotics and embodied intelligence

Group 1: Challenges in VLA Implementation
- Many students struggle to move from theory to practice and find it hard to achieve satisfactory results without hands-on experience [2][6]
- Effective training and deployment of VLA models relies on real-robot data; simulation data alone has clear limitations [2][8]

Group 2: Data Collection and Training
- Data collection methods for VLA include imitation learning and reinforcement learning, with particular emphasis on teleoperation and VR techniques [8]
- Training VLA models requires careful tuning and optimization, with models such as π0 and π0.5 being especially demanding and calling for real expertise [10][12]

Group 3: Deployment and Optimization
- After training, VLA models often need optimization techniques such as quantization and distillation to shrink the parameter count while preserving performance [12]; a minimal sketch follows this list
- Deploying VLA models on edge devices is challenging given their typically large parameter sizes [12]

Group 4: Educational Initiatives
- The article introduces a hands-on course covering hardware, data collection, algorithm implementation, and real-world applications of VLA [14][30]
- The course targets a diverse audience, from students to professionals transitioning into embodied intelligence [27][30]
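The sketch referenced above: one common form of the quantization step is PyTorch post-training dynamic quantization; the toy head below stands in for a much larger VLA action decoder, and its shapes are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical toy action head standing in for a large VLA decoder.
policy_head = nn.Sequential(
    nn.Linear(1024, 256),
    nn.ReLU(),
    nn.Linear(256, 7),  # e.g., a 7-DoF action
)

# Post-training dynamic quantization: weights are stored as int8 and
# activations are quantized on the fly, shrinking Linear layers ~4x.
quantized_head = torch.ao.quantization.quantize_dynamic(
    policy_head, {nn.Linear}, dtype=torch.qint8
)

print(quantized_head(torch.randn(1, 1024)).shape)  # torch.Size([1, 7])
```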
The Nine Humanoid Robot Companies With the Highest Earnings This Year...
具身智能之心· 2025-12-03 03:47
Core Insights
- The article surveys the order amounts and shipment volumes of leading robotics companies projected for 2025, highlighting the top nine companies by revenue and their core clients [1]

Group 1: Company Orders and Shipments
- Zhongqing Robotics: orders of 200 million yuan over three years and 2,000 units shipped, collaborating with major firms such as Multi-Tech and Amazon [2]
- Songyan Power: orders exceeding 100 million yuan and annual shipments above 2,500 units, focused on education, research, and commercial performances [2]
- Stardust Intelligence: roughly 500 million yuan in orders, planning to deploy over 1,000 AI robots in industrial manufacturing and logistics over the next two years [2]
- Zhifang Technology: orders of 500 million yuan, with over 1,000 units to be delivered within three years, primarily for industrial applications [2]
- Leju Robotics: orders of around 500 million yuan, shipping nearly 2,000 units annually [2]
- Zhiyuan Robotics: orders of approximately 700 million yuan and thousands of units shipped, serving various industrial applications [2]

Group 2: Major Players and Their Orders
- UBTECH: orders exceeding 800 million yuan and around 2,700 units shipped, primarily serving automotive manufacturers and data-collection needs [3]
- Yuejiang Robotics: orders of about 1.1 billion yuan and annual shipments of roughly 20,000 units, projecting 80,000 units in 2024 and 100,000 in 2025 [3]
- Yushu Technology (Unitree): orders approaching 1.2 billion yuan and over 10,000 units shipped, collaborating with educational institutions and companies on research and development [3]
Five Years On, Transformers v5 Finally Arrives
具身智能之心· 2025-12-03 03:47
Editor: 机器之心

Transformers v5 has just shipped its first release candidate, v5.0.0rc0.

GitHub: https://github.com/huggingface/transformers/releases/tag/v5.0.0rc0

The update closes a five-year technical cycle from v4 to v5 for the world's most popular AI infrastructure library. Since v4 shipped in November 2020, Transformers, Hugging Face's core open-source project, has seen daily downloads surge from 20,000 then to over 3 million today, with total installs passing 1.2 billion. It has defined how the industry consumes models: supported architectures have grown from the original 40 to more than 400 across text, vision, audio, and multimodal domains, and community-contributed model weights now exceed 750,000. The team says that in AI, "reinvention" is how you stay ...
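For readers who want to try the RC, here is a minimal smoke test, assuming the pre-release is installed by pinning the release tag above and that the long-standing pipeline API carries over unchanged into v5:

```python
# Assumes: pip install transformers==5.0.0rc0  (version pin taken from the
# release tag above) plus a PyTorch backend. The pipeline() entry point
# shown here is the v4-era API and is assumed to still work in v5.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Five years from v4 to v5, and it finally shipped."))
```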
Training-Free! Using Bayesian Tracking Instead of Fine-Tuning VLMs Achieves SOTA on Robot Manipulation Tasks!
具身智能之心· 2025-12-03 03:47
Recent advances in vision-language models (VLMs) have markedly improved performance on embodied tasks such as goal decomposition and visual understanding. Yet providing precise rewards for robot manipulation tasks without fine-tuning the VLM remains challenging: pretraining corpora lack domain-specific robotics knowledge, and high computational cost blocks real-time use. To address this, the researchers propose T²-VLM, a novel training-free, temporally consistent framework that generates accurate rewards by tracking changes in VLM-derived subgoal states.

The method first queries the VLM before each interaction round to establish spatially aware subgoals and initial completion estimates. It then applies a Bayesian tracking algorithm that uses the subgoals' hidden states to dynamically update completion status, generating structured rewards for the reinforcement learning (RL) agent. This strengthens long-horizon decision-making and, via RL, improves failure recovery. Extensive experiments show that T²-VLM achieves state-of-the-art performance on two robot manipulation benchmarks, delivering high reward accuracy at reduced computational cost. The authors believe the method not only advances reward-generation techniques but also contributes to the broader field of embodied AI.

Livestream: Dec 3, 19:30-20:30. Livestream intro ...
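To make the Bayesian tracking idea concrete, here is a toy sketch of a per-subgoal belief update and the kind of structured reward it could yield; the binary-completion model, likelihood inputs, and reward definition are assumptions for illustration, not the T²-VLM implementation:

```python
import numpy as np

def update_beliefs(beliefs, lik_done, lik_not_done):
    """One Bayesian update of per-subgoal completion probabilities.
    beliefs[i] = prior P(subgoal i completed); lik_done / lik_not_done
    give P(latest VLM observation | completed / not completed)."""
    num = lik_done * beliefs
    den = num + lik_not_done * (1.0 - beliefs)
    return num / np.clip(den, 1e-8, None)

def progress_reward(prev_beliefs, curr_beliefs):
    """Structured reward: the increase in the expected number of
    completed subgoals between consecutive steps."""
    return float(curr_beliefs.sum() - prev_beliefs.sum())

# Toy usage: two subgoals, evidence strongly supports the first being done.
b0 = np.array([0.2, 0.1])   # initial completion estimates from the VLM
b1 = update_beliefs(b0, np.array([0.9, 0.3]), np.array([0.2, 0.7]))
print(b1, progress_reward(b0, b1))
```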