VLA Models
25% Efficiency Gain: "Arm-Hand Shared Autonomy Framework" Resolves the Data-Collection Bottleneck in Dexterous Manipulation
机器之心· 2025-12-11 10:00
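Giving general-purpose robots human-like dexterous manipulation has long been one of the core challenges in robotics. In recent years, Vision-Language-Action (VLA) models have shown significant potential for robot skill learning, but their progress is constrained by a fundamental bottleneck: acquiring high-quality manipulation data. The latest research paper from the ByteDance Seed team, "End-to-End Dexterous Arm-Hand VLA Policies via Shared Autonomy" [1], proposes a solution to this key problem. Its core contribution is a Shared Autonomy framework that divides control responsibilities between the human operator and an autonomous AI system: the human controls the robot arm through VR teleoperation (handling high-level positioning and obstacle avoidance), while DexGrasp-VLA autonomously controls the dexterous hand (handling fine grasping). This removes the need to teleoperate the arm and the dexterous hand at the same time, substantially reduces the operator's cognitive load, and addresses data-collection cost, the most critical issue in robot deployment. By raising data-collection efficiency to a scalable level, it lays the groundwork for dexterous manipulation to move from the lab to industrial applications. Data collection and training pipeline for DexGra ...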
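The division of control described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `ArmHandCommand`, `shared_autonomy_step`, and `collect_episode`, the 6-value arm target, the 5-value hand target, and the stand-in VR reader and hand policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ArmHandCommand:
    """One control step: arm target from the human, hand target from the policy."""
    arm_target: Tuple[float, ...]   # e.g. an end-effector pose streamed from a VR controller
    hand_target: Tuple[float, ...]  # e.g. finger joint targets predicted autonomously


def shared_autonomy_step(
    read_arm_target: Callable[[], Sequence[float]],
    hand_policy: Callable[[dict], Sequence[float]],
    observation: dict,
) -> ArmHandCommand:
    """Merge the human's arm command with the autonomous hand command."""
    arm_target = tuple(read_arm_target())          # human: positioning and obstacle avoidance
    hand_target = tuple(hand_policy(observation))  # policy: fine grasping
    return ArmHandCommand(arm_target, hand_target)


def collect_episode(read_arm_target, hand_policy, observations):
    """Log (observation, command) pairs that could later serve as training data."""
    episode = []
    for obs in observations:
        command = shared_autonomy_step(read_arm_target, hand_policy, obs)
        episode.append((obs, command))
    return episode


# Stand-ins for the real devices and model (hypothetical values).
def fake_vr_arm_target():
    return [0.4, 0.0, 0.3, 0.0, 0.0, 0.0]


def fake_hand_policy(observation):
    # Close all fingers once the (hypothetical) object distance is small.
    closed = 1.0 if observation["object_distance"] < 0.05 else 0.0
    return [closed] * 5


episode = collect_episode(
    fake_vr_arm_target,
    fake_hand_policy,
    [{"object_distance": 0.20}, {"object_distance": 0.03}],
)
for obs, cmd in episode:
    print(obs, cmd.hand_target)
```

The point of the split is that the operator only ever supplies the arm target; the hand command never passes through the teleoperation channel, which is what would reduce the operator's workload in a setup like the one described.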
Intelligent Driving (AD) in 2025: Regulators Hit the Brakes, Technology Races Ahead, and the "地大华魔" Big Four Vie for Dominance
36Kr· 2025-12-11 09:55
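Image source: Great Wall Motor. Looking back from the tail end of 2025, we can see that the EV startups that once won fans with the "zero takeover" myth have had to turn the science fiction on their slides into documentary, and the suppliers that once sprinted in the compute arms race have begun learning to dig deep within boundaries. The glitter is fading and the substance is showing. After the crackdown on false advertising of intelligent driving, where does the industry stand now? After the official demystification, intelligent-driving technology is advancing faster. To understand this slowdown, we need to rewind to the spring of 2025. On April 16, the First Department of Equipment Industry of the Ministry of Industry and Information Technology (MIIT) convened a meeting to advance access management for intelligent connected vehicles, and a single ban choked off the industry's years-long habit of boasting. The meeting required that companies "must not engage in exaggerated or false advertising and must strictly fulfill their duty to inform," and that "combined driving assistance" be adopted as the official term. As 2025 draws to a close, the auto industry has had many keywords this year, one of which is "autonomous driving hits the brakes." This spring, an MIIT document made "autonomous driving" a banned term, puncturing the "L2.999" word games in automakers' marketing and forcing China's intelligent-driving industry to sober up from a three-year technology binge. Safety ultimately outweighed speed, and responsibility replaced gimmicks. The change was immediate. First came a sharp shift in automakers' marketing language. 电车通 spot-checked the official websites of mainstream domestic automakers and found that the term "autonomous driving" appears far less often and has essentially disappeared, replaced by "assisted driving," "intelligent driving assistance" ...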
Can You Match π0 and π0.5 Results Using Only an SO-100?
具身智能之心· 2025-12-11 09:33
Core Viewpoint
- The article discusses the difficulties beginners face when implementing VLA (Vision-Language-Action) models, and argues that practical experience and effective training methods are needed for successful real-world deployment [2][4].
Group 1: Challenges in VLA Implementation
- Many students report poor results with open-source models such as GR00T and PI0, despite low training loss in simulation [2][4].
- The transition from simulation to the real world (sim2real) poses significant challenges, particularly in data collection and model training [6][7].
- Beginners often struggle with the details of data collection, model training, and deployment, which leads to frustration and stalled progress [4][10].
Group 2: VLA Model Components
- Data collection for VLA relies primarily on imitation learning and reinforcement learning, with a focus on acquiring high-quality data [6].
- Training VLA models typically requires debugging in simulation and fine-tuning, especially when real-world data is limited [7].
- Deploying VLA models requires optimization techniques such as model compression to run efficiently on edge devices [9].
Group 3: Educational Initiatives
- The article introduces a practical course intended to help students learn VLA, covering hardware, data collection, algorithms, and real-world experiments [10][12].
- The course targets people looking to enter the field of embodied intelligence and provides hands-on experience and project support [22][25].
- The course begins on December 30, 2025, and includes a comprehensive curriculum for building VLA skills [23][26].
Agents Will Replace Apps and SaaS: Academician Zhang Yaqin Shares His AI Insights
Di Yi Cai Jing· 2025-12-10 05:56
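In ten years there will be more robots than people. "In ten years there will be more robots than people, and future SaaS and apps will all be replaced by agents..." On December 10, at the Meet2026 Intelligent Future Conference, Zhang Yaqin, dean of Tsinghua University's Institute for AI Industry Research and a foreign member of the Chinese Academy of Engineering, laid out several of his insights into the trends shaping the future of artificial intelligence. AI is moving from the information world into the physical and biological worlds. He describes this as a shift from large language models to VLA (vision-language-action) models, which must not only understand text and images but also act in the real world. Driverless vehicles reached an inflection point this year, and he expects that by 2030 about 10% of new cars will have driverless capability, which would be autonomous driving's "DeepSeek moment." Robotics is, in Zhang's view, "the biggest track of the future." Although humanoid robots still need time to mature, he believes the number of robots may exceed the number of humans within ten years. He also cautions that the rapid growth in AI capability comes with a sharp rise in risk. Drawing on his forward-looking view of the technical architecture, Zhang showed an evolution diagram he had drawn. In the architecture he envisioned shortly after ChatGPT appeared, foundation models serve as the platform, supporting vertical-domain models and a SaaS service layer above them, with application apps at the top. In his update this October, he stated explicitly that future SaaS services and end-user apps will all be replaced by agents: agents are the future form of software and services. These agents will cover ...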
VLA Models Generalize Better Than You Think: Inference with a New Camera and Viewpoint Is No Problem!
具身智能之心· 2025-12-04 03:10
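Authors: Weiqi Li et al. Editor: 具身智能之心. VLA models perform well on in-distribution tasks, but their performance drops sharply under new camera viewpoints and visual perturbations. The study shows that this fragility stems mainly from misalignment in spatial modeling rather than from problems in physical modeling. To address this, researchers from Sun Yat-sen University and other institutions propose a one-shot adaptation framework that recalibrates visual representations through lightweight learnable parameter updates. The first method, Feature Token Modulation (FTM), applies a global affine transformation to the visual tokens and, with only 4K parameters, raises viewpoint accuracy on the Libero dataset from 48.5% to 87.1%. Building on this, Feature Linear Adaptation (FLA) introduces low-rank updates to the ViT encoder and reaches a 90.8% success rate with 4.7M parameters, matching LoRA-scale fine-tuning at far lower cost. These results indicate that pretrained VLA models hold substantial untapped robustness, and that targeted, minimal visual adaptation is enough to restore viewpoint generalization. Generalization of VLA models ...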
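The two adaptation ideas summarized above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' code: it assumes the "global affine transformation" is a per-channel scale and shift shared by all visual tokens, and that the "low-rank update" adds a product of two thin matrices to a frozen weight. The function names, the token width of 2048 (chosen only so that the parameter count lands near 4K), and the rank of 8 are all hypothetical.

```python
import numpy as np


def init_ftm(dim):
    """Identity initialisation: scale of 1 and shift of 0 leave tokens unchanged."""
    return {"gamma": np.ones(dim), "beta": np.zeros(dim)}


def apply_ftm(tokens, params):
    """Apply the same per-channel affine map to every visual token.

    tokens: array of shape (num_tokens, dim)
    """
    return tokens * params["gamma"] + params["beta"]


def apply_low_rank_update(x, frozen_weight, a, b):
    """Linear layer with a frozen weight plus a trainable low-rank correction.

    x: (n, d_in), frozen_weight: (d_in, d_out), a: (d_in, r), b: (r, d_out)
    """
    return x @ frozen_weight + (x @ a) @ b


rng = np.random.default_rng(0)
dim, rank = 2048, 8                      # hypothetical sizes, not from the source
tokens = rng.normal(size=(16, dim))

ftm = init_ftm(dim)
ftm_param_count = ftm["gamma"].size + ftm["beta"].size
modulated = apply_ftm(tokens, ftm)

weight = rng.normal(size=(dim, dim))     # stands in for one frozen ViT projection
a = rng.normal(size=(dim, rank)) * 0.01  # trainable
b = np.zeros((rank, dim))                # trainable, zero-initialised
adapted = apply_low_rank_update(tokens, weight, a, b)

print("FTM parameters:", ftm_param_count)
print("low-rank parameters for this one layer:", a.size + b.size)
```

With identity initialisation for the affine map and a zero-initialised `b`, both adapters start out as no-ops, so adaptation begins from the pretrained model's behavior and only the small added parameters are trained.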
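2025 Commercial Embodied Intelligence White Paper: Intelligence Opens the Future of Business, Embodiment Unlocks Unlimited Possibilities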
Ai Rui Zi Xun· 2025-12-04 02:46
Investment Rating
- The report does not explicitly state an investment rating for the industry.
Core Insights
- Embodied intelligence is recognized as a significant direction in artificial intelligence and as essential for achieving artificial general intelligence; it is characterized by strong interaction with the environment and continuous learning [5][12].
- The global market for embodied intelligence is projected to reach 19.2 billion RMB by 2025, with a compound annual growth rate (CAGR) of 73% over the next five years, indicating a potential trillion-level market in about ten years [84][86].
- The development of embodied intelligence is seen as a critical battleground in the technological competition between China and the United States, with implications for economic benefits and national competitiveness [12][15].
Summary by Sections
1. Definition and Strategic Significance
- Embodied intelligence integrates machine learning, computer vision, and robotics, marking a significant step towards practical AI applications [5].
- It is defined as an intelligent system that interacts with the environment through a physical body, enabling perception, understanding, decision-making, and action [6].
2. Current Development Stages and Key Challenges
- The evolution of embodied intelligence is divided into three phases: conceptual emergence (1950-2000), technological accumulation (2000-2020), and application expansion driven by large models (2020-present) [24][26].
- Key challenges include data collection, technology maturity, the cost of core components, and societal acceptance [28][29].
3. Global Market Trends
- The market for embodied intelligence is transitioning from L2 to L3 levels of autonomy, with significant advances expected in the next 2-3 years [52].
- The commercial breakthrough will depend on improvements in reliability, economic efficiency, accuracy, endurance, and latency [55].
4. Industry Value Chain and Market Forecast
- The industry value chain is complex, spanning hardware, brain, and integration components, with significant potential for Chinese companies in downstream applications [75].
- The report highlights a surge in financing for embodied intelligence companies, indicating strong investor interest and market potential [79].
5. Competitive Landscape and Key Success Factors
- Competition between Chinese and American firms is intensifying, with each side drawing on its own strengths in technology and policy support [24][25].
- The report emphasizes the importance of collaboration across the industry to overcome existing bottlenecks and achieve large-scale commercialization [28][29].
6. Case Studies of Leading Companies
- The report does not provide specific case studies of leading companies in the industry.
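To make the growth projection in the Core Insights concrete, the compounding can be worked through directly. This is a rough sketch under two assumptions that the summary does not spell out: 2025 is treated as the base year, and the 73% rate is applied unchanged in each of the five following years. The function name `project_market` is illustrative only.

```python
def project_market(base, cagr, years):
    """Return the market size for the base year and each of the following years."""
    return [base * (1 + cagr) ** t for t in range(years + 1)]


# 19.2 billion RMB in 2025, compounding at 73% per year for five years.
sizes = project_market(19.2, 0.73, 5)
for year, size in zip(range(2025, 2031), sizes):
    print(f"{year}: {size:.1f} billion RMB")
```

Under these assumptions the market would grow about 15.5-fold, to roughly 300 billion RMB by 2030, so reaching the trillion level would require continued rapid growth in the five years after that.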
Li Auto's Self-Developed AI Inference Chip M100 to Ship in Vehicles Next Year
Sou Hu Cai Jing· 2025-11-27 01:31
Core Insights
- Li Auto reported total revenue of 27.4 billion yuan for Q3 2025, a year-on-year decline of 36.2%, and a net loss of 624.4 million yuan, compared with a net profit of 2.8 billion yuan in the same period last year [1].
Financial Performance
- Total revenue for Q3 2025 was 27.4 billion yuan, down 36.2% year-on-year [1].
- The company incurred a net loss of 624.4 million yuan, in contrast to a net profit of 2.8 billion yuan a year earlier [1].
Technological Developments
- The self-developed AI inference chip M100 is currently in large-scale system testing, with commercialization expected to start next year [3].
- When paired with the next-generation VLA autonomous driving system, the M100 is expected to deliver more than three times the cost-performance of current high-end chips [3].
- With the M100, the company aims to shift vehicles from "passive tools" to "active service providers" by 2026 [3].
Product Innovations
- The VLA model will continue to be iterated, with OTA 8.0 focusing on optimizing the safety experience and OTA 8.1 set to enhance perception capabilities [4].
- Planned features include the industry's first defensive automatic emergency braking (AEB) function and a full-scene parking function [4].
- The VLA model's capabilities have been validated on more than 312 million kilometers of real driving data [4].
Chip Development Strategy
- Li Auto is developing two types of chips in parallel: an AI inference chip for autonomous driving and a SiC power chip for motor control [4].
- The AI inference chip's architecture is similar to Tesla's Hardware 5.0, with approximately 40 billion transistors, and it is expected to enter mass production in 2026 [4].
Huawei Invests in Another Embodied-Intelligence Robotics Startup
Robot猎场备忘录· 2025-11-24 05:21
A fourth round: [极佳视界], a leading Chinese general-purpose embodied-intelligence company, has closed an A1 round on the order of 100 million yuan. Recently, [极佳视界], a top startup in the Physical AI field, announced the completion of a new A1 round on the order of 100 million yuan, jointly invested by 华为哈勃 (Huawei Hubble) and 华控基金. Notably, the company had only just completed Pre-A and Pre-A+ rounds totaling several hundred million yuan on August 28: the Pre-A round was led by 国中资本, with 紫峰资本 and existing shareholder PKSHA Algorithm Fund following, and the Pre-A+ round was invested by 中金资本, 广州产投, 一村淞灵, and 华强资本. In February this year it also completed an Angel++ round of tens of millions of yuan, jointly invested by 普超资本, 合鼎共资本, and 上海天使会 ...
2025 Commercial Embodied Intelligence White Paper
艾瑞咨询· 2025-11-20 00:04
Core Insights
- Embodied intelligence has gained significant traction globally: Figure has reached a valuation of $39 billion despite zero revenue, while domestic players are securing commercial orders and projecting substantial revenue growth [1][2][9].
- The Chinese government has integrated embodied intelligence into its strategic planning, emphasizing its importance in the industrial landscape [1][9].
- The market for embodied intelligence is projected to reach the trillion level, with China and the U.S. competing vigorously in this emerging sector [1][6].
Definition and Understanding
- Embodied intelligence is recognized as a crucial development in artificial intelligence, characterized by agents that interact with their environment through a physical body and show autonomy and adaptability [2].
- It represents a convergence of machine learning, computer vision, and robotics, marking a significant step towards practical AI applications [2].
Commercial Applications
- Different forms of embodied intelligence robots are evolving to meet diverse needs in sectors such as retail, dining, logistics, and healthcare [4].
- Commercial applications focus on enhancing service experiences and operational flexibility in dynamic environments, while industrial applications prioritize efficiency and safety in structured settings [4].
Strategic Importance
- Embodied intelligence is pivotal in narrowing the technological gap between China and the U.S. and drives innovation across industrial sectors [6].
- The competition in embodied intelligence concerns not only economic benefits but also national competitiveness and technological self-reliance [6].
Policy Support
- The Chinese government has actively promoted embodied intelligence through policies, funding initiatives, and standardization efforts [9].
- Local governments are also implementing plans and pilot projects to support the industry, establishing funds and alliances to foster collaboration [9].
Development Stages
- The evolution of embodied intelligence falls into three phases: conceptual development, technological accumulation, and application expansion driven by large models [11].
- The current phase is marked by rapid advances, with the U.S. leveraging its computational advantages while China accelerates its catch-up through policy support and industrial collaboration [11].
Bottlenecks and Challenges
- The industry faces significant challenges, including data scarcity, immature technology, and the high cost of core components and computational resources [13][16].
- The lack of high-quality manipulation data and the need for advances in dexterous manipulation and generalization are critical hurdles [13].
Data Acquisition and Solutions
- Current data acquisition methods include teleoperation, simulation, motion capture, and internet video, but high-quality data remains scarce [16].
- The industry is exploring solutions such as "world models" and data-collection training grounds to ease the data problem, with cities like Beijing and Shanghai accelerating the construction of these facilities [19].
Model Evolution
- The VLA model is emerging as the consensus direction, integrating large-language-model reasoning with real-world perception and action [21].
- This evolution is expected to produce a leap in embodied intelligence capabilities akin to the breakthroughs seen with large language models [21].
Commercialization Trends
- Commercialization is progressing in stages, with initial applications focusing on low-complexity, high-ROI scenarios [31].
- The industry is moving from hardware sales to service subscription models, indicating a shift towards more integrated business approaches [35].
Global Market Outlook
- The global market for embodied intelligence is expected to grow exponentially, with projections of approximately 19.2 billion RMB in 2025 and a compound annual growth rate of 73% over the next five years [46].
- China's market is expected to grow significantly, potentially exceeding 280 billion RMB by 2035, driven by a robust industrial ecosystem and competitive supply chains [50].
Competitive Landscape
- Competition in the sector involves three main types of players: AI-native challengers, traditional industrial players, and cross-industry giants [55].
- Product homogenization is emerging as a concern, pointing to an impending wave of industry consolidation [57].
Initial Players and Innovations
- Companies such as Tesla and Figure AI are leading the development of humanoid robots, with Figure AI's valuation reaching $39 billion [64].
- Innovations in dexterous manipulation and core component integration are critical for advancing the capabilities of humanoid robots [83][88].
UBTECH Expects 400 Million Yuan in Humanoid Robot Revenue This Year, Plans to Deliver 2,000 to 3,000 Units Next Year
Nan Fang Du Shi Bao· 2025-11-18 09:23
Core Insights
- UBTECH Robotics has delivered approximately 200 humanoid robots so far in 2025 and expects this business to contribute around 400 million yuan in revenue for the year. The company plans to deliver 500 units in 2025 and aims for 2,000 to 3,000 units in 2026 [1].
- The company has set a capacity ramp-up plan, targeting an annual production capacity of 5,000 humanoid robots by 2026 and 10,000 units by 2027 [1].
- As of November 10, 2025, the total order value for the Walker series humanoid robots has exceeded 800 million yuan, primarily from clients in the automotive industry [1].
Delivery and Production
- UBTECH released a video showing more than a hundred humanoid robots being assembled; Figure AI's founder questioned its authenticity, and UBTECH responded with a continuous-shot video to counter the claims [2][4].
- The company has run more than a year of in-factory training for its humanoid robots, focusing on tasks such as handling, sorting, and quality inspection. The reported success rate for box handling is 99%, with an average handling time of 1.5 minutes, down from 2 minutes in the first half of 2025 [4].
Technological Developments
- The VLA (Vision-Language-Action) model for intelligent operations is not yet commercially ready: its accuracy is about 70%, which does not meet customer demands. The company plans to assign a specialized model to each type of task [4].
- A new version of the Walker series (Walker S3) is set to launch in the first half of 2026, featuring NVIDIA's Thor chip, and another new model with high-performance mobility is planned for release in the same timeframe [5].