Vision-Language-Action (VLA) Models - filings, earnings calls, financial reports, news



Hu Xiu · 2025-06-28 06:50

Core Insights
- Google's Gemini Robotics On-Device model shows that robots can adapt quickly to new tasks and environments without continuous internet connectivity, a significant advance in offline AI robotics [3][5][16]
- The model uses a "vision-language-action" framework to make robots faster and more efficient at performing tasks, and it remains robust when connectivity is intermittent [3][5][19]

Group 1: Model Features and Performance
- The Gemini Robotics On-Device model was launched on June 24 and is the first of its kind to operate independently of data networks, which benefits latency-sensitive applications [3][5]
- It addresses three main challenges: dexterous manipulation, fine-tuning for new tasks, and low-latency inference through local operation [5][12]
- In demonstrations, the model completed tasks such as placing blocks and opening drawers from natural language commands, indicating strong visual, semantic, and behavioral generalization [8][10]

Group 2: Comparison with Other Technologies
- The Gemini Robotics On-Device model performs slightly below the flagship Gemini Robotics model but significantly outperforms the previous best offline models [8][10]
- Developers can fine-tune the model with as few as 50 to 100 demonstrations, which improves its adaptability to new tasks [12][14]
- The model has been tested on several robotic platforms, including the dual-arm Franka and Apptronik's Apollo humanoid robot, demonstrating that it can handle previously unseen objects and tasks [14][17]

Group 3: Industry Context and Implications
- The advances in Gemini Robotics highlight the competitive landscape in robotics and embodied intelligence, where companies are exploring diverse technical approaches to enable AI to understand and interact with the physical world [19]
- The ongoing developments suggest a potential shift in the robotics industry, with some observers calling Google's offline AI robots game-changers [16][19]
- Discussion of the technology raises questions about how it differs from competitors such as Tesla and Meta, indicating a vibrant and competitive environment in AI robotics [18][19]

Embodied Intelligence

Vision-Language-Action (VLA) Models

Artificial Intelligence

Gemini Robotics On-Device

Helix

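Two funding rounds worth hundreds of millions in three months: a leading embodied-intelligence robotics startup achieves twin breakthroughs in technology and commercialization!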

Robot猎场备忘录 · 2025-04-21 02:38

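On February 20, 2025, Figure AI, a well-known overseas humanoid-robot unicorn, released Helix, its in-house general-purpose vision-language-action (VLA) model. Helix pioneered a dual-system architecture in which S2 handles "slow thinking" (high-level semantics and goal planning) and S1 handles "fast reaction" (executing and adjusting actions in real time). It is the first VLA model to use a dual-system architecture, and it is designed for high-frequency, dexterous control of a humanoid robot's entire upper body. On February 26, 2025, Physical Intelligence (PI, or π), an embodied-intelligence foundation-model startup that was the earliest overseas proponent of vision-language-action (VLA) models and is said to have the "strongest founding team" in the global embodied-intelligence field, introduced a "hierarchical interactive robot" system (full name: Hierarchical Interactive Robot, abbreviated Hi Robot) built on the company's end-to-end foundation model π0 (pi-zero). It allows VLA models to be integrated, for example ...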