Vision-Language-Action (VLA)

VLA papers account for nearly half of embodied-AI research......
具身智能之心· 2025-09-18 04:00
Judging from this year's major robotics and AI conferences, VLA and its derivative directions account for nearly half of the embodied-AI output, especially long-horizon manipulation, generalization, few-shot learning, VLA+RL, and humanoid-related work. Imagine being able to give an instruction in natural language and have any action you want executed smoothly; completing long, continuous action sequences this way would be extremely convenient. So what exactly is VLA?

VLA breaks the single-task limitation of traditional methods, enabling robots to make autonomous decisions in diverse scenarios and adapt flexibly to unseen environments, with broad applications in manufacturing, logistics, and home services. VLA models have also become a research hotspot, driving cutting-edge projects such as pi0, RT-2, OpenVLA, QUAR-VLA, and HumanVLA, and fostering collaboration between academia and industry. Their adaptability is reflected in their applicability to many platforms, including robot arms, quadrupeds, and humanoids, offering broad potential and practical value for all kinds of intelligent robots and becoming a key driving force in the field.

From an industry perspective, embodied intelligence is booming both in China and abroad: teams such as Unitree, AgiBot (智元), Galaxea (星海图), Galbot (银河通用), and LimX Dynamics (逐际动力) are moving from the lab toward commercialization, while tech giants like Huawei, JD.com, and Tencent are actively entering the space, advancing the field alongside overseas companies such as Tesla and Figure AI. This course focuses on how agents ...
China Humanoid Robot: WAIC 2025 takeaways: Broader applications, with wheel-based robot demos more common than bipedal
2025-07-29 02:31
Summary of WAIC 2025 Takeaways

Industry Overview
- The conference showcased significant advancements in the AI and robotics industry, with a 35% increase in venue size to 70,000 sqm and a 31% increase in ticket prices to Rmb168 per day, featuring 800 exhibitors (up 60% year-over-year) and over 1,200 speakers [1][2].

Core Insights
1. **Application Scenarios**: There was a more targeted exploration of application scenarios across sectors including manufacturing, logistics, retail, and elderly care, indicating a shift toward early commercialization [2][7].
2. **Product Improvements**: Humanoid robots demonstrated meaningful product improvements, moving from static displays to interactive task demonstrations [2][8].
3. **Prototype Trends**: A noticeable shift toward AGV-style wheeled bases was observed, suggesting a pragmatic approach to near-term commercial viability, which may negatively affect stocks related to planetary roller screw components [2][9].
4. **Cost Trends**: Cost curves for humanoid robots are decreasing, but not dramatically, with the lowest ASP reported at Rmb40,000 for Unitree's new model [2][14].
5. **Manipulation Challenges**: Manipulation remains a core challenge, with issues around success rates, robustness, and reliability still prevalent [2][12].

Notable Exhibitors and Innovations
- **Noematrix**: Showcased wheel-based prototypes performing various tasks, indicating a focus on practical applications [7][18].
- **Galbot**: Demonstrated retail automation robots capable of complex tasks, achieving efficiency comparable to human workers [17][18].
- **AgiBot**: Introduced multiple humanoid robots targeting applications including logistics and customer interaction [17].
- **Unitree**: Highlighted advancements in dynamic locomotion with their humanoid robots, showcasing improved autonomous capabilities [20].

Future Outlook
- The exhibition reinforced a constructive view of humanoid robots as a long-term technology trend, with a technology inflection point expected to approach, though not yet realized [3][12].
- Upcoming updates on Tesla's Gen 3 Optimus are anticipated to be significant for the sector [3].

Investment Recommendations
- **Sanhua Intelligent Controls**: Rated Buy due to growth potential in auto/EV thermal management and HVAC systems [21].
- **Zhejiang Supcon Technology Co.**: Rated Buy, with strong market share in process automation and potential for vertical expansion [22].
- **Best Precision**: Rated Neutral, with expectations of becoming a competitive supplier for humanoid robots [23].
- **Leader Harmonious Drive Systems**: Rated Neutral, with potential growth in harmonic reduction gear applications [26].
- **Shanghai Baosight Software**: Rated Neutral, with concerns over reliance on related-party transactions [27].

Conclusion
WAIC 2025 highlighted significant advancements in humanoid robotics, with a clear trend toward practical applications and commercialization. The investment landscape appears promising for select companies within the sector, though challenges remain in manipulation and cost efficiency.
A roundup of the latest progress in RL for VLA
自动驾驶之心· 2025-07-03 12:41
Core Viewpoint
- The article discusses recent advances in Vision-Language-Action (VLA) models, focusing on the integration of reinforcement learning (RL) techniques to enhance their performance and stability across tasks [1].

Group 1: Early Exploration of iRe-VLA
- The core algorithm of iRe-VLA is PPO, with a two-stage training paradigm introduced to address instability in online reinforcement learning [2].
- The implementation uses BLIP-2 3B as the VLM backbone, replacing the final fully connected layer with an action head consisting of a token learner and an MLP (a minimal sketch of such a head appears after this summary) [2].
- The experimental setup uses simulation environments such as Meta-World and Franka Kitchen, with tasks divided into three categories for evaluation [2].

Group 2: Preference Alignment with GRAPE
- GRAPE introduces preference alignment into VLA training, designed specifically around VLA characteristics [6].
- The reward for each trajectory combines three parts: a success reward, a self-reward, and an external reward based on a custom cost function (see the reward-composition sketch after this summary) [8].
- The external reward is computed by decomposing trajectories into stages and evaluating them with a VLM task decomposer [9].

Group 3: LOOP and RIPT-VLA
- LOOP combines RLOO and PPO to address sparse rewards and long sequences in multi-task scenarios (a leave-one-out advantage sketch follows the summary) [11].
- RIPT-VLA employs the LOOP algorithm for online RL and provides open-source code for the implementation [13].
- The approach includes several tricks to improve training efficiency, such as dynamic rejection mechanisms and multi-task sampling [15].

Group 4: System and Algorithm Innovations in RL4VLA
- RL4VLA models the action-generation process as a multi-modal dialogue and trains with PPO, using dense pseudo-rewards to guide training (see the pseudo-reward sketch below) [18].
- Training involves a Robotic Process Reward Model that predicts the likelihood of action sequences, strengthening the reward signal [20].
- Adaptive curriculum selection strategies are emphasized to improve sample efficiency and generalization [21][23].

Group 5: Engineering Challenges and Future Directions
- New RL algorithms suited to VLA-RL are needed, particularly to address sparse-reward issues and improve sample efficiency [30].
- Engineering challenges remain in improving sampling efficiency and managing memory costs in VLA scenarios [30].
- Effective reward design and applying RL to non-autoregressive VLA architectures are identified as critical areas for future research [30].
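For Group 1, here is a minimal PyTorch sketch of what an action head of this kind could look like: a token learner followed by an MLP, sitting on top of a VLM backbone in place of its final fully connected layer. The class names, hidden size, number of learned tokens, and 7-DoF action dimension are illustrative assumptions, not details taken from iRe-VLA.

```python
import torch
import torch.nn as nn

class TokenLearner(nn.Module):
    """Compresses a sequence of VLM output tokens into a small, fixed
    number of learned tokens via attention-style weighting.
    (Layer choices and sizes here are illustrative assumptions.)"""
    def __init__(self, hidden_dim: int, num_output_tokens: int = 8):
        super().__init__()
        self.attn = nn.Sequential(
            nn.LayerNorm(hidden_dim),
            nn.Linear(hidden_dim, num_output_tokens),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, hidden_dim)
        weights = self.attn(tokens).softmax(dim=1)         # (B, S, K)
        # Weighted sum over the sequence axis -> (B, K, hidden_dim)
        return torch.einsum("bsk,bsd->bkd", weights, tokens)

class ActionHead(nn.Module):
    """Replaces the backbone's final FC layer: token learner + MLP that
    regresses a continuous action (e.g. a 7-DoF end-effector command)."""
    def __init__(self, hidden_dim: int = 2560, action_dim: int = 7,
                 num_tokens: int = 8):
        super().__init__()
        self.token_learner = TokenLearner(hidden_dim, num_tokens)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim * num_tokens, 512),
            nn.GELU(),
            nn.Linear(512, action_dim),
        )

    def forward(self, vlm_tokens: torch.Tensor) -> torch.Tensor:
        learned = self.token_learner(vlm_tokens)            # (B, K, D)
        return self.mlp(learned.flatten(start_dim=1))       # (B, action_dim)
```

In use, the VLM's last hidden states for the instruction-plus-observation sequence would be fed to `ActionHead`, and only this head (plus any unfrozen backbone layers) would be updated during RL fine-tuning.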
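For Group 2, a hedged sketch of how the three GRAPE-style reward terms might be combined at the trajectory level. The `Stage` container, the `stage_cost` callable, and the weights are hypothetical; the source only states that the trajectory reward sums a success reward, a self-reward, and an external reward computed from a custom cost function over VLM-decomposed stages.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class Stage:
    """One sub-segment of a trajectory, as split by a VLM task decomposer."""
    states: Sequence      # observations/states belonging to this stage
    description: str      # textual sub-goal produced by the decomposer

def trajectory_reward(
    success: bool,
    self_logprob: float,                     # policy's own log-likelihood of the trajectory
    stages: List[Stage],
    stage_cost: Callable[[Stage], float],    # hypothetical custom cost function per stage
    w_success: float = 1.0,
    w_self: float = 0.1,
    w_external: float = 0.5,
) -> float:
    """Combine the three reward terms described for GRAPE:
    success reward + self-reward + external (cost-based) reward.
    Weights and the exact functional form are assumptions."""
    r_success = w_success * (1.0 if success else 0.0)
    r_self = w_self * self_logprob
    # External reward: lower accumulated stage cost -> higher reward.
    r_external = -w_external * sum(stage_cost(s) for s in stages)
    return r_success + r_self + r_external
```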
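For Group 3, a short sketch of the leave-one-out (RLOO-style) advantage estimate that a LOOP-like method would pair with PPO's clipped update. The tensor shapes and per-task grouping are assumptions for illustration.

```python
import torch

def rloo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """Leave-one-out baseline used by RLOO: for each of K rollouts sampled
    from the same task/prompt, the baseline is the mean reward of the
    other K-1 rollouts.

    rewards: (num_tasks, K) trajectory-level rewards.
    returns: (num_tasks, K) advantages.
    """
    k = rewards.shape[-1]
    # Mean of the other K-1 rollouts: (sum - r_i) / (K - 1)
    baseline = (rewards.sum(dim=-1, keepdim=True) - rewards) / (k - 1)
    return rewards - baseline
```

These advantages would then feed the standard PPO clipped surrogate objective over the sampled action tokens, which is the sense in which LOOP combines RLOO with PPO.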
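For Group 4, a minimal sketch of blending a sparse task reward with dense per-step pseudo-rewards from a process reward model, in the spirit of what is described for RL4VLA. The additive form and the `beta` weight are assumptions; the source only says that a Robotic Process Reward Model's predicted likelihood of action sequences is used to densify the reward signal for PPO.

```python
import torch

def dense_pseudo_rewards(
    sparse_reward: torch.Tensor,   # (T,) mostly zeros, success bonus at the end
    prm_logprobs: torch.Tensor,    # (T,) per-step log-likelihood from a process reward model
    beta: float = 0.05,            # assumed weighting hyperparameter
) -> torch.Tensor:
    """Blend sparse environment rewards with per-step pseudo-rewards
    predicted by a process reward model (illustrative sketch)."""
    return sparse_reward + beta * prm_logprobs
```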