Robot Learning
ICLR 2026 | A New "Turing Test": When VLAs Enter the Biology Lab
机器之心· 2026-02-19 23:43
The recent work AutoBio, from Ping Luo's team at HKU MMLAB and Yao Mu's team at Shanghai Jiao Tong University, has been accepted to ICLR 2026 with peer-review scores of 8-8-6-6. AutoBio is a robotic simulation system and benchmark for the digital biology laboratory. Through this work, the authors attempt to systematically answer a key question: are today's mainstream Vision-Language-Action (VLA) models already capable of executing experimental protocols in a real biology laboratory?

Paper title: AutoBio: A Simulation and Benchmark for Robotic Automation in Digital Biology Laboratory
Paper link: https://openreview.net/forum?id=UUE6HEtjhu
Code: https://github.com/autobio-bench/AutoBio
https://huggingface.co/autobio-bench

1. Research background: why the biology laboratory poses a key challenge
Existing VLA research and benchmarks are mostly confined to household scenes (e.g., tidying a dining table, folding clothes), lacking coverage of professional ...
Breaking Robotics' "Data Famine" Deadlock: Jinqiu Portfolio Company 星尘智能, with Tsinghua, MIT and Others, Releases the CLAP Framework | Jinqiu Spotlight
锦秋集· 2026-01-21 15:36
Core Insights
- The article discusses the introduction of the Contrastive Latent Action Pretraining (CLAP) framework, which aims to address the data scarcity issue in robot learning by leveraging abundant human behavior videos from platforms like YouTube and Douyin [4][10].

Group 1: CLAP Framework Overview
- The CLAP framework aligns the motion space extracted from videos with the action space of robots, effectively avoiding the "visual entanglement" problem commonly faced by existing latent action models [9][11].
- It utilizes a unified Vision-Language-Action (VLA) framework that combines the precision of robot data with the semantic diversity of large-scale unannotated human video demonstrations [14].

Group 2: Training Methodology
- The research team developed two VLA modeling paradigms: CLAP-NTP, an autoregressive model excelling in instruction following and object generalization, and CLAP-RF, a Rectified Flow-based policy aimed at high-frequency, fine-grained control [10][16].
- A knowledge matching (KM) regularization strategy is introduced to mitigate catastrophic forgetting during fine-tuning, ensuring that robots retain previously learned skills while acquiring new ones [11][16].

Group 3: Experimental Results
- Extensive experiments demonstrate that CLAP significantly outperforms strong baseline methods, enabling effective skill transfer from human videos to robot execution [18].
- In real-world pick-and-place tasks, CLAP-NTP and CLAP-RF achieve success rates of 90% and 85% respectively [20].
- Robustness evaluations show that CLAP-RF maintains a mean success rate of 66.7% under environmental perturbations [21].
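The digest does not give CLAP's actual objective, but aligning a video-derived latent motion space with a robot action space is commonly done with an InfoNCE-style contrastive loss. The sketch below is a minimal, hypothetical NumPy illustration of that idea, not the paper's implementation: matched (video latent, robot action) pairs sit on the diagonal of a cosine-similarity matrix and are pulled together, while mismatched pairs are pushed apart. All names and the temperature value are illustrative.

```python
import numpy as np

def info_nce(video_latents, robot_actions, temperature=0.1):
    """InfoNCE over matched (video latent, robot action) pairs.

    Row i of each array is assumed to be a matched pair; the diagonal of
    the similarity matrix holds the positives, everything else negatives.
    """
    # L2-normalize so dot products become cosine similarities
    v = video_latents / np.linalg.norm(video_latents, axis=1, keepdims=True)
    a = robot_actions / np.linalg.norm(robot_actions, axis=1, keepdims=True)
    logits = (v @ a.T) / temperature
    # row-wise log-softmax, numerically stabilized
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # loss is the mean negative log-probability of the matched pair
    return float(-np.mean(np.diag(log_probs)))
```

A well-aligned pair of embedding sets drives this loss toward zero, while unrelated embeddings leave it near log(batch size).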
Can Your Model Really Hold Up? A Long-Tail Benchmark for Manipulation Tasks Has Arrived
具身智能之心· 2026-01-20 00:33
Core Viewpoint
- The article discusses the introduction of the GM-100 benchmark test, which aims to enhance the evaluation of robotic capabilities through a diverse set of 100 tasks designed to address the limitations of existing datasets and task designs in the field of robotics [1][4].

Group 1: Background and Motivation
- The rapid development of robotic learning has led to the emergence of various datasets and task designs, but many focus on common tasks, resulting in a lack of coverage for complex and rare tasks [3][5].
- Existing datasets, such as Open X-Embodiment and Agibot, primarily concentrate on common actions like "pick and grasp," leading to significant biases in trained models and limiting their applicability in real-world scenarios [3][5].

Group 2: GM-100 Benchmark Test
- The GM-100 benchmark consists of 100 carefully designed tasks that encompass various interaction scenarios and long-tail behaviors, aiming to provide a comprehensive assessment of robotic agents' capabilities [4][11].
- The tasks are developed based on systematic analysis and insights from human action understanding, ensuring they are executable and sufficiently challenging to differentiate the performance of various models [2][4].

Group 3: Task Design and Data Collection
- The task design process involved analyzing previous research to eliminate redundancies and categorize tasks, revealing a significant bias towards common activities [5][9].
- A diverse set of tasks was generated using large language models, with human experts involved in the final selection to ensure high-quality and feasible tasks for current hardware constraints [10][11].
- Data collection for GM-100 was conducted through teleoperation, resulting in a medium-sized dataset with over 13,000 trajectories [13][16].

Group 4: Evaluation Metrics and Results
- The evaluation of different baseline models on GM-100 tasks utilized several metrics, including Success Rate (SR), Partial Success Rate (PSR), and action prediction error, to provide a comprehensive performance assessment [22].
- The results indicated that the overall success rate was low, highlighting the inherent challenges of the tasks and the limitations of the training data [22].
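The digest names SR, PSR, and action prediction error but does not define them, and GM-100's exact formulas may differ. A minimal sketch of the usual definitions of these metrics, with hypothetical inputs:

```python
import numpy as np

def success_rate(outcomes):
    # SR: fraction of episodes where the full task succeeded (outcomes are 0/1)
    return float(np.mean(outcomes))

def partial_success_rate(completed_subtasks, total_subtasks):
    # PSR: fraction of subtasks completed, averaged over episodes
    return float(np.mean(np.asarray(completed_subtasks) /
                         np.asarray(total_subtasks)))

def action_mse(pred, target):
    # Action prediction error: mean squared error against demonstrated actions
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))
```

For example, two episodes that completed 1 of 2 and 2 of 2 subtasks give a PSR of 0.75 even if only one episode counts as a full success.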
Can Your Model Really Hold Up? SJTU Releases GM-100 with Nearly a Hundred Scenarios: A Long-Tail Benchmark for Manipulation Tasks
具身智能之心· 2026-01-19 09:30
Core Viewpoint
- The article discusses the limitations of existing robot learning datasets and task designs, emphasizing the need for a more systematic approach to evaluating and enhancing robot capabilities through the introduction of the GM-100 benchmark test [2][4].

Group 1: Background and Issues
- The rapid development of robot learning has led to numerous datasets and task designs, but many focus on common tasks, lacking coverage of complex and rare tasks [3][5].
- Existing evaluations often rely on a few common tasks, making it difficult to compare different research outcomes fairly [3][5].

Group 2: GM-100 Benchmark Test
- The GM-100 benchmark test consists of 100 carefully designed tasks that cover various interaction scenarios and long-tail behaviors, aiming to provide a diverse and challenging task set for evaluating robot capabilities [4][11].
- The tasks were developed through systematic analysis and expansion of existing designs, incorporating insights from human action understanding [4][9].

Group 3: Task Design and Data Collection
- The design of GM-100 tasks was based on human action rationality, ensuring a wide range of interaction scenarios and the inclusion of rare but important actions [9][10].
- A medium-sized dataset of over 13,000 trajectories was collected through teleoperation across two different robot platforms, ensuring diverse data for evaluation [11][13][16].

Group 4: Evaluation Metrics
- The evaluation of models on GM-100 tasks uses several metrics, including Success Rate (SR), Partial Success Rate (PSR), and action prediction error, to provide a comprehensive assessment of robot performance [22].
- The overall success rate of the benchmark is low, highlighting the inherent difficulty of the tasks and the limitations of the current training data [22].
An Inside Look from a Physical Intelligence Employee (From Data Collection to VLA to RL)
自动驾驶之心· 2025-12-25 09:33
Core Insights
- The article discusses the current state of robot learning as of December 2025, emphasizing that most systems rely on behavior cloning (BC) and the challenges associated with it [8][41].
- It highlights the importance of human demonstrations in training robot learning systems and the need for innovative solutions to improve robustness and efficiency [74].

Group 1: Behavior Cloning and Its Challenges
- Behavior cloning systems require high-quality data from human demonstrations, which are often slow to collect and expensive to scale [12][22].
- The primary issues with behavior cloning include the inability to generalize beyond the training data, leading to performance degradation in out-of-distribution (OOD) states [20][26].
- The article outlines the necessity of developing models that can recover from failure states and adapt to new situations, suggesting a DAgger-style approach to training [30][36].

Group 2: Future Directions in Robot Learning
- The article predicts that human demonstrations will remain crucial for the foreseeable future, with a call for the development of integrated hardware and software systems to streamline the training process [74].
- It anticipates that within two years, video model backbones will replace current VLA systems, and within ten years, world models will effectively simulate general open-world interaction strategies [75].
- The need for real robot rollouts is emphasized as essential for achieving superhuman performance, indicating that traditional simulation methods may not suffice [75].

Group 3: Industry Implications
- The article suggests that companies focusing on creating effective human demonstration systems will become attractive partners or acquisition targets in the robotics industry [74].
- It warns that data labeling and pre-training data sales are highly commoditized and require operational excellence to succeed [75].
- The importance of internal evaluation processes is highlighted, as they are critical for model improvement and cannot be outsourced [75].
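The DAgger-style training the article alludes to works by letting the learner act, having the expert label the states the learner actually visits, and refitting on the aggregated dataset, which directly attacks plain behavior cloning's OOD problem. The toy sketch below (scalar state, linear policy, made-up dynamics; nothing here is PI's actual system) shows the loop converging to the expert:

```python
import numpy as np

# Toy 1-D setting: the expert is a known function we can query for labels.
def expert(state):
    return -0.5 * state  # expert action drives the state toward zero

def rollout(policy_w, start, steps=20):
    """Run the learner's linear policy a = w*s and record visited states."""
    s, states = start, []
    for _ in range(steps):
        states.append(s)
        s = s + policy_w * s  # simple dynamics: s' = s + a
    return np.array(states)

def dagger(iters=5, start=1.0):
    X, Y = [], []  # aggregated dataset of (state, expert action) pairs
    w = 0.0        # untrained learner
    for _ in range(iters):
        states = rollout(w, start)
        X.extend(states)
        Y.extend(expert(s) for s in states)  # expert labels the learner's states
        # least-squares fit of a = w*s on the aggregated data
        x, y = np.array(X), np.array(Y)
        w = float(x @ y / (x @ x))
    return w
```

Because the expert labels the learner's own visited states rather than only its own demonstrations, the fitted policy recovers the expert gain of -0.5 here.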
The State of Robot Learning: An Inside Look from a PI Team Member (From Data Collection to VLA to RL)
具身智能之心· 2025-12-23 00:03
Core Insights
- The article discusses the current state of robot learning as of December 2025, emphasizing that most systems rely on behavior cloning (BC) and the challenges associated with it [5][40][39].
- It highlights the importance of human demonstrations in training robot learning systems and the need for innovative approaches to improve performance and robustness [72][73].

Group 1: Behavior Cloning and Its Challenges
- As of December 2025, robot learning systems rely primarily on behavior cloning, in which human demonstrations are used to train models to mimic actions [5][6].
- The challenges of behavior cloning include the inability to generalize beyond the training data, leading to performance issues in real-world applications [16][21][23].
- The article outlines the difficulties of collecting high-quality demonstration data and the need for diverse, representative datasets to improve model training [7][12][19].

Group 2: Future Directions and Innovations
- The article predicts that within two years, video models will replace current VLA architectures in robot learning [72].
- It suggests that world models will effectively simulate general open-world interactions within ten years, enhancing the capabilities of robot learning systems [72].
- A robust human demonstration system that can effectively address the challenges of data collection and model training is emphasized as a key area for future development [73][76].
The State of Robot Learning: An Inside Look from a Physical Intelligence Employee (From Data Collection to VLA to RL)
具身智能之心· 2025-12-20 16:03
Core Insights
- The article discusses the current state of robot learning as of December 2025, emphasizing that most systems rely on behavior cloning (BC) and the challenges associated with it [5][40][39].
- It highlights the importance of data collection from human demonstrations and the limitations of existing methods in achieving robust performance in real-world applications [6][10][12].

Group 1: Behavior Cloning and Its Challenges
- As of December 2025, robot learning systems rely primarily on behavior cloning, in which human demonstrations are used to train models to mimic actions [5].
- The data for behavior cloning comes from human demonstrations and various other sources, but the need for extensive data collection poses significant challenges [7][10].
- The limitations of behavior cloning include the inability to generalize well to out-of-distribution (OOD) states, leading to performance degradation in real-world scenarios [16][23][40].

Group 2: Data Collection Methods
- Data collection methods include using human operators with smart demo gloves and video platforms to gather diverse task execution data [11][13].
- The challenges in data collection include ensuring the data is representative of the tasks and the extensive training operators need before they can provide usable data [9][10].
- The article emphasizes the importance of high-quality data for training models and the difficulty of achieving this at scale [10][19].

Group 3: Future Directions in Robot Learning
- The article predicts that within two years, video model backbones will replace current VLA methods, and within ten years, world models will effectively simulate general open-world interactions [73].
- It suggests that traditional simulation and game engines will serve as data generators for world models, emphasizing the continued importance of expert demonstration data [73].
- The need for robust Q/V functions that can operate effectively in OOD states is highlighted as a critical area for future research [72].
HuggingFace and the University of Oxford Release a New Tutorial with an Open-Sourced SOTA Resource Library!
具身智能之心· 2025-10-27 00:02
Core Viewpoint
- The article emphasizes the significant advancements in robotics, particularly in robot learning, driven by the development of large models and multi-modal AI technologies, which have transformed traditional robotics into a more learning-based paradigm [3][4].

Group 1: Introduction to Robot Learning
- The article introduces a comprehensive tutorial on modern robot learning, covering foundational principles of reinforcement learning and imitation learning and leading to the development of general-purpose, language-conditioned models [4][12].
- HuggingFace and Oxford University researchers have created a valuable resource for newcomers to the field, providing an accessible guide to robot learning [3][4].

Group 2: Classic Robotics
- Classic robotics relies on explicit modeling through kinematics and control planning, while learning-based methods utilize deep reinforcement learning and expert demonstrations for implicit modeling [15].
- Traditional robotic systems follow a modular pipeline, including perception, state estimation, planning, and control [16].

Group 3: Learning-Based Robotics
- Learning-based robotics integrates perception and control more closely, adapts across tasks and embodiments, and reduces the need for expert modeling [26].
- The tutorial highlights the challenges of safety and efficiency in real-world applications, particularly during the initial training phases, and discusses advanced techniques like simulation training and domain randomization to mitigate risks [34][35].

Group 4: Reinforcement Learning
- Reinforcement learning allows robots to autonomously learn optimal behavior strategies through trial and error, showcasing significant potential in various scenarios [28].
- The tutorial discusses the complexity of integrating multiple system components and the limitations of traditional physics-based models, which often oversimplify real-world phenomena [30].

Group 5: Imitation Learning
- Imitation learning offers a more direct learning path for robots by replicating expert actions through behavior cloning, avoiding complex reward function designs [41].
- The tutorial addresses challenges such as compounding errors and handling multi-modal behaviors in expert demonstrations [41][42].

Group 6: Advanced Techniques in Imitation Learning
- The article introduces advanced imitation learning methods based on generative models, such as Action Chunking with Transformers (ACT) and Diffusion Policy, which effectively model multi-modal data [43][45].
- Diffusion Policy demonstrates strong performance in various tasks with minimal demonstration data, requiring only 50-150 demonstrations for training [45].

Group 7: General Robot Policies
- The tutorial envisions the development of general robot policies capable of operating across tasks and devices, inspired by large-scale open robot datasets and powerful vision-language models [52][53].
- Two cutting-edge vision-language-action (VLA) models, π₀ and SmolVLA, are highlighted for their ability to understand visual and language instructions and generate precise control commands [53][56].

Group 8: Model Efficiency
- SmolVLA represents a trend towards model miniaturization and open-sourcing, achieving high performance with significantly reduced parameter counts and memory consumption compared to π₀ [56][58].
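ACT, mentioned above, predicts a chunk of future actions at every control step and blends the overlapping predictions at execution time ("temporal ensembling"). The sketch below illustrates that blending for scalar actions, in the spirit of ACT's exponential weighting; the chunk contents, horizon, and decay constant here are hypothetical:

```python
import numpy as np

def temporal_ensemble(chunks, horizon, m=0.1):
    """Combine overlapping action chunks into one action per timestep.

    chunks[t] holds the H actions predicted at time t for steps t..t+H-1.
    Overlapping predictions for the same step are averaged with exponential
    weights exp(-m * age), favoring older predictions when m > 0.
    """
    H = chunks.shape[1]
    actions = np.zeros(horizon)
    for t in range(horizon):
        preds, weights = [], []
        # every chunk started within the last H steps predicts an action for t
        for start in range(max(0, t - H + 1), t + 1):
            if start < len(chunks):
                age = t - start
                preds.append(chunks[start, age])
                weights.append(np.exp(-m * age))
        w = np.array(weights) / np.sum(weights)
        actions[t] = np.dot(w, preds)
    return actions
```

Averaging overlapping chunks smooths the executed trajectory and reduces the impact of any single bad prediction, at the cost of slightly delayed reactions.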
A Hands-On Introduction to Robot Learning: HuggingFace and the University of Oxford Open-Source a SOTA Resource Library with a New Tutorial
机器之心· 2025-10-26 07:00
Core Viewpoint
- The article emphasizes the significant advancements in the field of robotics, particularly in robot learning, driven by the development of artificial intelligence technologies such as large models and multi-modal models. This shift has transformed traditional robotics into a learning-based paradigm, opening new potentials for autonomous decision-making robots [2].

Group 1: Introduction to Robot Learning
- The article highlights the evolution of robotics from explicit modeling to implicit modeling, marking a fundamental change in motion generation methods. Traditional robotics relied on explicit modeling, while learning-based methods utilize deep reinforcement learning and expert demonstration learning for implicit modeling [15].
- A comprehensive tutorial provided by HuggingFace and researchers from Oxford University serves as a valuable resource for newcomers to modern robot learning, covering foundational principles of reinforcement learning and imitation learning [3][4].

Group 2: Learning-Based Robotics
- Learning-based robotics simplifies the process from perception to action by training a unified high-level controller that can directly handle high-dimensional, unstructured perception-motion information without relying on a dynamics model [33].
- The tutorial addresses challenges in real-world applications, such as safety and efficiency issues during initial training phases, and high trial-and-error costs in physical environments. It introduces advanced techniques like simulator training and domain randomization to mitigate these risks [34][35].

Group 3: Reinforcement Learning
- Reinforcement learning allows robots to autonomously learn optimal behavior strategies through trial and error, showcasing significant potential across various scenarios [28].
- The tutorial discusses the "Offline-to-Online" reinforcement learning framework, which enhances sample efficiency and safety by utilizing pre-collected expert data. The HIL-SERL method exemplifies this approach, enabling robots to master complex real-world tasks with near 100% success rates in just 1-2 hours of training [36][39].

Group 4: Imitation Learning
- Imitation learning offers a more direct learning path for robots by replicating expert actions through behavior cloning, avoiding complex reward function designs and ensuring training safety [41].
- The tutorial presents advanced imitation learning methods based on generative models, such as Action Chunking with Transformers (ACT) and Diffusion Policy, which effectively model multi-modal data by learning the latent distribution of expert behaviors [42][43].

Group 5: Universal Robot Policies
- The article envisions the future of robotics in developing universal robot policies capable of operating across tasks and devices, inspired by the emergence of large-scale open robot datasets and powerful vision-language models (VLMs) [52].
- Two cutting-edge VLA models, π₀ and SmolVLA, are highlighted for their ability to understand visual and language instructions and generate precise robot control commands, with SmolVLA being a compact, open-source model that significantly reduces application barriers [53][56].
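The offline-to-online recipe exemplified by HIL-SERL (warm-start from pre-collected expert data, then refine online) can be illustrated with tabular Q-learning on a toy chain MDP. Everything below is a hypothetical stand-in for the actual method, which uses human-in-the-loop corrections and deep RL on a real robot; only the two-phase structure is being illustrated:

```python
import numpy as np

# Toy chain MDP: states 0..4, actions 0 (left) / 1 (right), reward 1 at state 4.
N_STATES, GOAL, GAMMA, ALPHA = 5, 4, 0.9, 0.5

def step(s, a):
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == GOAL), s2 == GOAL

def td_update(Q, s, a, r, s2, done):
    target = r + GAMMA * np.max(Q[s2]) * (not done)
    Q[s, a] += ALPHA * (target - Q[s, a])

def offline_warm_start(Q, passes=50):
    # "Offline" phase: replay expert transitions (always moving right)
    # repeatedly so values propagate back from the goal.
    for _ in range(passes):
        for s in range(GOAL):
            s2, r, done = step(s, 1)
            td_update(Q, s, 1, r, s2, done)

def online_phase(Q, episodes=10, eps=0.1, max_steps=100, seed=0):
    # "Online" phase: short epsilon-greedy rollouts refine the warm-started table.
    rng = np.random.default_rng(seed)
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r, done = step(s, a)
            td_update(Q, s, a, r, s2, done)
            s = s2
            if done:
                break

Q = np.zeros((N_STATES, 2))
offline_warm_start(Q)
online_phase(Q)
```

The warm start makes the very first online rollouts near-optimal, which is the sample-efficiency and safety argument behind offline-to-online RL.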
Better Performance with No Extra Training: HKU Team Proposes the GPC Framework for Robot "Policy Composition"
机器之心· 2025-10-19 09:17
Core Viewpoint
- The article introduces the General Policy Composition (GPC) framework, which provides a novel, training-free solution to enhance the performance of robot control policies by dynamically combining multiple pre-trained models at test time, thus overcoming the limitations of traditional training methods [2][5][7].

Summary by Sections

Improving Policy Performance
- GPC presents a paradigm shift in enhancing policy performance: rather than relying on additional training, it combines existing policies [6][15].

Innovative Theoretical Foundation
- The framework is built on two key theoretical findings: (1) functional-level improvement, which shows that a convex combination of the decision scores of multiple pre-trained policies can yield a more accurate combined score than any single policy [9]; and (2) system-level stability, which ensures that improvements in single-step errors propagate throughout the entire trajectory, leading to overall performance gains [10].

General "Policy Composer"
- GPC's core advantage lies in its plug-and-play nature, allowing the seamless integration of diverse robot policies without the need for retraining [14][15].

Heterogeneous Policy Flexibility
- GPC can flexibly combine policies across different architectures and modalities, effectively balancing information from various conditions to produce stable and coherent action trajectories [17][19].

Weight Search for the Optimal Composition
- The weight search mechanism in GPC customizes optimal weight configurations for different tasks, underscoring the importance of weight allocation in maximizing the effectiveness of the composed policy [22][23].

Experimental Validation
- GPC has demonstrated superior performance in both simulation and real-world environments, achieving success-rate improvements over single baselines of up to 7.55% in simulation tasks and 5-10% in real-world tasks [28][30].

Key Findings from Experiments
- Three core findings highlight GPC's versatility: (1) GPC can achieve higher accuracy when combining policies of moderate accuracy [29]; (2) the presence of a weak policy can hinder overall performance, indicating the need for careful selection of contributing policies [29]; (3) performance is maximized when stronger policies are given greater weight in the combination [29].
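The convex combination at GPC's heart, and its weight search, can be illustrated abstractly. The sketch below composes two toy "decision scores" and grid-searches a scalar weight against a validation metric; the real GPC composes the scores of diffusion- or flow-based policies at each denoising step, so this stand-in mirrors only the arithmetic, not the models:

```python
import numpy as np

def compose_scores(scores, weights):
    """Convex combination of per-policy scores: sum_i w_i * s_i, w on the simplex."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return np.tensordot(w, np.asarray(scores), axes=1)

def grid_search_weights(scores, metric, n=11):
    """Grid-search convex weights for two policies against a validation metric."""
    best_w, best_val = None, -np.inf
    for a in np.linspace(0.0, 1.0, n):
        val = metric(compose_scores(scores, [a, 1.0 - a]))
        if val > best_val:
            best_w, best_val = (a, 1.0 - a), val
    return best_w, best_val

# Toy check: two noisy estimates of a true score; the metric rewards closeness
# to the truth, so the searched combination is at least as good as either input.
rng = np.random.default_rng(0)
true_score = rng.normal(size=32)
s1 = true_score + rng.normal(scale=0.5, size=32)
s2 = true_score + rng.normal(scale=0.5, size=32)
metric = lambda s: -np.mean((s - true_score) ** 2)
w, val = grid_search_weights([s1, s2], metric)
```

Because the endpoint weights (1, 0) and (0, 1) recover the individual policies, the searched composition can never score worse than the best single policy on the search metric, matching the paper's "composition at least matches the components" intuition.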