Sim-to-Real
OpenAI Returns to Robotics: Aiming to Bring Large Models into the Physical World
36Kr· 2025-09-17 11:12
In addition, the Simulation Environments Engineer, Robotics posting explicitly names teleoperation, hardware-in-the-loop (HIL), and simulation ecosystems such as Nvidia Isaac, stressing the application of large-scale reinforcement learning and GPU pipeline optimization to robot task scenarios. This matches WIRED's description of the technical path, and the two accounts corroborate each other. As early as November 2024, Caitlin Kalinowski, former head of Meta's AR glasses hardware, joined OpenAI to lead robotics and consumer hardware. Multiple outlets read the move as a strong signal of OpenAI's return to the robotics race, and as an indication that its robotics strategy is not "algorithms only."

After a pause of several years, OpenAI is redirecting research and hiring resources toward embodied intelligence and pushing its focus further toward humanoid systems. Authoritative reports, public job postings, and industry moves cross-corroborate one conclusion: the company best known for large models is building a robotics R&D matrix aimed at the real world. According to WIRED's September 15 report, OpenAI has recently been intensively recruiting researchers with backgrounds in humanoid robotics and physical control algorithms, and its training path emphasizes teleoperation and simulation (including tools such as Nvidia Isaac); whether the company will build its own hardware or partner with external manufacturers remains unclear, but "…
Teaching Robots Hand in Hand: Stanford Proposes the RTR Framework, Using a Robotic Arm to Help Humanoid Robots Train on Real Hardware
机器之心· 2025-08-27 00:46
Core Viewpoint
- The application of reinforcement learning (RL) algorithms to humanoid robot motion control is emerging as a key research area, with a focus on the "Sim-to-Real" paradigm, which trains general control models in diverse simulated environments so they can adapt to the real world [2][3].

Group 1: Current Challenges and Innovations
- Existing methods primarily rely on domain randomization to train models in simulation, achieving impressive results across tasks but often sacrificing performance in specific real-world environments [2][3].
- Recent efforts have begun to explore fine-tuning models with limited real-world data after simulation pre-training, with notable contributions from institutions such as NVIDIA and CMU [3].
- Conducting RL training directly in real environments has remained a significant barrier: humanoid robots are unstable, and minor errors can cause hardware damage [3].

Group 2: Proposed Solution - RTR System
- The RTR (Robot-Trains-Robot) system introduces a novel approach in which a "teacher" robotic arm guides a "student" humanoid robot through online reinforcement learning, inspired by how human parents teach infants to walk [4][6].
- The teacher arm plays multiple roles: it provides safety support, helps reset the student after failures, collects valuable training data, and sets a curriculum that improves learning efficiency [5][6].

Group 3: Hardware and Algorithm Design
- The RTR hardware setup pairs a teacher and a student robot: the teacher is a UR5 robotic arm equipped with force-torque sensors, and the student is based on the open-source ToddlerBot [8][9].
- The algorithm follows a three-stage Sim-to-Real process: training adaptable strategies in simulation, optimizing a general initial latent variable, and performing online fine-tuning in the real world with minimal data [9][11].
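The three-stage process summarized in Group 3 can be sketched in toy form. The code below is an illustrative sketch, not the paper's implementation: the environment parameters, reward function, latent dimensionality, and the random-search fine-tuner are all invented stand-ins for the real simulator, RL training loop, and optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_randomized_env():
    """Stage 1 setup: sample physics parameters (domain randomization).
    The parameter ranges here are illustrative assumptions."""
    return {
        "mass_scale": rng.uniform(0.8, 1.2),
        "friction": rng.uniform(0.5, 1.5),
        "motor_gain": rng.uniform(0.9, 1.1),
    }

def rollout_return(policy_weights, latent, env_params):
    """Toy stand-in for a simulated rollout that returns a scalar reward.
    A real system would run a physics simulator (e.g. Isaac) here."""
    # Reward is highest when the latent "matches" the environment.
    target = np.array([env_params["mass_scale"],
                       env_params["friction"],
                       env_params["motor_gain"]])
    return -np.sum((latent - target) ** 2) - 0.01 * np.sum(policy_weights ** 2)

# Stage 1: train a latent-conditioned policy across many randomized envs
# (fixed toy weights here; real training would use large-scale RL).
policy_weights = np.zeros(4)

# Stage 2: optimize one general initial latent z0 that performs well on
# average over the randomized training distribution.
candidates = rng.uniform(0.5, 1.5, size=(256, 3))
envs = [sample_randomized_env() for _ in range(32)]
avg_returns = [np.mean([rollout_return(policy_weights, z, e) for e in envs])
               for z in candidates]
z0 = candidates[int(np.argmax(avg_returns))]

# Stage 3: online fine-tuning in the "real" environment -- adapt only the
# low-dimensional latent with a handful of rollouts (hill-climbing here).
real_env = {"mass_scale": 1.15, "friction": 0.7, "motor_gain": 1.05}
z = z0.copy()
for _ in range(50):
    z_new = z + rng.normal(0.0, 0.05, size=3)
    if rollout_return(policy_weights, z_new, real_env) > \
       rollout_return(policy_weights, z, real_env):
        z = z_new  # keep only improving latents

print("initial latent:   ", np.round(z0, 2))
print("fine-tuned latent:", np.round(z, 2))
```

Adapting only a low-dimensional latent, rather than all policy weights, is what makes fine-tuning feasible with minutes rather than hours of real-world data.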
Group 4: Experimental Validation
- Experiments on tasks such as walking and swinging demonstrated the effectiveness of the RTR system, showing that the teacher's flexible assistance significantly improves learning outcomes compared with fixed supports [15][19].
- The proposed latent-variable fine-tuning method outperformed traditional methods in data efficiency and final performance, achieving a twofold speed increase in walking strategies with just 20 minutes of real-world training [15][18].

Group 5: Future Prospects
- The RTR framework not only addresses current obstacles to deploying humanoid robots but also introduces a new paradigm of physically assisted real-world learning, with potential applications to larger humanoid robots and other complex robotic systems [17].
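The fixed-versus-flexible support contrast in Group 4 can be illustrated with a toy support schedule. This is a hypothetical sketch of the curriculum idea, not the RTR controller: the force values, the linear back-off rule, and the `support_force` function are assumptions for illustration.

```python
def support_force(performance, mode="flexible", max_force=30.0):
    """Toy teacher-arm support schedule (gains and units are illustrative
    assumptions, not values from the RTR paper).

    performance: 0.0 (student always falls) .. 1.0 (student walks alone).
    Returns the upward support force in newtons.
    """
    if mode == "fixed":
        # A fixed support (e.g. a rigid gantry) always carries the student,
        # so the student never learns to bear its own weight.
        return max_force
    # Flexible support backs off as performance improves, acting as a
    # curriculum: maximum help early in training, none once the student
    # can stand on its own.
    return max_force * max(0.0, 1.0 - performance)

# Demo: support fades as the student improves over training.
for perf in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"performance={perf:.2f}  fixed={support_force(perf, 'fixed'):.0f}N  "
          f"flexible={support_force(perf):.0f}N")
```

The fixed schedule never lets the student carry its own weight, which mirrors the experimental finding that flexible assistance yields better learning outcomes than fixed supports.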