Sim-to-Real
OpenAI Returns to Robotics: Pushing Large Models into the Physical World
36Kr · 2025-09-17 11:12
Core Insights
- OpenAI is refocusing its research and recruitment efforts on "embodied intelligence," particularly humanoid systems, after a pause of several years [1][4]
- The company is building a robotics research matrix aimed at real-world applications, indicating a shift from purely algorithmic development to hardware integration [1][4]

Recruitment and Talent Acquisition
- OpenAI has been actively recruiting talent with backgrounds in humanoid robotics and physical control algorithms, emphasizing teleoperation and simulation tools such as Nvidia Isaac [3][8]
- Job postings highlight the need for experience in designing mechanical systems for high-volume production, suggesting a focus on scalable robotics solutions [3][8]

Strategic Direction
- The appointment of Caitlin Kalinowski, former head of AR hardware at Meta, to lead robotics and consumer hardware initiatives signals a strong commitment to the robotics sector [4]
- OpenAI's previous achievements in robotics, such as the Dactyl robotic hand, demonstrate its capability in sim-to-real applications, which the company is now revisiting [6]

Technical Capabilities
- OpenAI aims to extend its general models' understanding and reasoning into a complete perception-and-control loop, requiring capabilities in data collection, model optimization, and hardware design [8]
- The company is focusing on large-scale reinforcement learning and real-time inference to improve the stability and timing of perception-control systems [8]

Market Context
- The humanoid robotics sector is competitive, with investments exceeding $5 billion since 2024 and a projected trillion-dollar market by 2050 [9]
- OpenAI's recent adjustments in computing power, funding, and governance, including a new non-binding memorandum with Microsoft, may influence its robotics development pace and external collaborations [9]
Teaching Robots Hand-in-Hand: Stanford Proposes the RTR Framework, Using a Robotic Arm to Assist Real-Robot Training of Humanoids
机器之心 (Synced) · 2025-08-27 00:46
Core Viewpoint
- The application of reinforcement learning (RL) algorithms to humanoid robot motion control is emerging as a key research area, with a focus on the "Sim-to-Real" paradigm, which trains general control models in diverse simulated environments so they can adapt to the real world [2][3]

Group 1: Current Challenges and Innovations
- Existing methods primarily rely on domain randomization to train models in simulation, achieving impressive results across a variety of tasks but often sacrificing performance in specific real-world environments [2][3]
- Recent efforts have begun to explore fine-tuning models with limited real-world data after simulation pre-training, with notable contributions from institutions such as NVIDIA and CMU [3]
- Conducting RL training directly in real environments has long been a significant barrier: humanoid robots are unstable, and minor errors can cause hardware damage [3]

Group 2: Proposed Solution - The RTR System
- The RTR (Robot-Trains-Robot) system introduces a novel approach in which a "teacher" robotic arm guides a "student" humanoid robot through online reinforcement learning, inspired by how human parents teach infants to walk [4][6]
- The teacher arm plays multiple roles: it provides safety support, resets the student after failures, collects valuable training data, and sets a curriculum to improve learning efficiency [5][6]

Group 3: Hardware and Algorithm Design
- The hardware setup pairs a teacher and a student robot: the teacher is a UR5 robotic arm equipped with force-torque sensors, and the student is based on the open-source ToddlerBot [8][9]
- The algorithm follows a three-stage Sim-to-Real process: training adaptable policies in simulation, optimizing a general initial latent variable, and performing online fine-tuning in the real world with minimal data [9][11]
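The three-stage recipe above can be sketched in miniature. The toy code below is a hypothetical illustration, not the RTR implementation: a "policy" is conditioned on a latent variable, stage 1 samples randomized dynamics (standing in for domain randomization), stage 2 picks one initial latent that works well on average across them, and stage 3 fine-tunes only that latent against the unseen "real" dynamics, which is why so little real-world data is needed. The environment, policy, and update rule are all invented for the sketch.

```python
# Toy sketch of a latent-conditioned Sim-to-Real pipeline (hypothetical,
# not the RTR system): train broadly, pick a good initial latent, then
# fine-tune only the low-dimensional latent in the "real world".
import numpy as np

rng = np.random.default_rng(0)

def make_env(dynamics):
    """Toy 1-D environment: reward is highest when the action matches
    the environment's (hidden) dynamics parameter."""
    def reward(action):
        return -float((action - dynamics) ** 2)
    return reward

def policy(z):
    """Trivial latent-conditioned policy: the latent directly sets the action."""
    return z

# Stage 1: domain randomization in simulation -- sample many dynamics variants.
sim_dynamics = rng.uniform(-1.0, 1.0, size=32)

# Stage 2: optimize a single general initial latent z0 that performs
# well on average across all simulated variants.
candidates = np.linspace(-1.0, 1.0, 201)
avg_reward = [np.mean([make_env(d)(policy(z)) for d in sim_dynamics])
              for z in candidates]
z0 = float(candidates[int(np.argmax(avg_reward))])

# Stage 3: online fine-tuning of ONLY the latent against the real
# dynamics, via a finite-difference estimate of the reward gradient.
real_env = make_env(0.7)   # "real world" dynamics, unknown to the policy
z, eps, lr = z0, 1e-3, 0.5
for _ in range(100):
    grad = (real_env(policy(z + eps)) - real_env(policy(z - eps))) / (2 * eps)
    z += lr * grad
```

Because only the scalar latent is updated in stage 3 (the policy itself stays frozen), the real-world search space is tiny, mirroring the paper's claim that minutes of real data suffice for fine-tuning.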
Group 4: Experimental Validation
- Experiments demonstrated the effectiveness of the RTR system on tasks such as walking and swinging, showing that the teacher's compliant assistance significantly improves learning outcomes compared to fixed supports [15][19]
- The proposed latent-variable fine-tuning method outperformed traditional methods in data efficiency and final performance, doubling the walking policy's speed with just 20 minutes of real-world training [15][18]

Group 5: Future Prospects
- The RTR framework not only addresses current challenges in deploying humanoid robots but also introduces a new paradigm of physically assisted real-world learning, with potential applications to larger humanoid robots and other complex robotic systems [17]