Core Viewpoint
- Google DeepMind has launched the Gemini Robotics 1.5 series, marking a significant milestone in the development of general AI for real-world applications, featuring embodied reasoning capabilities that allow robots to "think before acting" [1][9].

Group 1: Model Composition
- The Gemini Robotics 1.5 series consists of two main models: GR 1.5 for action execution and GR-ER 1.5 for embodied reasoning [2][8].
- GR-ER 1.5 is the world's first embodied model with simulated reasoning capabilities [3].

Group 2: Functional Capabilities
- The combination of GR-ER 1.5 and GR 1.5 enables robots to perform complex multi-step tasks, such as sorting clothes by color or packing luggage based on weather conditions [5][6].
- GR 1.5 can adapt to various robot hardware, allowing a single model to operate across different platforms without the need for separate training [16][18].

Group 3: Motion Transfer Mechanism
- The innovative "Motion Transfer" mechanism allows skills learned on one robot to be transferred to another, enhancing cross-platform functionality [21][48].
- This mechanism abstracts different robot actions into a unified semantic space, enabling seamless skill sharing across diverse hardware [56].

Group 4: Safety and Explainability
- The GR 1.5 series enhances safety by allowing robots to self-correct during tasks and recognize potential risks, ensuring safe operation in human environments [34][36].
- The embodied reasoning model provides transparency in the robot's decision-making process, improving interpretability and trust [55][58].

Group 5: Performance Metrics
- In benchmark tests, GR 1.5 outperformed previous models in various dimensions, including instruction generalization and task completion rates, achieving nearly 80% in long-sequence tasks [61][62].
- The model demonstrated unprecedented zero-shot transfer capabilities in cross-robot migration tests [63].

Group 6: Future Developments
- The GR 1.5 series represents a shift from executing single commands to genuinely understanding and solving physical tasks [69].
- Currently, developers can access GR-ER 1.5 through Google AI Studio, while GR 1.5 is available to select partners [71].
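The first embodied reasoning model, built by Google DeepMind: it autonomously understands, plans, and executes complex tasks, ends the "one robot, one training run" constraint, and supports zero-shot skill transfer between robots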
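The division of labor summarized in Groups 1, 2, and 4 (a reasoning model that plans, an action model that executes, and self-correction during a task) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `ToyReasoner`, `ToyExecutor`, `run_task`, and the fixed plan table are hypothetical names and do not reflect how GR-ER 1.5 or GR 1.5 are actually implemented or exposed. The retry loop is only a rough analogue of the self-correction behavior the summary describes.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class StepResult:
    step: str
    success: bool


class ToyReasoner:
    """Hypothetical stand-in for the embodied reasoning role (GR-ER 1.5)."""

    def plan(self, instruction: str) -> List[str]:
        # A real reasoner would inspect the scene; this toy uses a fixed lookup.
        plans = {
            "sort clothes by color": [
                "identify color of each item",
                "place dark items in the dark bin",
                "place light items in the light bin",
            ],
        }
        # Unknown instructions are treated as a single step.
        return plans.get(instruction, [instruction])


class ToyExecutor:
    """Hypothetical stand-in for the action execution role (GR 1.5)."""

    def __init__(self, fails_once: Optional[Set[str]] = None):
        # Steps listed here fail on the first attempt and succeed afterwards.
        self._pending_failures = set(fails_once or ())

    def execute(self, step: str) -> bool:
        if step in self._pending_failures:
            self._pending_failures.discard(step)
            return False
        return True


def run_task(instruction: str, reasoner: ToyReasoner, executor: ToyExecutor,
             max_retries: int = 1) -> List[StepResult]:
    """Plan first, then execute each step, retrying a failed step."""
    log: List[StepResult] = []
    for step in reasoner.plan(instruction):
        success = False
        for _ in range(max_retries + 1):
            success = executor.execute(step)
            log.append(StepResult(step, success))
            if success:
                break
        if not success:
            break  # give up on the task if a step keeps failing
    return log


if __name__ == "__main__":
    executor = ToyExecutor(fails_once={"place dark items in the dark bin"})
    for result in run_task("sort clothes by color", ToyReasoner(), executor):
        print(result)
```

The returned log also hints at the interpretability point in Group 4: because the plan is an explicit list of steps, each attempt and its outcome can be inspected after the fact.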