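Beating NVIDIA with four global first places! Thinker, UBTECH's self-developed "strongest brain" for humanoid robots, tops the global leaderboards!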
Robot Circle (机器人圈) · 2025-09-10 09:07

Core Viewpoint
- UBTECH's humanoid robot Walker has advanced significantly with the company's self-developed multimodal large model, Thinker. The model took top results on three major international benchmarks, demonstrating leading capabilities in complex environment perception, semantic understanding, and long-term task planning [2][4].

Group 1: Benchmark Achievements
- Thinker ranked first in four global leaderboard categories across three authoritative benchmarks (MS COCO Detection Challenge, RoboVQA, and EgoPlan-Bench2), ahead of top teams from NVIDIA, the Beijing Academy of Artificial Intelligence, and Shanghai AI Lab [2][4].
- The MS COCO Detection Challenge is a key evaluation standard in computer vision, while RoboVQA and EgoPlan-Bench2 focus on reasoning and task planning from a robot's point of view [4][5].

Group 2: Technical Innovations
- The Thinker architecture combines several key technical innovations that strengthen the humanoid robot's perception and reasoning, laying the groundwork for large-scale use in industrial settings [6].
- A self-developed visual encoder based on ViT, paired with a Co-DETR detection head, improves environmental perception and significantly enhances the robot's ability to recognize objects and obstacles in complex environments [7].
- Thinker's large-scale architecture, with billions of parameters, supports robust semantic understanding, allowing the robot to capture environmental details accurately and comprehend task instructions [7].
- Temporal enhancement algorithms and reinforcement learning methods improve long-term task planning, enabling the robot to decompose complex processes autonomously in real time [7].

Group 3: Industrial Application Strategies
- The strategy of "building general foundational capabilities + fine-tuning for industrial scenarios" is central to moving multimodal large models toward practical use, supporting stable and efficient deployment of humanoid robots on production lines [9][11].
- The model was trained on more than 2 million video samples and fine-tuned on a large industrial dataset, significantly improving the robot's understanding accuracy and decision reliability in industrial environments [11][12].

Group 4: Future Development and Collaboration
- UBTECH aims to build an open, collaborative application ecosystem for humanoid robots by gradually open-sourcing valuable industrial datasets and foundational large models, so that developers can work more efficiently when bringing robots into new scenarios [14].