Teaching Robots to Work Autonomously: Beijing Humanoid Open-Sources XR-1, an Embodied "Cerebellum" Model

Core Insights
- Beijing Humanoid Robot Innovation Center has officially open-sourced the XR-1 model, the first and so far only embodied VLA (Vision-Language-Action) large model in China to pass the national standard test for embodied intelligence [1]
- XR-1 targets a core pain point of the embodied-intelligence industry, the disconnect between visual perception and action execution, enabling robots to adapt to environmental changes [1]
- The newly released RoboMIND 2.0 dataset contains over 300,000 robot operation trajectories and expands coverage to 11 application scenarios, significantly improving robot training and task success rates [3]

Group 1
- XR-1 is built around a "unity of knowledge and action" capability, with cross-data-source learning, cross-modal alignment, and cross-embodiment control as its technical core [1]
- Its UVMC (Unified Vision-Motion Codes) technology lets robots respond to visual stimuli almost instinctively, akin to human reflexes (a toy sketch of this idea follows the lists below) [1]
- The RoboMIND dataset has been downloaded over 150,000 times, indicating strong interest and utility in the robotics community [3]

Group 2
- Robots equipped with XR-1 have completed complex material-handling tasks and operated fully autonomously in a test that required passing through five doors [3]
- Beijing Humanoid has partnered with organizations including Li Ning and Bayer to deploy humanoid robots across industries such as manufacturing and safety inspection [5]
- The ArtVIP dataset provides over 1,000 high-fidelity digital-twin articulated objects, improving robots' interaction capabilities across scenarios [3]
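
The cross-modal alignment idea behind UVMC, mapping what the robot sees and how it moves into one shared representation so that perception can trigger action directly, can be illustrated with a toy sketch. Everything below (class and function names, dimensions, the VQ-style discrete codebook) is a hypothetical illustration of the general technique under stated assumptions, not XR-1's actual architecture, which the source does not detail.

```python
# Hypothetical sketch of a shared vision-action code space (UVMC-style idea).
# All names and dimensions are illustrative assumptions, not XR-1's implementation.
import numpy as np

rng = np.random.default_rng(0)

class SharedCodebook:
    """Quantizes both vision and action embeddings into one discrete code space,
    so an observed scene and the motion suited to it can map to nearby codes."""
    def __init__(self, num_codes=512, dim=64):
        self.codes = rng.normal(size=(num_codes, dim))

    def quantize(self, embedding):
        # Nearest-neighbor lookup: index of the closest code vector.
        dists = np.linalg.norm(self.codes - embedding, axis=1)
        return int(np.argmin(dists))

def encode_vision(image):
    # Stand-in for a vision encoder: flatten and normalize to the shared dim.
    flat = image.reshape(-1)[:64]
    return flat / (np.linalg.norm(flat) + 1e-8)

def encode_action(joint_deltas):
    # Stand-in for an action encoder: zero-pad to the shared dim and normalize.
    padded = np.zeros(64)
    padded[: len(joint_deltas)] = joint_deltas
    return padded / (np.linalg.norm(padded) + 1e-8)

codebook = SharedCodebook()
image = rng.normal(size=(8, 8))   # fake camera frame
action = rng.normal(size=(7,))    # fake 7-DoF arm command

# Cross-modal alignment training would pull these two indices together
# for matching (scene, action) pairs; here we only show the lookup step.
print("vision code:", codebook.quantize(encode_vision(image)))
print("action code:", codebook.quantize(encode_action(action)))
```

In a trained system of this kind, matching scene-action pairs would land on the same or nearby codes, making the inference-time lookup cheap; that is one plausible reading of the reflex-like responsiveness the article attributes to UVMC.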