Damo Academy Open-Sources Three Key Embodied Intelligence Components; Robotics Context Protocol Open-Sourced for the First Time
Huan Qiu Wang·2025-08-11 04:17

Core Insights
- Alibaba's Damo Academy announced the open-source release of several models and protocols aimed at enhancing the compatibility and adaptability of data, models, and robots in the field of embodied intelligence [1][3]
- The introduction of the Robotics Context Protocol (RynnRCP) aims to address challenges such as fragmented development processes and difficulties in adapting data and models to robotic systems [1][2]

Group 1: Open-source Models and Protocols
- The RynnVLA-001 model is a vision-language-action model that learns human operational skills from first-person perspective videos, enabling smoother robotic arm control [3]
- The RynnEC model integrates multimodal large-language capabilities, allowing for comprehensive scene analysis across 11 dimensions and enhancing object localization and segmentation in complex environments [3]
- RynnRCP serves as a complete robot service protocol and framework, covering the workflow from sensor data collection through model inference to robotic action execution [1][2]

Group 2: Technical Framework and Features
- The RCP framework within RynnRCP establishes connections between robot bodies and sensors, providing standardized capability interfaces and compatibility across different transport layers and model services [2]
- The RobotMotion module acts as a bridge between large models and robot control, converting low-frequency inference commands into high-frequency continuous control signals for smoother robotic movements [2]
- The integrated simulation-to-real control tool within RobotMotion helps developers adapt to tasks quickly, supporting simulation synchronization, data collection, playback, and trajectory visualization [2]

Group 3: Industry Engagement and Development
- Damo Academy is actively investing in embodied intelligence, focusing on system and model development while collaborating with various stakeholders to build industry infrastructure [3]
- The recently open-sourced WorldVLA model, which merges world models with action models, has garnered significant attention for its enhanced understanding and generation capabilities in images and actions [3]
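The RobotMotion idea of turning low-frequency inference commands into high-frequency continuous control can be illustrated with a small sketch. This is a generic linear-interpolation upsampler under assumed rates (e.g., 10 Hz model output to 200 Hz control), not the actual RynnRCP API; the function name and parameters are hypothetical.

```python
# Hypothetical sketch: a VLA model emits joint-position waypoints at a low
# rate; a motion bridge densifies them to the controller rate so the arm
# moves smoothly. Names and rates are illustrative, not RynnRCP's API.

def upsample_commands(waypoints, model_hz, control_hz):
    """Linearly interpolate joint-position waypoints up to the control rate.

    waypoints: list of joint-position vectors emitted at model_hz.
    Returns a denser list of vectors at control_hz spanning the same motion.
    """
    if len(waypoints) < 2:
        return [list(w) for w in waypoints]
    steps = control_hz // model_hz  # control ticks per model interval
    dense = []
    for a, b in zip(waypoints, waypoints[1:]):
        for s in range(steps):
            t = s / steps
            # Interpolate each joint independently between waypoints.
            dense.append([x + t * (y - x) for x, y in zip(a, b)])
    dense.append(list(waypoints[-1]))  # include the final target pose
    return dense
```

A real bridge would typically use smoother splines and enforce velocity and acceleration limits, but the rate-conversion principle is the same.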