Unitree Open-Sources UnifoLM-VLA-0
National Business Daily·2026-01-29 12:38

Core Viewpoint - Unitree announced the open-source release of UnifoLM-VLA-0, a vision-language-action (VLA) large model aimed at general humanoid robot operations, which seeks to overcome the limitations of traditional vision-language models (VLMs) in physical interaction [1]

Group 1
- The UnifoLM-VLA-0 model is part of the UnifoLM series and focuses on enhancing physical interaction capabilities in robotics [1]
- Through continued pre-training on robot operation data, the model evolves from general image-text understanding into an "embodied brain" with physical common sense [1]
