Hierarchical Fast-Slow Brain VLA Architecture

Robot Large Models In Depth: How Far Are We from a True Embodied-Intelligence Large Model?
2025-08-12 15:05
Summary of Key Points from Conference Call Records

Industry Overview
- The call focuses on the humanoid robot industry and the development of large intelligent models, particularly data collection and algorithm training [1][2][3].

Core Insights and Arguments
- **Data Flywheel**: Large intelligent models mature through a data flywheel, which requires enough robots deployed in factories to collect the data that improves the model [3].
- **Model Development**: The industry's main challenges are at the model level, and multi-modal large models are crucial for progress [2].
- **Current Model Stage**: Humanoid robots are currently at the L2 stage, comparable to the early stages of autonomous driving: the hardware must be settled before data collection can begin effectively [5].
- **Key Development Lines**: Three main lines drive the development of large intelligent models: multi-modality, action frequency, and generalization ability [6].

Important but Overlooked Content
- **Data Collection Challenges**: Real-robot data is the highest-quality data, but it is costly and inefficient to collect, and it can become a sunk cost if the hardware is not yet finalized [15].
- **Simulation vs. Real Data**: The current ratio of simulation data to real-robot data is approximately 9:1, and the mix is expected to become more balanced over time [16].
- **Motion Capture Technologies**: There are two main types of motion capture technology, optical and inertial, each with distinct applications and cost structures [17].
- **Recommended Companies**: Companies recommended for investment include Lingyun Optical for motion capture equipment, Aowei Zhongguang for cameras, and Danghong Technology and Jingye Intelligent for remote operation technology [22].
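To make the roughly 9:1 simulation-to-real ratio noted above concrete, the following is a minimal sketch of how a training batch might be composed at a fixed mix. The function name `mix_batch`, the pool contents, and the batch size are hypothetical and chosen only for illustration; the call records do not describe any specific sampling procedure.

```python
import random


def mix_batch(sim_pool, real_pool, batch_size, sim_to_real=(9, 1), seed=0):
    """Draw a batch whose sim:real composition follows the given ratio.

    Hypothetical helper for illustration; not a method from the source.
    """
    sim_parts, real_parts = sim_to_real
    # Number of real-robot samples implied by the ratio; the rest is simulation.
    n_real = round(batch_size * real_parts / (sim_parts + real_parts))
    n_sim = batch_size - n_real
    rng = random.Random(seed)
    # Sample with replacement, since the real-robot pool is usually small.
    batch = rng.choices(sim_pool, k=n_sim) + rng.choices(real_pool, k=n_real)
    rng.shuffle(batch)
    return batch


# Placeholder episode pools (assumed sizes, not figures from the source).
sim_pool = [("sim", i) for i in range(1000)]
real_pool = [("real", i) for i in range(50)]

batch = mix_batch(sim_pool, real_pool, batch_size=100)
counts = {"sim": 0, "real": 0}
for source, _ in batch:
    counts[source] += 1
print(counts)  # {'sim': 90, 'real': 10}
```

Moving toward the more balanced mix the call anticipates would, in this sketch, only mean changing `sim_to_real`, for example to `(1, 1)`.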
Future Directions
- **Integration of Modalities**: Future large models are expected to incorporate more modalities, including tactile and olfactory information, alongside the existing visual and language inputs [19].
- **Remote Operation Technology**: Remote operation is crucial for operating robots in real time and is expected to see significant demand in both the medium and the long term [21].