Li Auto's Lang Xianpeng Explains in a Long Post Why He Disagrees with Unitree's Wang Xingxing on VLA
理想TOP2·2025-12-10 06:50

Core Insights
- The key to successful autonomous driving is how well the VLA model fits the entire embodied intelligence system, with data as the decisive factor in its effectiveness [1][4].

Summary by Sections

VLA Model
- VLA is fundamentally a generative model. It takes a GPT-like approach to autonomous driving, generating trajectories and control signals instead of text. User feedback indicates that VLA exhibits emergent behaviors in certain scenarios, reflecting a growing understanding of the physical world [2].
- Because of its high computational demands, the world model is better suited to serving as a "test environment" than as a "test subject." Li Auto currently uses the world model for cloud-based data generation and realistic simulation testing, which draw on several exaFLOPS of compute, a level that even the most powerful in-vehicle chips cannot match [2].
- Debates about model architecture matter less than actual performance. In autonomous driving, what counts is vast amounts of real data. Li Auto's commitment to VLA rests on a data loop built from millions of vehicles, which enables near-human driving performance on current compute [2].

Embodied Intelligence
- To excel at autonomous driving, it must be treated as a complete embodied intelligence system whose components are developed to work together so that each delivers its full value. Human drivers do not need extraordinary abilities; what matters is coordination among the parts [3].
- The embodied intelligence system comprises perception (eyes), models (brain), operating systems (nervous system), chips (heart), and the body (vehicle). Full-stack in-house development is necessary, covering both software and hardware. Li Auto's autonomous driving team works with the foundation model, chip, and chassis teams to build a complete autonomous driving system [3].

Data Utilization
- What makes a model effective is how well it fits the entire embodied intelligence system, and data is the decisive factor. Data is hard to acquire in robotics, but it is not a significant problem for autonomous driving companies that have established data loops. Li Auto can mine and filter more than 1 billion kilometers of accumulated data and continuously gather new data from 1.5 million vehicle owners [4].
- Data filtering surfaced some interesting patterns: nearly 40% of human driving data shows a tendency to drive toward one side and not strictly adhere to speed limits. Because this matches how people typically drive, the team decided not to remove these samples. The VLA model is expected to serve embodied robots in automotive form, both today's and future ones [4].