Solving Tesla's "Sparse Supervision" Problem: DriveVLA-W0 Uses World Models to Amplify the Data Scaling Law in Autonomous Driving
机器之心· 2025-11-17 04:23
Core Insights
- The article discusses how Vision-Language-Action (VLA) models in autonomous driving are moving from academic research to practical applications, and highlights the "supervision deficit" challenge that comes with this transition [2][5][8]
- A new research paper, "DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving", addresses this challenge by using world models to provide dense self-supervised signals [6][10][12]

Group 1: Supervision Deficit
- VLA models face a "supervision deficit": high-dimensional visual input is paired with low-dimensional, sparse supervisory signals (actions), so much of the model's representational capacity goes unused [8][9]
- The research team found that, under sparse action supervision alone, VLA performance saturates quickly as data increases, which weakens the effect of the Data Scaling Law [9][22]

Group 2: World Models as a Solution
- With a world model, the model is also trained to predict future images, which gives a richer and denser learning signal than sparse actions alone [11][15][16]
- This approach targets the supervision deficit at its source and lets the model better learn the complex dynamics of driving environments [16][18]

Group 3: Amplifying Data Scaling Law
- The core contribution of the research is the finding that world models significantly amplify the Data Scaling Law: performance improves more steeply with additional data than it does for baseline models [18][21]
- In experiments with up to 70 million frames, the world model reduced collision rates by 20.4%, a gain the article describes as a qualitative leap beyond what stacking more action data alone achieves [24]

Group 4: Efficiency and Real-World Application
- The research also addresses the high latency of VLA models by proposing a lightweight Mixture-of-Experts (MoE) "action expert" architecture, which reduces inference latency to 63.1% of the baseline VLA without sacrificing performance [26][27]
- This design makes real-time deployment of VLA models in autonomous driving applications more feasible [27][29]
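
The supervision-deficit argument in Groups 1 and 2 can be illustrated with a minimal sketch. It combines a sparse action loss with a dense future-image prediction loss and counts how many scalar targets each source provides per training sample. All function names, shapes, and the `wm_weight` parameter are hypothetical, and plain mean squared error is used as a stand-in for both losses; the article does not specify the paper's actual objective, so this is not its implementation.

```python
# Illustrative sketch only: names, shapes, and the loss weight are assumptions,
# not taken from the DriveVLA-W0 paper.


def mse(pred, target):
    """Mean squared error over two equal-length flat lists of floats."""
    assert len(pred) == len(target), "prediction and target must match in length"
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(pred_actions, gt_actions, pred_future, gt_future, wm_weight=1.0):
    """Sparse action loss plus a weighted, dense future-image prediction loss."""
    action_loss = mse(pred_actions, gt_actions)      # few targets per sample
    world_model_loss = mse(pred_future, gt_future)   # many targets per sample
    return action_loss + wm_weight * world_model_loss


def supervision_counts(num_waypoints, coords_per_waypoint, height, width, channels):
    """Scalar targets per sample: (action supervision, future-image supervision)."""
    action_targets = num_waypoints * coords_per_waypoint
    image_targets = height * width * channels
    return action_targets, image_targets


if __name__ == "__main__":
    # Hypothetical sizes: 8 planned waypoints with (x, y) each,
    # versus one predicted 64x64 RGB future frame.
    actions, pixels = supervision_counts(8, 2, 64, 64, 3)
    print(f"action targets per sample: {actions}")
    print(f"image targets per sample:  {pixels}")
```

With these assumed sizes, the action head receives 16 target values per sample while the image prediction head receives 12,288, which is the imbalance that the term "supervision deficit" refers to.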