Li Auto and Xpeng Bet Big on VLA Large Models: "Genius" or "Fool"?

Core Viewpoint

The article discusses how car manufacturers have diverged in their autonomous driving technology paths, focusing on the VLA (Vision, Language, Action) model and the WA (World Model) approach, and highlights the advantages and challenges of the VLA model in reaching higher levels of autonomous driving [4][5][16][87].

Group 1: VLA Model Overview

- The VLA model gained popularity after Tesla launched its end-to-end system, which was then widely adopted across the industry [3][4].
- Companies such as Li Auto and Xpeng have adopted the VLA model, with Li Auto claiming to have moved from "partially leading" to "fully leading" in autonomous driving technology [7][8][10].
- The VLA model is based on a concept introduced by Google DeepMind in July 2023. It was initially aimed at robotics, allowing machines to understand human language [25][28].

Group 2: Advantages of VLA Model

- The VLA model improves the interpretability of autonomous systems: engineers can correct errors by modifying the descriptive language generated from sensor data [50].
- It enhances system interaction, letting users issue commands in natural language for a more intuitive experience [56][58].
- It has greater potential for handling complex scenarios, because it can reason and make decisions based on a broader understanding of the environment [62][66].

Group 3: Challenges of VLA Model

- Despite these advantages, the VLA model may not show an immediate performance difference compared with traditional end-to-end systems, because it builds on the existing architecture [73][74].
- The added complexity of the VLA model requires substantial computational power, which makes hardware capability critical to its effectiveness [80][84].
- Companies must invest heavily in both software and hardware to fully leverage the VLA model, which raises concerns about its short-term feasibility [87].