Core Viewpoint
- The article traces the evolution of Li Auto's autonomous driving technology, focusing on the development and rollout of the VLA (Vision-Language-Action) model, which integrates multi-modal AI capabilities to improve the driving experience. It covers the challenges the team faced, the strategic decisions made along the way, and the competitive landscape of the autonomous driving sector [5][6][18].

Team Development and Structure
- Li Auto's autonomous driving team has changed significantly since its founding in 2018, passing through three generations of core personnel. A recent restructuring created a flatter organization with 11 new departments to improve communication and decision-making efficiency [8][9][51].
- The team has shifted from a centralized, closed development model to a more open and collaborative approach, reflecting the agility that AI development demands [10][11].

Strategic Decisions
- The decision to pursue the VLA model grew from the recognition that simply following existing paths, such as those taken by Huawei and Tesla, would not suffice; the team sought a new competitive edge through original technology [6][14][18].
- VLA is positioned as a significant advance over previous methods, with the goal of reaching L4-level autonomous driving. The model emphasizes human-like reasoning and decision-making in driving [21][29].

Challenges and Criticism
- The VLA model has drawn skepticism from industry experts, who question its feasibility and the technical difficulty of multi-modal AI integration. Critics argue the approach may be overly simplistic, or a "trick," compared with other methods [22][24].
- Despite the criticism, the team reads the very difficulty of the VLA approach as a sign that the direction is innovative and likely correct [24][25].
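The article does not describe Li Auto's actual architecture, but the VLA idea named above, a single model that maps camera input plus language-level reasoning to a driving action, can be sketched schematically. Everything below is hypothetical (`ToyVLAPolicy`, the stub encoders, the thresholds): in a real VLA system each stage would be a learned network fused in one transformer, not hand-written rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrivingAction:
    steering: float      # radians, positive = left
    acceleration: float  # m/s^2, negative = braking


class ToyVLAPolicy:
    """Schematic Vision-Language-Action pipeline (illustrative only).

    Each stage is a hand-written stub so the vision -> reasoning -> action
    data flow is visible; real VLA models learn all three jointly.
    """

    def encode_vision(self, camera_frame: List[float]) -> List[float]:
        # Stub "vision encoder": scale assumed 8-bit pixel intensities to [0, 1].
        return [v / 255.0 for v in camera_frame]

    def reason(self, vision_tokens: List[float], instruction: str) -> str:
        # Stub "language reasoning": pick a maneuver from scene + instruction.
        obstacle_ahead = max(vision_tokens) > 0.9  # arbitrary toy threshold
        if "stop" in instruction or obstacle_ahead:
            return "brake"
        return "cruise"

    def act(self, decision: str) -> DrivingAction:
        # Map the symbolic decision to a low-level control action.
        if decision == "brake":
            return DrivingAction(steering=0.0, acceleration=-3.0)
        return DrivingAction(steering=0.0, acceleration=0.5)

    def step(self, camera_frame: List[float], instruction: str) -> DrivingAction:
        tokens = self.encode_vision(camera_frame)
        decision = self.reason(tokens, instruction)
        return self.act(decision)
```

The point of the sketch is the coupling the article emphasizes: the "reasoning" stage sees both the encoded scene and the language instruction before any control output is produced, which is what distinguishes VLA from a pure perception-to-control pipeline.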
Future Outlook
- The company aims to establish a robust reinforcement learning loop to enhance the VLA model's capabilities, expecting significant improvements in user experience by the end of 2023 and into 2024 [28][39].
- The long-term vision targets L4 autonomous driving by 2027, supported by a comprehensive data-driven ecosystem for continuous learning and adaptation [41][44].
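The "reinforcement learning loop" mentioned above is a closed cycle of roll out, score, and update. The article gives no detail on Li Auto's training stack, so the following is a generic, toy sketch of that cycle: `drive_episode` stands in for a driving simulator returning a scalar reward, and the update is a simple finite-difference gradient step on a single policy parameter. All names and numbers are hypothetical.

```python
import random


def drive_episode(policy_gain: float) -> float:
    """Toy 'simulator': reward peaks when the policy gain is near 2.0."""
    noise = random.uniform(-0.05, 0.05)  # stand-in for environment stochasticity
    return -abs(policy_gain - 2.0) + noise


def rl_loop(steps: int = 200, lr: float = 0.1, seed: int = 0) -> float:
    """Closed RL loop: roll out perturbed policies, estimate the reward
    gradient by finite differences, and update the parameter."""
    random.seed(seed)
    gain = 0.0  # the single learnable policy parameter
    for _ in range(steps):
        delta = 0.1
        reward_up = drive_episode(gain + delta)
        reward_down = drive_episode(gain - delta)
        # Finite-difference estimate of d(reward)/d(gain).
        grad = (reward_up - reward_down) / (2 * delta)
        gain += lr * grad
    return gain
```

Running `rl_loop()` drives the parameter toward the reward peak at 2.0. The structural point, not the toy math, is what the article describes: each pass through the loop turns driving experience into a score and the score into a policy update, so the model keeps improving without new hand-written rules.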
Li Auto's Lang Xianpeng: VLA plus reinforcement learning will become automakers' true moat