VLA (Vision-Language-Action Model)
The Computing Power Upheaval in China's Smart Driving Industry
36Kr · 2025-12-30 10:36
Core Insights
- In 2025, China's smart driving industry is undergoing an unprecedented shift in computing power, driven by the evolution of software algorithms and the emergence of competing technical paradigms [1][2]
- Commercial deployment of high-level intelligent driving is diverging: the market is splitting in a K shape between affordable and high-end models, which is fragmenting the industry [2]
- Computing power is increasingly recognized as a core element of smart driving development, both on the vehicle and in the cloud [2]

Group 1: Technological Evolution
- The transition to an end-to-end framework is marked by significant advances, as seen in Tesla's FSD Beta V12 software, which works against a computing power baseline of 144 TOPS [3][4]
- Tesla's shift from HW3 to HW4 is a major milestone in its autonomous driving evolution, with HW4 becoming the preferred platform for future software updates [5][6]
- The upcoming FSD V14 is expected to have ten times the parameters of its predecessor, a substantial leap in the vehicle's ability to process complex environmental information [6]

Group 2: Market Dynamics
- Chinese smart driving players, including Xpeng, Li Auto, and NIO, have adopted end-to-end strategies but initially relied on existing computing platforms, primarily NVIDIA's Orin-X [7][12]
- By 2025, smart driving companies had divided into three main factions based on their computing power strategies: self-developed chips, NVIDIA-based solutions, and Huawei's offerings [12][13]
- The self-developed chip faction includes NIO's NX9031 and Xpeng's Turing AI chip, while the NVIDIA faction is represented by the latest Thor platform, which is gaining traction across various models [13][14]

Group 3: Cloud Computing and Future Prospects
- The industry is racing to build cloud computing power, which is essential for evolving smart driving algorithms and for moving from L2 to L4 capabilities [19][20]
- Reliance on the cloud is becoming increasingly critical, because it supports the data processing, model training, and simulation needed to handle complex driving scenarios [23][24]
- Competition for cloud resources is expected to intensify as companies recognize that stronger cloud capabilities are vital to future advances in autonomous driving technology [20][21]
The Direction for L4 Is Set: Li Auto's Autonomous Driving Team Unveils a New Paradigm at a Top Global AI Conference
Jiqizhixin (机器之心) · 2025-10-31 04:11
Core Viewpoint
- The article discusses AI's transition into its "second half," emphasizing that new evaluation and configuration methods are needed for AI to surpass human intelligence, particularly in autonomous driving [1][5].

Group 1: AI Paradigm Shift
- AI is moving from reliance on human-generated data to experience-based learning, as highlighted by Rich Sutton's paper "The Era of Experience" [1].
- Former OpenAI researcher Yao Shunyu argues that AI must develop new evaluation methods to tackle real-world tasks effectively [1].

Group 2: Advancements in Autonomous Driving
- At the ICCV 2025 conference, Li Auto expert Zhan Kun gave a talk on evolving from a data closed loop to a training closed loop in autonomous driving [2][4].
- Li Auto introduced a systematic approach to integrating world models and reinforcement learning into mass-produced autonomous driving systems, marking a significant technological milestone [5].

Group 3: Li Auto's Technological Innovations
- Li Auto's advanced driver assistance technology, Li Auto AD Max, is based on the Vision-Language-Action (VLA) model, reflecting a shift from rule-based algorithms to end-to-end solutions [7].
- The company has significantly improved its driver assistance capabilities, with a notable increase in mileage per human takeover (MPI) over the past year [9].

Group 4: Challenges and Solutions in Data Utilization
- Li Auto found that basic end-to-end learning hit diminishing returns as the training data grew to 10 million clips, largely because data for critical driving scenarios is sparse [11].
- The company aims to move from a single data closed loop to a more comprehensive training closed loop, which combines data collection with iterative training driven by environmental feedback [12][14].

Group 5: World Model and Synthetic Data
- Li Auto is developing an on-vehicle VLA model with prior knowledge and driving capabilities, supported by a cloud-based world model training environment that incorporates real, synthetic, and exploratory data [14].
- The ability to generate synthetic data has improved the training data distribution, enhancing the stability and generalization of Li Auto's driver assistance system [24].

Group 6: Research Contributions and Future Directions
- Since 2021, Li Auto's research team has published numerous papers, expanding its focus from perception tasks to advanced topics such as VLM/VLA and world models [28].
- The company is working on interactive intelligent agents and reinforcement learning engines, both of which are critical to the future of autonomous driving [35][38].

Group 7: Commitment to AI Development
- Li Auto has committed nearly half of its R&D budget to AI and has established multiple teams focused on AI applications, including driver assistance and smart industrial solutions [43].
- The company has iterated rapidly on its strategic AI products, including the VLA driver model launched with the Li Auto i8 [43].
Li Auto Pushes OTA 8.0; Li Xiang Says the Company's Driver Assistance Is Now "Fully Leading." Is VLA Better Than World Models?
Mei Ri Jing Ji Xin Wen· 2025-09-12 10:06
Core Viewpoint
- With the OTA 8.0 update to its vehicle system, Li Auto's advanced driver assistance and smart cockpit have moved from "partially leading" to "fully leading" [1]

Group 1: OTA 8.0 Update
- OTA 8.0 has officially launched, enhancing driver assistance, smart cockpit, and smart electric features [3]
- The new VLA (Vision-Language-Action) driver model is being rolled out in full to Li MEGA and L series AD Max models [3]
- Li Auto's chairman, Li Xiang, described VLA as the third generation of the company's driver assistance technology, emphasizing its ability to understand road conditions, comprehend human commands, and remember user habits [3]

Group 2: VLA Model Features
- The current version of VLA has been called a "crippled version" because a highly praised feature is temporarily absent [4]
- Li Auto has acknowledged the need for caution in rolling out new features, especially after suspending the VLA remote summon function [4]
- The VLA model selects routes more accurately in complex scenarios and remembers the user's speed preferences for specific roads [6]

Group 3: Industry Competition and Technology
- Other companies, such as Yuanrong Qixing (DeepRoute.ai) and XPeng Motors, are also developing VLA models, making this a competitive field [7]
- The VLA model is seen as an "intelligence-enhanced version" of end-to-end models that addresses their difficulty with unseen scenarios [8]
- The VLA model integrates perception, action execution, and language processing, improving its ability to understand complex environments and make decisions in them [8]

Group 4: Differing Approaches
- Huawei's approach centers on the World Action model, which bypasses the language processing step and emphasizes direct control from vision [12]
- The debate between VLA and world models reflects differing strategies for achieving advanced autonomous driving capabilities [12][13]
- Experts suggest that VLA and world models can coexist and complement each other, with companies choosing a path based on their specific goals [13]