Li Auto VLA
Li Auto proposes the first world model that includes both ego-vehicle and other-vehicle trajectories
理想TOP2· 2025-11-23 11:56
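Li Auto's world model includes the trajectories of both the ego vehicle and other vehicles, an approach Li Auto is the first to propose. The purpose is to let Li Auto's VLA do reinforcement learning in a simulated environment, where the same scenario can be replayed repeatedly to test for better trajectories. Real-world data cannot do this at all. The original post includes a video that visualizes this.

Li Auto VLA training process:

Pre-training: a 32B VL base model is trained in the cloud. Its training data includes 3D vision, high-definition 2D vision with 3-5 times the clarity of open-source models, driving-related language corpora, and, critically, joint VL corpora (such as synchronized records of navigation information and human judgment). To fit on-vehicle compute and keep inference fast, the large cloud model is distilled into a 3.2B MoE model.

Post-training: action is introduced into the model, turning it into a VLA with close to 4B parameters. It uses a short chain of thought (CoT), limited to 2-3 steps, and then uses diffusion to predict the trajectory and the environment over the next 4-8 seconds.

Reinforcement learning: this stage has two parts. The first is reinforcement learning from human feedback. The second does not rely on human feedback: it is pure reinforcement learning on data generated by the world model, self-evolving against three metrics: comfort (G value), no collisions, and compliance with traffic rules. The goal is a driving level that exceeds human drivers.

On March 12, 2025, Li Auto published Other Vehicle Trajectories Are Also Needed: A Driving World Model Un ...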
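The three metrics named for the world-model reinforcement learning stage (comfort measured as a G value, no collisions, compliance with traffic rules) could be folded into a single scalar reward, roughly as in the sketch below. This is only an illustration of the idea, not Li Auto's method: the `Rollout` fields, the 0.3 g comfort limit, and the penalty weights are all assumptions made up for the example.

```python
from dataclasses import dataclass

G = 9.81  # standard gravity in m/s^2, used to express acceleration as a G value


@dataclass
class Rollout:
    """One simulated trajectory rollout. All fields are hypothetical."""
    accelerations_mps2: list[float]  # sampled accelerations along the trajectory
    collided: bool                   # True if the ego vehicle hit anything
    traffic_violations: int          # number of traffic-rule violations detected


def trajectory_reward(rollout: Rollout,
                      comfort_limit_g: float = 0.3,      # assumed threshold
                      collision_penalty: float = 100.0,  # assumed weight
                      violation_penalty: float = 10.0) -> float:  # assumed weight
    """Combine comfort, collision and traffic-rule terms into one scalar."""
    # Comfort: penalize only the part of the peak G value above the limit.
    peak_g = max((abs(a) for a in rollout.accelerations_mps2), default=0.0) / G
    comfort_term = -max(0.0, peak_g - comfort_limit_g)
    # Safety: a fixed large penalty for any collision.
    collision_term = -collision_penalty if rollout.collided else 0.0
    # Rules: a fixed penalty per violation.
    rule_term = -violation_penalty * rollout.traffic_violations
    return comfort_term + collision_term + rule_term


def best_rollout(rollouts: list[Rollout]) -> Rollout:
    """Pick the highest-reward rollout among candidates for the same scenario."""
    return max(rollouts, key=trajectory_reward)
```

Because a simulated scenario can be replayed, several candidate trajectories for the same scene can be scored with `trajectory_reward` and compared through `best_rollout`, which a single real-world drive does not allow.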
Comparing XPeng and Li Auto VLA based on accurate source material
理想TOP2· 2025-11-20 10:42
Core Viewpoint
- The article discusses advances in autonomous driving technology, focusing on the VLA (Vision-Language-Action) architecture developed by Li Auto and on the views shared in a podcast by Liu Xianming, XPeng's head of autonomous driving. Liu argues for removing the intermediate language component (L) to improve scalability and data efficiency [1][4][5].

Summary by Sections

VLA Architecture and Training Process
- Pre-training uses a 32-billion-parameter (32B) vision-language model that incorporates 3D vision and high-definition 2D vision with 3-5 times the clarity of open-source models. It also includes driving-related language data and key VL joint data [10][11].
- The model is distilled into a 3.2-billion-parameter (3.2B) MoE model to ensure fast inference on vehicle hardware. A post-training phase then integrates action to form the VLA, raising the parameter count to nearly 4 billion [13][12].
- The reinforcement learning phase has two parts: reinforcement learning from human feedback (RLHF), and pure reinforcement learning on data generated by the world model, optimizing for comfort, collision avoidance, and adherence to traffic regulations [15][16].

Data Utilization and Efficiency
- Liu argues that using language as a supervisory signal can introduce human biases, which reduces data efficiency and scalability. The hardest data to collect are corner cases, which are crucial for training [4][6].
- The architecture aims for a high level of generalization, and there are plans to launch L4 robotaxi services in Guangzhou on the current framework [4][5].

Future Directions and Challenges
- Liu acknowledges uncertainty about scaling the technology safely, and raises the open questions of how to maintain safety standards and how to align the model with human behavior [5][18].
- The conversation highlights that VLA, VLM, and world-model approaches are all fundamentally end-to-end architectures, and that various companies are working on similar concepts in Physical AI [5][18].

Human-Agent Interaction
- The driver agent processes short commands directly, while complex instructions are sent to the cloud for processing before execution. This lets the system understand and interact with the physical world like a human driver [17][18].
- The article concludes that traffic is a suitable domain for VLA because its rules are well defined and human driving behavior can be modeled effectively [19][20].
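The split between short commands handled directly and complex instructions sent to the cloud can be pictured as a small routing step. The sketch below is a toy illustration under stated assumptions: the summary does not say how a command is judged "short", so the word-count threshold, the `Route` enum, and `route_command` are all hypothetical.

```python
from enum import Enum


class Route(Enum):
    """Where a spoken command is processed (hypothetical labels)."""
    ON_VEHICLE = "on_vehicle"
    CLOUD = "cloud"


def route_command(command: str, max_local_words: int = 4) -> Route:
    """Route by length: short commands stay on the vehicle, longer ones go to the cloud.

    The word-count heuristic is an assumption for illustration only; a real
    system would likely classify intent rather than count words.
    """
    words = command.split()
    return Route.ON_VEHICLE if len(words) <= max_local_words else Route.CLOUD
```

For example, `route_command("turn left")` stays on the vehicle, while a longer request such as "find a parking spot near the coffee shop entrance" is routed to the cloud before execution.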
Some information on the future development of Li Auto VLA
自动驾驶之心· 2025-11-10 03:36
Core Viewpoint
- The article discusses the future of Li Auto's VLA (Vision-Language-Action) model, emphasizing the plan to build a reinforcement learning closed loop by the end of 2025, which is expected to significantly improve user experience and vehicle performance [2][3].

Short-term Outlook
- Li Auto aims to establish a reinforcement learning closed loop by the end of 2025, and expects noticeable improvements in vehicle performance and user perception by early 2026 [2].

Mid-term Outlook
- Once the reinforcement learning closed loop is strengthened, Li Auto anticipates surpassing Tesla in the Chinese market because of its advantages in closed-loop iteration [3].
- The shift to reinforcement learning is described as a significant change to the business itself, one that creates a true competitive moat and will take 1-2 years to fully implement [3].

Long-term Outlook
- VLA is projected to reach Level 4 autonomy, but new technologies are expected to emerge beyond it [4].
- Safety restrictions are currently in place to mitigate risk, and the system is designed to identify and address issues autonomously through data collection and training [4].

Key Insights on VLA
- Li Auto's leadership believes the intelligence required for driving is relatively low, and that after the business process reform the compute needed on the vehicle will not be excessively high [5][6].
- The company is targeting a balanced compute budget of around 1000 to 2000 TOPS on the vehicle and 32 billion parameters for the cloud model [6].

Organizational Adjustments
- Li Auto's autonomous driving department is being restructured to strengthen its business system rather than rely on individual talent, with a focus on an AI-oriented organization [12].
- The restructuring splits existing teams into specialized departments to improve efficiency and innovation [12].

Competitive Landscape
- Li Auto's approach to VLA has faced skepticism from competitors, which the company views as validation of its strategy [14].
- The article highlights the importance of data quality and distribution for effective autonomous driving, and the need for human-like reasoning in such systems [18].

Strategic Focus
- The company is committed to delivering substantial functional upgrades and user experience improvements every quarter [18].
- Li Auto's leadership emphasizes communicating company strategy clearly in order to engage younger employees [18].
Four new big promises Lang Xianpeng has made for Li Auto VLA, and five points worth noting
理想TOP2· 2025-11-04 13:33
Core Viewpoint
- The article discusses the future of Li Auto's VLA technology, emphasizing the importance of a reinforcement learning closed loop and the potential for significant advances in autonomous driving capability by 2027 [1][2].

Short-term Outlook
- Li Auto aims to establish a reinforcement learning closed loop by the end of 2025, which is expected to improve user experience significantly and make the vehicle feel more "alive" and responsive [1].

Mid-term Outlook
- With the reinforcement learning closed loop in place, Li Auto anticipates surpassing Tesla in the Chinese market, because its environment favors iterative improvement [1].

Long-term Outlook
- VLA is projected to reach Level 4 autonomy, with new technologies expected to emerge beyond that milestone [1].

Business Process Transformation
- The move to reinforcement learning is not only a technical change but a fundamental business transformation that will create a competitive moat for the company [1][3].

Team Dynamics and Leadership
- The restructuring of the autonomous driving team focuses on building a robust business system rather than relying on individual talent, with an emphasis on developing talent internally [7][8].

AI and Computational Needs
- The intelligence currently required for driving is considered low, and the compute requirements will become clearer after the business process reform [3][4].

Competitive Landscape
- The article suggests that multiple players will coexist in autonomous driving, and that a claim to unique capabilities may not amount to a strict competitive moat [2][8].

Data and Model Development
- Data quality and distribution are highlighted as critical for training models, with a focus on corner cases to improve system performance [9].

Strategic Insights
- Li Auto's strategy requires substantial resource allocation and continuous investment in AI technology, with leadership playing a role akin to Elon Musk's at Tesla [8][12].

Organizational Structure
- The restructuring of the autonomous driving department forms several specialized teams to improve operational efficiency and employee engagement [7][11].

Future Projections
- By 2027, the industry may move away from traditional metrics such as MPI, which suggests that performance evaluation standards could change [11].
After talking with some people, a deeper analysis of Horizon HSD
自动驾驶之心· 2025-11-04 00:03
Core Viewpoints
- The article presents eight key viewpoints on the performance and evaluation of autonomous driving technologies, focusing on the comparison between Horizon's HSD and Li Auto's VLA systems [3].

Group 1: Performance Evaluation
- The experience with Horizon's HSD during a 1.5-hour test drive was notably better than the current production version of Li Auto's L7 VLA, although future production versions may not match the engineering version's performance [3][5].
- The evaluation of HSD's performance is limited by the lack of comprehensive safety assessments and by how much the experience varies across locations [3][7].
- The HSD system demonstrated good vertical control, but its performance can vary significantly with the city and driving conditions [6][7].

Group 2: Technical Comparisons
- Horizon uses a VA-style end-to-end approach, while Li Auto uses a VLA-style end-to-end system; the difference in naming is a minor distinction [9][10].
- The VA-style end-to-end system is perceived to have an advantage in user experience because the VLA approach currently faces limits on computing power and bandwidth [6][12].
- Li Auto's decision to pursue VLA for mass production is seen as a bold move, but it brings challenges in resource allocation and higher computational requirements [11][12].

Group 3: Industry Outlook
- There is a prevailing belief that many autonomous driving operators will eventually converge in capability, with only a few manufacturers able to survive without in-house development of autonomous driving technology [3][11].
- Manufacturers without in-house autonomous driving research may struggle to adapt to the evolving smart vehicle industry [3][11].
- The future landscape will likely see capabilities concentrate, with differentiation becoming more important as the industry matures [3][11].
After talking with some people, a deeper analysis of Horizon HSD and Li Auto VLA
理想TOP2· 2025-11-02 09:08
Core Viewpoints
- The article presents eight key viewpoints on the performance and evaluation of autonomous driving technologies, focusing on experiences with Horizon's HSD and Li Auto's VLA systems [2].

Group 1: Performance Evaluation
- TOP2 found the Horizon HSD software experience during a 1.5-hour test drive in Hangzhou to be significantly better than the current production version of Li Auto's L7 VLA [2].
- The production version of Horizon's software may not perform as well as the engineering version experienced during the test [2].
- Evaluation of autonomous driving systems is limited by the number of test experiences: a few tests cannot be generalized to performance across different regions [3].

Group 2: Technical Architecture
- Horizon uses a VA-style end-to-end system, while Li Auto uses a VLA-style end-to-end system; the difference in naming is a minor distinction [3][9].
- Given current compute and bandwidth limitations, VA-style systems may have an advantage in user experience [6].
- Li Auto's decision to adopt a VLA-style system is seen as a courageous move, since it requires significant resources and presents various challenges [14].

Group 3: Market Dynamics
- The future landscape of autonomous driving operators is uncertain, with a prevailing belief that only a few companies will survive, particularly those able to develop their technology in-house [4].
- Companies without in-house autonomous driving research may struggle to adapt to the evolving smart vehicle industry [4].
- The article emphasizes that autonomous driving is not merely a selling point but a differentiating capability, and that low marginal costs can lead to high market concentration [4].

Group 4: User Experience Insights
- Feedback from Horizon personnel indicated that their systems perform only averagely in extreme weather and complex scenarios, which highlights the need for comprehensive testing [5][6].
- Experiences reported during the test drives varied significantly with the vehicle model and its chip capability, indicating that performance can be inconsistent [7].
- The article suggests that the perception of Horizon's HSD performance may be overly positive because the test locations and conditions were selective [8].
Horizon HSD is indeed worth watching
自动驾驶之心· 2025-10-29 03:30
Core Insights
- The article discusses advances in autonomous driving technology, focusing on how Horizon's HSD system performs against Li Auto's VLA system and on the strengths and weaknesses of both [5][6].

Group 1: Technology Comparison
- Horizon's HSD architecture uses visual information to output trajectories, with lidar positioning as a safety redundancy, while the VLA system is criticized for its high compute and bandwidth requirements [5].
- During a test drive of the Horizon HSD engineering vehicle, the experience was reported to be significantly better than the current production version of Li Auto's VLA, particularly in comfort and smoothness in traffic [6].
- Feedback from the Horizon team indicated that the HSD system performs well in controlled environments but has limitations in extreme weather and complex scenarios, which suggests a need for further development [7].

Group 2: Community and Collaboration
- The article mentions nearly a hundred technical discussion groups covering various aspects of autonomous driving, with a community of around 4,000 members and over 300 companies and research institutions involved [8].
- The collaboration between Horizon and vehicle manufacturers is emphasized, in particular the integration of user interface elements that respect manufacturer preferences, which can affect the overall driving experience [7].

Group 3: Future Outlook
- The article suggests that the HSD system shows promise but is still in development and may not yet reach full autonomous driving capability, estimating it at around 60% of the level of Li Auto's V13 system [7].
Horizon HSD is indeed worth Li Auto's attention
理想TOP2· 2025-10-27 13:50
Core Viewpoint
- The article compares the performance of Horizon's HSD technology and Li Auto's VLA system, highlighting the strengths and weaknesses of both in autonomous driving capability and user experience [1][2].

Group 1: Performance Comparison
- As of October 2025, Horizon's HSD engineering vehicle demonstrated better assisted driving capability than Li Auto's L7 VLA, although mass production vehicles may not perform as well as engineering prototypes [1].
- During a 1.5-hour test drive around West Lake in Hangzhou, the HSD vehicle was comfortable and smooth and needed no manual speed adjustments, in contrast with the frequent adjustments required in the Li Auto VLA [2].
- Feedback from multiple testers indicated that the A model of Horizon's HSD performed well while the B model was considered average, a gap attributed to differences in chip computing power and in the collaboration between the two companies [2].

Group 2: Limitations and Challenges
- Horizon's team acknowledged that the HSD system performs poorly in extreme weather, non-standard scenarios, and complex situations, which indicates it is not yet fully reliable for autonomous driving [3].
- The team also noted that the transition from assisted driving to full autonomy can sometimes lead to subpar experiences, particularly in scenarios that require navigation adjustments [3].
- The integration of the HUD and vehicle interfaces is crucial to the overall driving experience, and some design choices are counterintuitive, which could affect user satisfaction [3].

Group 3: Community Engagement
- There is an invitation to deeper discussion of Li Auto's operational status and long-term fundamentals, with a focus on practical business insights rather than technical discussion [4].
The head of an AI application company shares their understanding of Li Auto VLA
理想TOP2· 2025-09-13 11:50
Core Viewpoint
- The core value of VLA (Vision-Language-Action) lies in its ability to use data effectively both for training foundation models and for personal memory, improving user experience through self-evolution without the need for OTA updates [2][5][6].

Group 1: VLA Functionality
- VLA's memory function captures driving habits and preferences, allowing a personalized driving experience that evolves over time [2][12].
- The system tokenizes and summarizes the data it collects, then uses the result to improve the driving experience [10][13].
- Users are encouraged to engage with VLA actively by driving frequently, which improves its performance and adaptability [8].

Group 2: Strategic Insights
- The strategy involves a decentralized approach to personal memory data, AI infrastructure, and hardware integration, positioning the company to make effective use of user data [6][20].
- The focus is on a unified experience across devices, similar to Apple's ecosystem, which increases user reliance on the brand [20][25].
- The article emphasizes foundation model capability and the need for proprietary chip development to support advanced AI functions [22][23].

Group 3: Market Positioning
- The company currently leads in developing VLA and its memory capabilities, while competitors such as Huawei and Horizon are still in the early stages [15][19].
- "Persistent memory" is highlighted as a key investment theme, one that lets AI evolve from a one-time tool into a reliable long-term partner [16][25].
- Integrating personalized memory with AI models is seen as a significant challenge, but essential for customized driving experiences [25].
36 new Q&As about Li Auto VLA
理想TOP2· 2025-08-13 05:10
Core Viewpoint
- The article discusses the advances and challenges in developing the VLA (Vision-Language-Action) model for autonomous driving, emphasizing the importance of reinforcement learning and the integration of 3D spatial understanding with global semantic comprehension.

Group 1: VLA Model Development
- The VLA model incorporates reinforcement learning, which is crucial to its development and performance [1].
- Integrating 3D spatial understanding with global semantic comprehension improves the model's capabilities over previous versions [7].
- The transition from VLM (Vision-Language Model) to VLA is a shift from a parallel architecture to a more integrated one, allowing deeper cognitive processing [3][4].

Group 2: Technical Challenges
- Deploying the VLA model faces challenges such as multi-modal alignment, data training difficulties, and the complexity of running on a single chip [8][9].
- The model's performance is expected to improve significantly with advances in chip technology and optimization techniques [9][10].
- The need for extensive data labeling and the risk of overfitting to simulation data are highlighted as ongoing concerns [23][32].

Group 3: Industry Comparisons
- The article compares the company's gradual progression from L2 to L4 autonomous driving with the rapid expansion strategies of competitors such as Tesla [11].
- The company aims to provide a more comprehensive driving experience by focusing on user needs and safety rather than solely on technological capability [11][22].

Group 4: Future Directions
- The company plans to improve the VLA model through continuous iteration and user feedback, aiming for a more personalized driving experience [35].
- The article emphasizes regulatory compliance and collaboration with government bodies in advancing autonomous driving technology [17][18].