Li Auto Discloses New Technical Details

Core Insights
- The article discusses the advances and challenges in Li Auto's development of autonomous driving technology, focusing on the end-to-end model and the integration of VLA (Vision-Language-Action) [2][5][9].

Group 1: Model Performance and Data Utilization
- End-to-end model gains slow once training data passes a certain volume: after 10 million clips, the model's MPI (Miles Per Intervention) only doubled in five months [5].
- To improve performance, Li Auto adjusted the training data mix, increased the amount of generated data (including corner cases), and added manual rules for safety and compliance in special scenarios [5][9].

Group 2: VLA Integration and Decision-Making
- VLA is introduced to strengthen the decision-making of the end-to-end model, addressing illogical behavior, a lack of deep reasoning in decisions, and insufficient scenario-based preventive judgment [5][6].
- VLA combines spatial intelligence, linguistic intelligence, and action policy, so the model can understand and communicate spatial information and generate smooth driving trajectories using diffusion models [6][9].

Group 3: Simulation and Testing Efficiency
- Li Auto upgraded its model evaluation by using a world model for closed-loop simulation and testing, cutting testing costs from 18.4 per kilometer to 0.53 per kilometer [9][11].
- The closed-loop training framework AD-R1 was introduced for efficient data management and reinforcement learning, with high-value data passed through a series of processing steps back to the cloud platform [11][12].

Group 4: Computational Power and Resources
- Li Auto's total computational power is 13 EFLOPS, with 3 EFLOPS dedicated to inference and 10 EFLOPS to training, across 50,000 training and inference cards [13].
- Inference compute is crucial in the VLA era because it is needed to generate simulation training environments [13].
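
To put the quoted figures in proportion, the short Python sketch below works out the relative drop in per-kilometer testing cost and the share of total compute allocated to inference. It uses only the numbers reported above; the variable and function names are illustrative, and the cost figures are treated as unitless because the summary does not state a currency.

```python
# Figures quoted in the summary; the currency unit for cost is not stated there.
COST_BEFORE_PER_KM = 18.4
COST_AFTER_PER_KM = 0.53

TOTAL_EFLOPS = 13.0
INFERENCE_EFLOPS = 3.0
TRAINING_EFLOPS = 10.0


def fractional_reduction(before, after):
    """Fraction by which a cost fell, e.g. 0.5 means it was halved."""
    return 1.0 - after / before


def fold_reduction(before, after):
    """How many times cheaper the new cost is than the old one."""
    return before / after


def share(part, total):
    """Part of a total, expressed as a fraction."""
    return part / total


reduction = fractional_reduction(COST_BEFORE_PER_KM, COST_AFTER_PER_KM)
fold = fold_reduction(COST_BEFORE_PER_KM, COST_AFTER_PER_KM)
inference_share = share(INFERENCE_EFLOPS, TOTAL_EFLOPS)

print(f"testing cost reduction: {reduction:.1%} (about {fold:.0f}x cheaper)")
print(f"inference share of compute: {inference_share:.1%}")
```

On these figures, the per-kilometer testing cost falls by about 97.1% (roughly 35 times cheaper), and inference accounts for about 23.1% of the 13 EFLOPS total.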