Can Imitation Learning Never Be Truly End-to-End?
自动驾驶之心·2025-10-08 23:33

Core Viewpoint
- The article argues that in the autonomous driving industry, training methods matter more than model architectures such as VLA or world models, and highlights the limitations of imitation learning for achieving true end-to-end autonomous driving [2][14].

Limitations of Imitation Learning
- Imitation learning assumes that expert data is optimal, but in driving there is no single perfect behavior: human drivers differ widely in style and strategy [3][4].
- Because the training data is neither consistent nor optimal, models learn vague, imprecise driving patterns rather than clear, logical strategies [3][4].
- Imitation learning does not distinguish critical decision-making scenarios from ordinary ones, so a model that performs well on average may still make fatal errors at crucial moments [5][6].

Key Scene Identification
- The article stresses the importance of identifying key scenes in driving, where the precision of the model's output is critical, especially in complex scenarios [7][8].
- It borrows the concept of "advantage" from reinforcement learning to define key states: those in which the optimal action significantly outperforms the alternatives [7].

Out-of-Distribution (OOD) Issues
- Open-loop imitation learning accumulates errors step by step, driving the model into states that differ from the training data distribution and degrading performance [8][10][12].
- A model trained purely by imitation may therefore struggle in critical situations, such as changing lanes in time, because it relies on suboptimal behaviors learned from human data [13].

Conclusion
- The core of technological development lies in identifying key routes and bottlenecks rather than merely following trends; new methods beyond imitation learning are needed to address its limitations [14].
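The "advantage"-based definition of key states mentioned above can be sketched numerically. This is an illustrative toy, not the article's implementation: the Q-values and the 0.5 margin threshold are made-up assumptions.

```python
import numpy as np

# Toy Q-table: rows are states, columns are candidate driving actions
# (e.g. keep-lane, change-left, change-right). Values are illustrative.
Q = np.array([
    [1.00, 0.98, 0.97],   # ordinary state: the actions are nearly interchangeable
    [1.00, 0.10, -0.50],  # key state: only one action avoids a bad outcome
])

V = Q.max(axis=1)            # V(s): value of each state under the greedy policy
advantage = Q - V[:, None]   # A(s, a) = Q(s, a) - V(s); zero for the best action

# A state is "key" when the best action's margin over the runner-up is large:
# there the policy's output precision matters, while elsewhere almost any
# action is acceptable. The 0.5 threshold is an arbitrary assumption.
sorted_Q = np.sort(Q, axis=1)
margin = sorted_Q[:, -1] - sorted_Q[:, -2]
key_states = margin > 0.5
```

A uniform imitation loss weights both rows equally, which is precisely the failure mode the article describes: the rare high-margin state is drowned out by ordinary ones.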
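The compounding-error argument behind the OOD issue can likewise be illustrated with a toy rollout. The dynamics below (a 1-D lane-position model with a made-up correction gain) are assumptions for illustration only, not the article's setup.

```python
import random

def drive(policy_noise, horizon=200, feedback=True):
    """1-D lane-position rollout where the learned policy makes a small
    per-step error. With feedback, each step corrects toward the lane
    center; open-loop replay has no correction, so errors compound and
    the model drifts into states unlike its training distribution."""
    random.seed(0)               # same error sequence for both rollouts
    pos, drift = 0.0, 0.0
    for _ in range(horizon):
        err = random.gauss(0, policy_noise)
        if feedback:
            pos = 0.5 * pos + err   # correction pulls back to center
        else:
            drift += err            # errors accumulate state by state
            pos += drift
    return abs(pos)

open_loop = drive(0.05, feedback=False)
closed_loop = drive(0.05, feedback=True)
# The open-loop rollout ends far from the lane center (the training
# distribution), while the closed-loop rollout stays near it.
```

The gap between the two rollouts is the cumulative-error effect: identical per-step noise, but without state feedback the deviation grows with the horizon instead of staying bounded.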