Causal Confusion
RAD: End-to-End Autonomous Driving via 3DGS Combined with Reinforcement Learning
自动驾驶之心· 2025-10-31 00:06
Core Insights
- The paper addresses two challenges in deploying end-to-end autonomous driving (AD) algorithms in real-world scenarios: causal confusion and the open-loop gap [1][2]
- It proposes a closed-loop reinforcement learning (RL) training paradigm built on 3D Gaussian Splatting (3DGS) to make AD policies more robust [2][8]

Summary by Sections

Problem Statement
- The paper identifies two main issues. The first is causal confusion: imitation learning (IL) captures correlations rather than causal relationships. The second is the open-loop gap: IL policies trained in an open-loop manner perform poorly in real-world closed-loop scenarios [1][2][6]

Related Research
- The paper surveys dynamic scene reconstruction, end-to-end autonomous driving, and reinforcement learning, highlighting existing methods and their limitations [3][4][5][7]

Proposed Solution
- The proposed RAD framework integrates 3DGS with RL and IL in a three-stage training paradigm: perception pre-training, planning pre-training, and reinforced post-training [8][24]
- It includes a specially designed safety-related reward function that guides the AD policy in handling safety-critical events [11][24]

Experimental Validation
- The experiments draw on 2000 hours of human expert driving demonstrations and 4305 high-collision-risk traffic clips created for training and evaluation [15][24]
- Nine key performance indicators (KPIs) are used to assess the AD policy, including dynamic collision ratio (DCR) and static collision ratio (SCR) [12][15][24]

Key Findings
- The RAD framework outperforms existing IL methods, achieving a threefold reduction in collision rate (CR) and superior performance in complex dynamic environments [9][12][24]
- An RL-IL ratio of 4:1 was found to give the best balance between safety and trajectory consistency [12][15]

Future Directions
- The paper suggests further work on making the 3DGS environment more interactive, improving rendering techniques, and expanding the application of RL [17][21][22][29]
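
To make two of the quantities in the summary concrete, here is a minimal Python sketch of how the dynamic and static collision ratios (DCR, SCR) and a 4:1 RL-IL update schedule might be computed. It is an illustration under stated assumptions, not the paper's implementation: the summary does not define the ratios, so each is assumed here to be the fraction of evaluation clips containing at least one collision of that type, and the 4:1 ratio is read as four RL updates for every IL update. `ClipResult`, `collision_ratios`, and `update_schedule` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from itertools import cycle, islice


@dataclass
class ClipResult:
    """Outcome of one closed-loop evaluation clip (hypothetical record type)."""
    dynamic_collision: bool  # ego vehicle hit a moving obstacle
    static_collision: bool   # ego vehicle hit a fixed obstacle


def collision_ratios(results):
    """Return (DCR, SCR), assumed to be the fraction of clips with each collision type."""
    if not results:
        raise ValueError("need at least one clip result")
    n = len(results)
    dcr = sum(r.dynamic_collision for r in results) / n
    scr = sum(r.static_collision for r in results) / n
    return dcr, scr


def update_schedule(rl_steps=4, il_steps=1, total=10):
    """Interleave RL and IL updates at a fixed ratio (assumed reading of 4:1)."""
    pattern = ["RL"] * rl_steps + ["IL"] * il_steps
    return list(islice(cycle(pattern), total))


# Toy data: four clips, one dynamic collision and one static collision.
results = [
    ClipResult(dynamic_collision=True, static_collision=False),
    ClipResult(dynamic_collision=False, static_collision=False),
    ClipResult(dynamic_collision=False, static_collision=True),
    ClipResult(dynamic_collision=False, static_collision=False),
]
dcr, scr = collision_ratios(results)
print(f"DCR={dcr:.2f} SCR={scr:.2f}")
print(update_schedule())
```

With the toy data above, one clip in four has a dynamic collision and one in four has a static collision, so both ratios come out to 0.25, and every fifth entry of the schedule is an IL update.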