Closed-Loop Simulation Reinforcement Learning Training for End-to-End Autonomous Driving Models

自动驾驶之心 · 2025-08-14 11:12

Core Insights
- The article discusses the challenges and advances in end-to-end autonomous driving models, focusing on closed-loop simulation reinforcement learning, which improves robustness and adaptability through interaction with diverse environments [1]

Group 1: Research Background and Core Challenges
- Closed-loop reinforcement learning is gaining attention because it lets models interact with the environment, improving robustness and adaptability compared with imitation learning [1]
- Two main challenges are identified: insufficient realism in simulation environments and uneven training data distribution, both of which limit model generalization [5][6]

Group 2: Core Framework: ReconDreamer-RL
- The ReconDreamer-RL framework integrates video diffusion priors with scene reconstruction. It consists of three core components and optimizes driving policies in two phases: imitation learning and reinforcement learning [3]

Group 3: Components of ReconDreamer-RL
- **ReconSimulator**: A high-fidelity simulation environment that combines appearance modeling and physics modeling to reduce the sim2real gap. It uses 3D Gaussian splatting for scene reconstruction and DriveRestorer for video artifact correction [4][7]
- **Dynamic Adversary Agent (DAA)**: Generates extreme scenarios by controlling the trajectories of surrounding vehicles, creating complex interactions such as sudden lane changes and hard braking [8]
- **Cousin Trajectory Generator (CTG)**: Increases trajectory diversity through trajectory extension and interpolation, addressing the bias toward simple straight-line motion in the training data [10][12]

Group 4: Experimental Validation: Performance and Advantages
- The framework achieves a collision rate of 0.077, compared with 0.386 for imitation learning methods and 0.238 for reinforcement learning methods, roughly a 5-fold reduction relative to imitation learning [16]
- In extreme scenarios, the framework's collision rate drops to 0.053, which the article reports as a 404.5% improvement over traditional methods [18]
- Ablation studies confirm the contribution of each component: removing ReconSimulator raises the collision rate from 0.077 to 0.238, highlighting the need for realistic simulation environments [20][22]

Group 5: Rendering Efficiency
- ReconSimulator renders at 125 FPS, far faster than methods such as EmerNeRF at 0.21 FPS, which meets the real-time interaction requirements of reinforcement learning [21]
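
The ratios quoted in Group 4 and Group 5 can be re-derived from the reported figures. The following is a minimal sketch of that arithmetic, not code from the article: the dictionary keys and the helper names `fold_reduction` and `relative_reduction` are illustrative assumptions, and only the numeric values come from the summary above.

```python
# Figures as reported in the summary; all names here are illustrative.
COLLISION_RATE = {
    "imitation_learning": 0.386,
    "reinforcement_learning": 0.238,
    "recondreamer_rl": 0.077,
}
RENDER_FPS = {"recon_simulator": 125.0, "emernerf": 0.21}


def fold_reduction(baseline: float, improved: float) -> float:
    """Return how many times smaller `improved` is than `baseline`."""
    return baseline / improved


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fraction of the baseline that was eliminated (0 to 1)."""
    return (baseline - improved) / baseline


ours = COLLISION_RATE["recondreamer_rl"]
il_fold = fold_reduction(COLLISION_RATE["imitation_learning"], ours)
rl_fold = fold_reduction(COLLISION_RATE["reinforcement_learning"], ours)
il_relative = relative_reduction(COLLISION_RATE["imitation_learning"], ours)
fps_ratio = RENDER_FPS["recon_simulator"] / RENDER_FPS["emernerf"]

print(f"collision rate vs imitation learning: {il_fold:.2f}x lower")
print(f"collision rate vs reinforcement learning: {rl_fold:.2f}x lower")
print(f"relative reduction vs imitation learning: {il_relative:.1%}")
print(f"rendering speed vs EmerNeRF: {fps_ratio:.0f}x faster")
```

Under these figures, the "approximately 5 times" claim matches the imitation learning baseline (about 5.0x), while the reduction against the reinforcement learning baseline is about 3.1x and the rendering speedup is close to 600x. A relative reduction can never exceed 100%, so the 404.5% figure quoted for extreme scenarios is presumably a ratio-style measure; the summary does not state its baseline, so it is not recomputed here.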