Trajectory Planning
DiffusionDriveV2 Core Code Walkthrough
自动驾驶之心· 2025-12-22 03:23
Core Viewpoint
- The article discusses the DiffusionDrive model, which utilizes a truncated diffusion approach for end-to-end autonomous driving, emphasizing its architecture and the integration of reinforcement learning to enhance trajectory planning and safety [1].

Group 1: Model Architecture
- DiffusionDriveV2 incorporates reinforcement learning constraints within a truncated diffusion modeling framework for autonomous driving [3].
- The model architecture includes environment encoding through bird's-eye view (BEV) features and vehicle status, facilitating effective data processing [5].
- The trajectory planning module employs multi-scale BEV features to enhance the model's ability to predict vehicle trajectories accurately [8].

Group 2: Trajectory Generation
- The model generates trajectories by first clustering true future trajectories of the vehicle using K-Means to create anchors, which are then perturbed with Gaussian noise to simulate variations [12].
- The trajectory prediction process involves cross-attention mechanisms that integrate trajectory features with BEV features, enhancing the model's predictive capabilities [15][17].
- The final trajectory is derived from the predicted trajectory offsets combined with the original trajectory, ensuring continuity and coherence [22].

Group 3: Reinforcement Learning and Safety
- The Intra-Anchor GRPO method is proposed to optimize strategies within specific behavioral intentions, enhancing safety and goal-oriented trajectory generation [27].
- A comprehensive scoring system evaluates generated trajectories based on safety, comfort, rule compliance, progress, and feasibility, ensuring robust performance in various driving scenarios [28].
- The model incorporates a modified advantage estimation approach to provide clear learning signals, penalizing trajectories that result in collisions [30].

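The intra-anchor advantage estimate described in Group 3 can be illustrated with a minimal Python sketch. The summary only states that optimization happens within a single behavioral intention and that colliding trajectories are penalized, so the concrete details here are assumptions: the function names `intra_anchor_advantages` and `advantages_per_anchor`, the GRPO-style mean and standard-deviation normalization over one anchor's samples, and the fixed `collision_penalty` of -1.0. It is not the DiffusionDriveV2 implementation.

```python
from statistics import mean, pstdev


def intra_anchor_advantages(rewards, collided, collision_penalty=-1.0, eps=1e-6):
    """Group-relative advantages for trajectories sampled from ONE anchor.

    rewards:  list of scalar scores, one per sampled trajectory
    collided: list of bools, True where the trajectory ends in a collision
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over this anchor's group only
    advantages = []
    for reward, hit in zip(rewards, collided):
        if hit:
            # Assumed rule: a collision always yields a fixed negative signal,
            # regardless of how the rest of the group scored.
            advantages.append(collision_penalty)
        else:
            # GRPO-style normalisation relative to the same anchor's samples.
            advantages.append((reward - mu) / (sigma + eps))
    return advantages


def advantages_per_anchor(groups):
    """groups maps a hypothetical anchor id to a (rewards, collided) pair."""
    return {
        anchor_id: intra_anchor_advantages(rewards, collided)
        for anchor_id, (rewards, collided) in groups.items()
    }


if __name__ == "__main__":
    groups = {
        "keep_lane": ([0.9, 0.5, 0.1], [False, False, True]),
        "turn_left": ([0.4, 0.4, 0.4], [False, False, False]),
    }
    for anchor_id, adv in advantages_per_anchor(groups).items():
        print(anchor_id, [round(a, 3) for a in adv])
```

Normalizing inside each anchor's group means a rollout is compared only with other rollouts of the same intention, which appears to be the motivation for the intra-anchor design; the `eps` term keeps the division safe when every sample in a group receives the same score.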
Group 4: Noise and Exploration
- The model introduces multiplicative noise to maintain trajectory smoothness, addressing the inherent scale inconsistencies between proximal and distal trajectory segments [33].
- This approach contrasts with additive noise, which can disrupt trajectory integrity, thereby improving the quality of exploration during training [35].

Group 5: Loss Function and Training
- The total loss function combines reinforcement learning loss with imitation learning loss to prevent overfitting and ensure general driving capabilities [39].
- The trajectory recovery and classification confidence contribute to the overall loss, guiding the model towards accurate trajectory predictions [42].
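The contrast between multiplicative and additive exploration noise from Group 4 can be made concrete with a small Python sketch. The details are assumptions chosen for illustration: one Gaussian scale factor per axis for the multiplicative case, independent per-waypoint Gaussian offsets for the additive case, and a hypothetical `roughness` measure (sum of squared second differences). The summary does not specify how the model actually samples its noise.

```python
import random


def additive_noise(traj, sigma, rng):
    """Perturb every waypoint independently (assumed additive baseline)."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in traj]


def multiplicative_noise(traj, sigma, rng):
    """Scale the whole trajectory by one random factor per axis (assumption)."""
    scale_x = 1.0 + rng.gauss(0.0, sigma)
    scale_y = 1.0 + rng.gauss(0.0, sigma)
    # Displacement grows with distance from the origin, so near waypoints
    # move little and far waypoints move more.
    return [(x * scale_x, y * scale_y) for x, y in traj]


def roughness(traj):
    """Sum of squared second differences; 0 for evenly spaced straight paths."""
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(traj, traj[1:], traj[2:]):
        total += (x0 - 2 * x1 + x2) ** 2 + (y0 - 2 * y1 + y2) ** 2
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Eight waypoints of a straight, constant-speed path in the ego frame.
    straight = [(float(t), 0.0) for t in range(1, 9)]
    print("additive roughness:      ", roughness(additive_noise(straight, 0.1, rng)))
    print("multiplicative roughness:", roughness(multiplicative_noise(straight, 0.1, rng)))
```

Under these assumptions a scaled trajectory keeps the shape of the original, while per-waypoint offsets introduce jitter between neighboring points, which matches the smoothness argument in the summary.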
Autonomous Driving Paper Digest: VLA, World Models, Reinforcement Learning, Trajectory Planning, and More
自动驾驶之心· 2025-10-18 04:00
Core Insights
- The article discusses advancements in autonomous driving technologies, highlighting various research contributions and their implications for the industry.

Group 1: DriveVLA-W0
- The DriveVLA-W0 training paradigm enhances the generalization ability and data scalability of VLA models by using world modeling to predict future images, achieving 93.0 PDMS and 86.1 EPDMS on NAVSIM benchmarks [6][12].
- A lightweight Mixture-of-Experts (MoE) architecture reduces inference latency to 63.1% of the baseline VLA, meeting real-time deployment needs [6][12].
- The data scaling law amplification effect is validated, showing significant performance improvements as data volume increases, with a 28.8% reduction in ADE and a 15.9% decrease in collision rates when using 70M frames [6][12].

Group 2: CoIRL-AD
- The CoIRL-AD framework combines imitation learning and reinforcement learning within a latent world model, achieving an 18% reduction in collision rates on the nuScenes dataset and a PDMS score of 88.2 on the NAVSIM benchmark [13][16].
- The framework integrates RL into an end-to-end autonomous driving model, addressing offline RL's scene expansion issues [13][16].
- A decoupled dual-policy architecture facilitates structured interaction between imitation learning and reinforcement learning, enhancing knowledge transfer [13][16].

Group 3: PAGS
- The Priority-Adaptive Gaussian Splatting (PAGS) framework achieves high-quality real-time 3D reconstruction in dynamic driving scenarios, with a PSNR of 34.63 and SSIM of 0.933 on the Waymo dataset [23][29].
- PAGS incorporates semantic-guided pruning and regularization to balance reconstruction fidelity and computational cost [23][29].
- The framework demonstrates a rendering speed of 353 FPS with a training time of only 1 hour and 22 minutes, outperforming existing methods [23][29].

Group 4: Flow Planner
- The Flow Planner achieves a score of 90.43 on the nuPlan Val14 benchmark, marking the first learning-based method to surpass 90 without prior knowledge [34][40].
- It introduces fine-grained trajectory tokenization to enhance local feature extraction while maintaining motion continuity [34][40].
- The architecture employs adaptive layer normalization and scale-adaptive attention to filter redundant information and strengthen key interaction information extraction [34][40].

Group 5: CymbaDiff
- The CymbaDiff model defines a new task for sketch-based 3D outdoor semantic scene generation, achieving a FID of 40.74 on the Sketch-based SemanticKITTI dataset [44][47].
- It introduces a large-scale benchmark dataset, SketchSem3D, for evaluating 3D semantic scene generation [44][47].
- The model employs a Cylinder Mamba diffusion mechanism to enhance spatial coherence and local neighborhood relationships [44][47].

Group 6: DriveCritic
- The DriveCritic framework utilizes vision-language models for context-aware evaluation of autonomous driving, achieving a 76.0% accuracy in human preference alignment tasks [55][58].
- It addresses limitations of existing evaluation metrics by focusing on context sensitivity and human alignment in nuanced driving scenarios [55][58].
- The framework demonstrates superior performance compared to traditional metrics, providing a reliable solution for human-aligned evaluation in autonomous driving [55][58].
Autonomous Driving Paper Digest | End-to-End, Segmentation, Trajectory Planning, Simulation, and More
自动驾驶之心· 2025-08-09 13:26
Core Insights
- The article discusses advancements in autonomous driving technologies, highlighting various frameworks and their contributions to improving safety, efficiency, and robustness in real-world scenarios.

Group 1: DRIVE Framework
- The DRIVE framework, proposed by Stanford University and Microsoft, integrates dynamic rule inference and verified evaluation for constraint-aware autonomous driving, achieving a 0.0% soft constraint violation rate and enhancing trajectory smoothness and generalization capabilities [2][6].

Group 2: Hybrid Learning-Optimization Framework
- A hybrid learning-optimization trajectory planning framework developed by Beijing Jiaotong University and Hainan University achieves a 97% success rate and real-time planning performance of 54 milliseconds in highway scenarios [11][12].

Group 3: RoboTron-Sim
- The RoboTron-Sim framework, developed by Meituan and Sun Yat-sen University, enhances the robustness of autonomous driving in extreme scenarios, achieving a 51.3% reduction in collision rates and a 51.5% improvement in trajectory accuracy on the nuScenes test [18][20].

Group 4: SAV Framework
- The SAV framework proposed by Anhui University achieves high-precision vehicle part segmentation with an 81.23% mean Intersection over Union (mIoU) on the VehicleSeg10K dataset, surpassing previous best methods by 4.33% [34][40].
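For readers unfamiliar with the mIoU figure quoted for SAV, the metric can be sketched in a few lines of Python. This follows the conventional definition (per-class intersection over union, averaged over classes that occur in the prediction or the ground truth); the function name `mean_iou` and the flat label lists are illustrative assumptions, and the exact evaluation protocol used on VehicleSeg10K may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for two equally long lists of class ids.

    Classes that appear in neither list are skipped, a common convention
    that avoids dividing by zero.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) occurs in the inputs")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Six pixels, three part classes (toy labels, not real data).
    predicted = [0, 0, 1, 1, 2, 2]
    ground_truth = [0, 1, 1, 1, 2, 0]
    print(mean_iou(predicted, ground_truth, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5; a real segmentation benchmark would flatten per-pixel label maps into the same form and usually accumulate counts over the whole dataset before dividing.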