Trip-Level Simulation

SceneDiffuser++: City-Scale Traffic Simulation Based on a Generative World Model (CVPR'25)
自动驾驶之心· 2025-07-21 11:18
Core Viewpoint
- The article discusses the development of SceneDiffuser++, a generative world model that enables city-scale traffic simulation, addressing the unique challenges of trip-level simulation compared to event-level simulation [1][2].

Group 1: Introduction and Background
- The primary goal of traffic simulation is to supplement limited real-world driving data with extensive synthetic simulation mileage to support the testing and validation of autonomous driving systems [1].
- An ideal generative simulation city (CitySim) should seamlessly simulate a complete journey from point A to point B, managing dynamic elements such as vehicles, pedestrians, and traffic lights [1].

Group 2: Technical Integration
- Achieving CitySim requires the integration of multiple technologies, including scene generation, agent behavior modeling, occlusion reasoning, dynamic scene generation, and environmental simulation [2].
- SceneDiffuser++ is the first end-to-end generative world model that consolidates these requirements through a single loss function, enabling complete simulation from A to B [2].

Group 3: Core Challenges and Innovations
- Trip-level simulation faces three unique challenges compared to event-level simulation, including the need for dynamic agent management, occlusion reasoning, and environmental dynamics [3].
- SceneDiffuser++ introduces innovations such as multi-tensor diffusion, soft clipping strategies, and unified generative modeling to address these challenges [4][5].

Group 4: Methodology and Model Details
- SceneDiffuser++ represents scenes as scene tensors, allowing the model to handle dynamic changes in heterogeneous elements like agents and traffic lights simultaneously [7].
- The model employs a diffusion process for training and inference, focusing on effective feature learning through loss masking and soft clipping to stabilize sparse tensor generation [8][9].
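
The loss masking mentioned under Group 4 can be illustrated with a small sketch. This is not the paper's implementation: it assumes a scene tensor of shape [agents, timesteps, channels] whose last channel is a validity flag, and the function name `masked_denoising_loss` and the channel layout are hypothetical. The idea shown is that the validity channel is supervised at every entry, while feature channels contribute to the loss only where the ground-truth entry is valid, so absent agents do not pull the features toward arbitrary values.

```python
import numpy as np

def masked_denoising_loss(pred, target):
    """Mean squared error over a sparse scene tensor.

    pred, target: arrays of shape [agents, timesteps, channels].
    Assumption (hypothetical layout): the last channel is a validity
    flag, 1 = element present, 0 = element absent.
    """
    valid = target[..., -1:] > 0.5           # [A, T, 1] boolean mask
    sq_err = (pred - target) ** 2

    # Validity channel: supervised everywhere, so the model can learn
    # when agents enter and leave the scene.
    validity_loss = sq_err[..., -1].mean()

    # Feature channels: only entries valid in the ground truth count.
    feat_err = sq_err[..., :-1] * valid
    n_feat = valid.sum() * (target.shape[-1] - 1)
    feature_loss = feat_err.sum() / max(n_feat, 1)

    return validity_loss + feature_loss
```

With this masking, a prediction that is wrong only in the feature channels of absent agents incurs no penalty, which is one way to keep training focused on the elements that actually exist in the scene.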
Group 5: Performance Evaluation
- Experiments based on the WOMD-XLMap dataset demonstrate that SceneDiffuser++ outperforms previous models in all metrics, achieving lower Jensen-Shannon divergence values for agent generation and removal [12].
- The model maintains agent dynamics and traffic light realism over a 60-second simulation, contrasting with previous models that exhibited stagnation [15].

Group 6: Conclusion and Significance
- The core contributions of SceneDiffuser++ include the introduction of the CitySim concept, the design of a unified generative framework, and the resolution of stability issues in dynamic scene generation through sparse tensor learning and soft clipping [19].
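
For readers unfamiliar with the Jensen-Shannon divergence used in the performance evaluation, the following is a minimal sketch of how it can be computed between two histograms, for example a histogram of simulated agent-spawn counts against one from logged data. Which quantities the paper actually histograms is not stated here, so the inputs are hypothetical; the function name `js_divergence` is likewise illustrative. With base-2 logarithms the value lies in [0, 1], where 0 means identical distributions.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two histograms.

    p, q: non-negative arrays of equal length; they are normalized
    to sum to 1 before comparison.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # KL(a || b), skipping bins where a is zero (0 * log 0 = 0).
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A lower value means the simulated distribution is closer to the logged one, which is the sense in which lower divergence for agent generation and removal indicates more realistic behavior.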