用QA问答详解端到端落地：[UniAD/PARA-Drive/SpareDrive/VADv2]

Core Viewpoint - The article discusses various end-to-end models in autonomous driving, focusing on their architectures and functionalities, particularly the UniAD framework and its modular components for perception, prediction, and planning [4][13]. Group 1: End-to-End Models - End-to-end models are categorized into two types: completely black-box models like OneNet, which optimize the planner directly, and modular end-to-end models that reduce error accumulation through interactions between perception, prediction, and planning modules [3]. - The UniAD framework consists of four main parts: multi-view camera input, backbone for BEV feature extraction, perception for scene-level understanding, and prediction for multi-mode trajectory forecasting [4]. Group 2: Specific Model Architectures - TrackFormer utilizes three types of queries: detection, tracking, and ego queries, with a dynamic length for the tracking query set based on object disappearance [6]. - MotionFormer operates similarly to RNN structures, processing sequential blocks to predict future states based on previous outputs, focusing on agent-level knowledge [9]. - MapFormer employs Panoptic Segformer for environment segmentation, distinguishing between countable instances and uncountable elements [10]. Group 3: Advanced Techniques - PARA-Drive modifies the UniAD framework by adjusting the connections between perception, prediction, and planning modules, allowing for parallel training and improved inference speed [13]. - Symmetric sparse perception is divided into two parallel parts for agent detection and map perception, utilizing a DETR paradigm for both tasks [20]. - The planning transformer integrates various tokens to output action probabilities, selecting the most probable action based on human trajectory data [23]. Group 4: Community and Learning Resources - The article highlights the establishment of numerous technical discussion groups related to autonomous driving, covering over 30 learning paths and involving nearly 300 companies and research institutions [27][28].