Three Years of Working on Autonomous Driving at Horizon Robotics

Core Viewpoint - The article discusses the transition from autonomous driving to embodied intelligence and highlights how the two fields differ in their challenges and solutions. Although the focus has shifted to embodied intelligence, it stresses that past experience in autonomous driving is still worth documenting.

Research Areas Summary - The main research areas are 3D fusion perception, trajectory prediction, end-to-end motion planning, sensor simulation, traffic flow simulation, and foundation models for intelligent driving. These areas are interconnected and together aim to build a comprehensive autonomous driving algorithm system [2][5].

1. Sparse4D Series: Multi-Sensor Fusion Perception Framework - The Sparse4D series aims to improve perception performance by using sparse queries and projection sampling from multi-view images, which avoids the computational cost of BEV (Bird's Eye View) methods. Sparse4D v1 introduced deformable aggregation for sparse fusion, and v2 reduced the complexity of temporal fusion from O(T) to O(1) [6][9]. Sparse4D v3 further improved detection and tracking, reaching top rankings on camera-only detection and tracking leaderboards [11][13].

2. SparseDrive: End-to-End Planning Attempt - SparseDrive integrates online mapping and motion planning and covers five tasks: detection, tracking, mapping, prediction, and planning. The article notes two open concerns: its planning decoder is simple, and its performance still needs closed-loop evaluation [13][15].

3. EDA & UniMM: Trajectory Prediction and Traffic Flow Simulation - EDA (Evolving and Distinct Anchors) addresses the core issue of how anchors and samples are allocated in trajectory prediction, which improves model convergence. UniMM unifies existing traffic simulation models, proposes a general algorithm framework, and addresses the key factors that affect performance [16][20].

4. DriveCamSim: Sensor Simulation - DriveCamSim focuses on building a highly controllable sensor simulation system for evaluating autonomous driving models efficiently. It emphasizes the need for a simulation system that accurately reflects model performance without relying solely on real-world testing [22][24].

5. LATR: Foundational Model for Intelligent Driving - LATR aims to build a robust foundation model for intelligent driving using large-scale data and a large parameter count. It uses a masking strategy for unsupervised training and integrates multiple tasks into a unified framework, which the article reports as performing effectively [26][27].

Conclusion and Outlook - The seven modules together form the core pipeline of the autonomous driving system, which the article takes as an indication that the technical path is correct. It suggests that future work should focus on efficient evaluation systems and on the potential of reinforcement learning to improve model performance [30][31].
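The projection sampling mentioned in the Sparse4D summary can be illustrated with a small sketch. This is not the Sparse4D implementation. The function names, the 3x4 projection matrices, the single-channel feature maps, and the plain averaging across views (standing in for learned aggregation weights) are all assumptions made for illustration. The idea shown is only that a sparse 3D point is projected into each camera view and features are read at the projected pixel, so no dense BEV grid has to be built.

```python
from typing import Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project_point(point_3d: Sequence[float], proj: Matrix) -> Optional[Tuple[float, float]]:
    """Project a 3D point to pixel coordinates with a 3x4 camera matrix.

    Returns None when the point lies behind the camera.
    """
    x, y, z = point_3d
    homo = (x, y, z, 1.0)
    u, v, depth = (sum(row[i] * homo[i] for i in range(4)) for row in proj)
    if depth <= 1e-6:
        return None
    return u / depth, v / depth


def bilinear_sample(feature_map: Matrix, u: float, v: float) -> Optional[float]:
    """Bilinearly interpolate a single-channel feature map at pixel (u, v)."""
    height, width = len(feature_map), len(feature_map[0])
    if not (0.0 <= u <= width - 1 and 0.0 <= v <= height - 1):
        return None  # the point projects outside this view
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    top = feature_map[v0][u0] * (1 - du) + feature_map[v0][u1] * du
    bottom = feature_map[v1][u0] * (1 - du) + feature_map[v1][u1] * du
    return top * (1 - dv) + bottom * dv


def sample_query_feature(
    point_3d: Sequence[float],
    projections: Sequence[Matrix],
    feature_maps: Sequence[Matrix],
) -> Optional[float]:
    """Average the features sampled from every view that sees the point.

    Plain averaging is an assumption; a real model would learn the weights.
    """
    samples = []
    for proj, fmap in zip(projections, feature_maps):
        pixel = project_point(point_3d, proj)
        if pixel is None:
            continue
        value = bilinear_sample(fmap, *pixel)
        if value is not None:
            samples.append(value)
    return sum(samples) / len(samples) if samples else None


# Toy camera: focal length 1, principal point at pixel (1, 1).
toy_proj = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 1.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]
toy_fmap = [[0.0, 1.0, 2.0],
            [3.0, 4.0, 5.0],
            [6.0, 7.0, 8.0]]
print(sample_query_feature([0.5, 0.0, 1.0], [toy_proj], [toy_fmap]))  # prints 4.5
```

In this toy setup the point projects to pixel (1.5, 1.0), halfway between the feature values 4.0 and 5.0, so the sampled value is 4.5. The cost scales with the number of queries and views rather than with the size of a dense grid, which is the property the summary attributes to the sparse approach.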