SparseDrive
Three Years of Working on Autonomous Driving at Horizon
自动驾驶之心· 2025-11-24 00:03
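Author | candywisdom  Editor | 自动驾驶之心
Original link: https://zhuanlan.zhihu.com/p/1970953355355469364

It has been a year since I moved from autonomous driving to embodied intelligence, and I have never properly summarized my series of autonomous driving projects or my personal reflections on them. (P.S. Broadly speaking, autonomous driving is a subfield of embodied intelligence, but at this stage the problems the two face and the concrete ways of solving them still differ considerably, so I count this as a move into a new direction.)

For the foreseeable future my main effort will probably not go into autonomous driving, but I still feel this period deserves a record. Not that the work was "earth-shattering"; some of it was exploration that was "low-profile but solid". It may never have trended, but I believe it solved real problems, and I hope it can serve as a reference for people working in related directions.

1. Starting from object detection and gradually extending toward end-to-end planning, to build a strong on-device policy;
2. For end-to-end models, closed-loop evaluation and 训 ...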
Three Years of Working on Autonomous Driving at Horizon
自动驾驶之心· 2025-11-11 00:00
Core Viewpoint
- The article discusses the author's transition from autonomous driving to embodied intelligence and highlights how the challenges and solutions differ between the two fields. Although the author's focus has shifted to embodied intelligence, the article stresses the value of documenting the earlier autonomous driving work.

Research Areas Summary
- The main research areas are 3D fusion perception, trajectory prediction, end-to-end motion planning, sensor simulation, traffic flow simulation, and foundational models for intelligent driving. These areas are interconnected and together aim to form a complete autonomous driving algorithm system [2][5].

1. Sparse4D Series: Multi-Sensor Fusion Perception Framework
- The Sparse4D series improves perception performance by using sparse queries and projection sampling from multi-view images, which avoids the computational cost of BEV (Bird's Eye View) methods. Sparse4D v1 introduced deformable aggregation for sparse fusion, and v2 reduced the complexity of temporal fusion from O(T) to O(1) [6][9]. Sparse4D v3 further strengthened detection and tracking, reaching top rankings on camera-only detection and tracking leaderboards [11][13].

2. SparseDrive: End-to-End Planning Attempt
- SparseDrive adds online mapping and motion planning, covering five tasks: detection, tracking, mapping, prediction, and planning. The article raises concerns about the simplicity of its planning decoder and notes the need for closed-loop performance evaluation [13][15].

3. EDA & UniMM: Trajectory Prediction and Traffic Flow Simulation
- EDA (Evolving and Distinct Anchors) addresses the core issue of assigning samples to anchors in trajectory prediction, which improves model convergence. UniMM unifies existing traffic simulation models under a general algorithm framework and examines the key factors that affect performance [16][20].

4. DriveCamSim: Sensor Simulation
- DriveCamSim focuses on building a highly controllable sensor simulation system for evaluating autonomous driving models efficiently. It emphasizes the need for a simulation system that accurately reflects model performance without relying solely on real-world testing [22][24].

5. LATR: Foundational Model for Intelligent Driving
- LATR aims to build a robust foundational model for intelligent driving using large-scale data and a large parameter count. It uses a masking strategy for unsupervised training and integrates multiple tasks into a unified framework, with effective results reported [26][27].

Conclusion and Outlook
- The seven modules together form the core pipeline of the autonomous driving system, suggesting that the technical path is sound. The article suggests that future work should focus on efficient evaluation systems and on the potential of reinforcement learning to improve model performance [30][31].
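
The O(T) to O(1) change in temporal fusion mentioned for Sparse4D v2 can be illustrated with a toy comparison. This is only a sketch under stated assumptions, not Sparse4D's actual mechanism: scalar values stand in for image features, an exponential moving average stands in for recurrent propagation of instance features, and the function names `multi_frame_fusion`, `recurrent_fusion`, and `compare_costs` are hypothetical. It shows only that re-reading T cached frames costs T reads per step, while updating one carried state costs a single read.

```python
from collections import deque


def multi_frame_fusion(history, query_weight):
    """Fuse by re-reading every cached frame: cost grows with len(history)."""
    frames_read = 0
    total = 0.0
    for frame_feature in history:
        total += query_weight * frame_feature
        frames_read += 1
    return total / len(history), frames_read


def recurrent_fusion(state, frame_feature, keep=0.5):
    """Fuse by updating one carried state: cost is constant per step.

    Assumption: a moving average is used here as a simple stand-in
    for recurrent feature propagation.
    """
    new_state = keep * state + (1.0 - keep) * frame_feature
    return new_state, 1  # only the current frame is read


def compare_costs(num_frames):
    """Return the frames read at the last step by each scheme."""
    history = deque(maxlen=num_frames)
    state = 0.0
    multi_reads = recurrent_reads = 0
    for t in range(num_frames):
        feature = float(t)  # hypothetical scalar stand-in for image features
        history.append(feature)
        _, multi_reads = multi_frame_fusion(history, query_weight=1.0)
        state, recurrent_reads = recurrent_fusion(state, feature)
    return multi_reads, recurrent_reads


print(compare_costs(8))  # prints (8, 1)
```

With a window of 8 frames, the multi-frame scheme reads 8 frames at the final step while the recurrent scheme reads 1, which is the sense in which per-step cost no longer depends on T.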
End-to-End Series! SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
自动驾驶之心· 2025-06-23 11:34
Core Viewpoint
- The article discusses the limitations of existing end-to-end methods in autonomous driving, in particular the computational cost of BEV paradigms and the inefficiency of running prediction and planning sequentially. It proposes a new sparse paradigm in which prediction and planning are processed in parallel [2][5].

Group 1: SparseDrive Methodology
- SparseDrive adopts the core ideas of Horizon's earlier Sparse series, focusing on sparse scene representation for autonomous driving [3].
- The method builds on the similarity between motion prediction and planning and introduces a hierarchical planning selection strategy [5].
- The architecture includes symmetric sparse perception and a parallel motion planner [5].

Group 2: Training and Performance
- The training loss for SparseDrive is defined as a combination of detection, mapping, motion, planning, and depth losses [9].
- In performance comparisons, SparseDrive-S achieves a mean Average Precision (mAP) of 0.418 and SparseDrive-B reaches 0.496, outperforming other methods such as UniAD [11].
- In motion prediction and planning, SparseDrive-S and SparseDrive-B show significant improvements over traditional methods on metrics such as minADE and minFDE [18].

Group 3: Efficiency Comparison
- SparseDrive is more efficient in both training and inference, requiring only 15.2 GB of GPU memory and reaching 9.0 FPS at inference, compared with 50.0 GB and 1.8 FPS for UniAD [20].
- The reduced computational requirements make the method more practical for real-time autonomous driving applications [20].

Group 4: Course and Learning Opportunities
- The article promotes a course on end-to-end autonomous driving algorithms, covering foundational knowledge, practical implementations, and various algorithmic approaches [29][41].
- The course aims to equip participants with the skills needed to understand and implement end-to-end solutions in the autonomous driving industry [54][56].
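
The combined training loss described under Group 2 can be sketched as a weighted sum of the five named terms. This is a minimal illustration rather than the authors' implementation: the summary only says the loss is "a combination" of the five terms, so the unit weights, the key names in `LOSS_TERMS`, and the function `total_loss` are assumptions made for this example.

```python
# Hypothetical key names for the five loss terms named in the summary.
LOSS_TERMS = ("det", "map", "motion", "plan", "depth")


def total_loss(losses, weights=None):
    """Weighted sum over the five named loss terms."""
    if weights is None:
        # Assumption: unit weights; the summary does not state any weights.
        weights = {name: 1.0 for name in LOSS_TERMS}
    missing = [name for name in LOSS_TERMS if name not in losses]
    if missing:
        raise KeyError(f"missing loss terms: {missing}")
    return sum(weights[name] * losses[name] for name in LOSS_TERMS)


# Made-up per-task loss values, chosen only to exercise the function.
example = {"det": 1.0, "map": 0.5, "motion": 0.25, "plan": 0.25, "depth": 2.0}
print(total_loss(example))  # prints 4.0
```

Passing a `weights` dictionary lets one term be scaled down or switched off, for example setting the `depth` weight to 0.0 to see how much that term contributes to the total.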