End-to-End Autonomous Driving

The last remaining senior executive of a new-force automaker's intelligent-driving team has recently left
自动驾驶之心· 2025-08-23 16:03
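According to multiple sources, W, the head of mass-production R&D on the intelligent-driving team at a leading new-force automaker, left the company this Friday. Earlier, after another team lead departed, W and the head of intelligent driving were rumored online to have entered a "honeymoon period," even changing their Feishu signatures to match, so few expected another change this soon. W was the last remaining executive of the team's core "troika" leadership and contributed greatly to the automaker's "leapfrog" in intelligent driving in 2024. As recently as June this year, W led the team through a period of closed-door development. The team is the second-largest within the intelligent-driving organization, at nearly 250 people. Last year the automaker's intelligent-driving team went through a large round of staff cuts, and the R&D team led by W was the most heavily affected in the whole organization. The automaker's leap in intelligent driving owes much to a fairly aggressive technical strategy: being the first to invest heavily in end-to-end. It expanded the team by more than a thousand people within two years and pushed end-to-end into mass production regardless of cost, and its mass-production engineers became prime recruiting targets for intelligent-driving companies across China. Since the start of this year, the team has seen large-scale departures, with attrition above 50% in some teams. To stem further losses, the automaker has been forced to impose non-compete agreements on all staff; even campus hires with only one or two years at the company must sign. Some time ago ...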
VLA for mass production! FastDriveVLA: a plug-and-play pruning module with nearly 4x faster inference
自动驾驶之心· 2025-08-23 16:03
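Today 自动驾驶之心 shares the latest work from Peking University and XPeng Motors. FastDriveVLA: adversarial visual-token pruning that retains 97.3% of performance at a 50% compression rate. Paper authors | Jiajun Cao et al. Editor | 自动驾驶之心 Preface and the author's take: In recent years, end-to-end autonomous driving research has advanced rapidly, and every company is busy promoting its own end-to-end solution. Unlike the traditional modular pipeline (perception → prediction → planning), an end-to-end method carries out the whole path from perception to planning in a single model, which reduces information loss between modules and, in some respects, simplifies the system architecture. Progress has not stopped there: as vision-language models (VLMs) have shown striking reasoning ability on visual question answering, many researchers and algorithm teams have begun extending them to embodied intelligence and autonomous driving, adding action generation to form vision-language-action (VLA) models. Compared with the traditional modular pipeline, VLA models, in complex scene understanding and ...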
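The idea of pruning visual tokens at a fixed compression rate, as named in the FastDriveVLA summary above, can be illustrated with a generic top-k selection. This is only a minimal sketch under stated assumptions: it is not FastDriveVLA's adversarial method, the importance scores are assumed to be given, and `select_tokens`, `keep_ratio`, and the sample values are hypothetical names introduced for illustration.

```python
import math
from typing import List, Sequence


def select_tokens(scores: Sequence[float], keep_ratio: float = 0.5) -> List[int]:
    """Return the indices of the highest-scoring tokens, in original order.

    `scores` holds one hypothetical importance score per visual token;
    how those scores are produced is outside this sketch.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not scores:
        return []
    n_keep = max(1, math.ceil(len(scores) * keep_ratio))
    # Rank indices by descending score; ties go to the earlier token.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    # Restore the original order so positional structure is preserved.
    return sorted(ranked[:n_keep])


tokens = ["sky", "lane", "tree", "car"]  # stand-ins for token embeddings
scores = [0.1, 0.9, 0.4, 0.7]            # hypothetical importance scores
kept = select_tokens(scores, keep_ratio=0.5)
pruned = [tokens[i] for i in kept]
```

With `keep_ratio=0.5`, half of the tokens are passed on to the language model; the rest are dropped before the expensive layers run, which is where any inference speedup would come from.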
Helped another student land an autonomous-driving algorithm job......
自动驾驶之心· 2025-08-23 14:44
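Recently a student about to start the third year of a master's program came to 柱哥 to vent: labmates are all switching to embodied intelligence, or planning to target big internet companies working on large models and agents, while this student is still doing autonomous-driving algorithms. Since last year the industry has seen a steady stream of layoff news, and with autumn recruitment coming next year the student feels somewhat lost. At first everyone worked on perception, but paths have gradually diverged. The question was whether to stay in intelligent driving or switch fields. The student had only discovered our 自动驾驶之心 community in the past couple of days, found it very well organized, and worried it might be too late. "It is never too late." Besides, there is still time to focus on directions with higher technical barriers, such as VLA or end-to-end, from which moving into large models or embodied intelligence later is also easier, so there is no need to worry too much. What matters most is broadening and solidifying your technical stack as soon as possible. If you lack strong self-study and problem-searching skills, you can join our autonomous-driving community, the 自动驾驶之心 Knowledge Planet (知识星球), currently the largest and most complete autonomous-driving learning platform in China. The 自动驾驶之心 Knowledge Planet combines videos, illustrated articles, learning roadmaps, Q&A, and job-hunting exchange in one comprehensive autonomous-driving community of more than 4,000 members. We hope to reach nearly ten thousand members within the next two years and to build a hub for exchange and technical sharing that many beginners and advanced learners visit regularly. The community also regularly answers practical questions: How do you get started with end-to-end? How do you learn multimodal large models for autonomous driving? The learning ... for autonomous-driving VLA ...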
Still not sure how to get started on a VLA paper? Some students already have a CCF-A publication......
自动驾驶之心· 2025-08-22 12:00
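Li Auto's VLA driver large model is now in vehicles! Judging from the launch event, the improvement in VLA capability shows in three areas: better semantic understanding (multimodal input), stronger reasoning (chain of thought), and driving that is closer to human intuition (trajectory planning). The launch event demonstrated four core capabilities: spatial understanding, reasoning, communication and memory, and action. I. VLA research paper coaching topics are here ⭐ Of these, reasoning and communication and memory come from the language model, and memory additionally uses RAG. Below is a demo of the chain-of-thought output of Li Auto's VLA driver model, which combines dynamic objects, static elements, the navigation map, spatial understanding, and more. Without doubt, VLA is now the direction drawing the most attention in autonomous driving, in both academia and industry. VLA evolved from VLM + E2E and spans several frontier technology stacks, including end-to-end, trajectory prediction, vision-language models, and reinforcement learning. Traditional work such as BEV perception, lane detection, and occupancy now appears relatively rarely at top conferences, and many students have recently asked 柱哥: can one still publish on traditional perception and planning? It feels as if most of the work has already been done; will reviewers give high scores? On traditional perception and planning tasks, industry is still optimizing its solutions! Academia, however, has largely been shifting toward large models and VLA, a field that still has many subareas with work left to do... We previously ran the first VLA paper coaching class, and the response was very good ...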
A new end-to-end paradigm! Fudan's VeteranAD: "perception as planning" sets new open-loop and closed-loop SOTA, surpassing DiffusionDrive~
自动驾驶之心· 2025-08-21 23:34
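Today 自动驾驶之心 shares the latest work from Fudan University and the Shanghai Innovation Institute (上海创新研究院): VeteranAD, a new end-to-end paradigm that moves from "perception-planning" to "perception as planning." Paper authors | Bozhou Zhang et al. Editor | 自动驾驶之心 End-to-end autonomous driving has made significant progress in recent years. It unifies multiple tasks in a single framework to avoid the information loss caused by multiple stages. In doing so, an end-to-end driving framework also forms a fully differentiable learning system that enables planning-oriented optimization. This design has allowed it to perform well on both open-loop and closed-loop planning tasks. Mainstream end-to-end methods usually follow a sequential paradigm: perception first, then planning, as shown in Figure 1(a). A common practice is to introduce a Transformer architecture so that the whole pipeline stays differentiable. However, differentiability alone is not enough to fully exploit the advantages of end-to-end planning optimization. ...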
Without efficient channels for technical and industry information, a lot of time gets wasted...
自动驾驶之心· 2025-08-21 23:34
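1. 自动驾驶之心 original live courses

| Series | Topics | Links |
| --- | --- | --- |
| (0) Survey roundup | Milestone survey of autonomous driving (slides and video) | Slides (PPT): https://t.zsxq.com/10LdsK3aw; video: https://t.zsxq.com/10PkSHh9w |
| (1) Perception and fusion video tutorials | 2D/3D detection, semantic segmentation, point-cloud segmentation, multimodal, multi-sensor | https://t.zsxq.com/0etfc3qsy |
| (2) Multi-sensor calibration video tutorials | | |
| (5) Autonomous-driving data engineering video tutorials | Auto-labeling, 4D labeling, data processing, data closed loop, and more | https://t.zsxq.com/0eXetSHAM |
| (6) 2D/3D object tracking video tutorials | 2D/3D object tracking, multi-sensor fusion, and more | https://t.zsxg.com/0eCgihNZR |
| (7) Autonomous-driving simulation video tutorials | Mainly covers ... | |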
The company announced team cuts; the people who understand end-to-end stayed...
自动驾驶之心· 2025-08-19 23:32
Core Viewpoint
- The article discusses the rapid evolution and challenges in the field of end-to-end autonomous driving technology, emphasizing the need for a comprehensive understanding of various algorithms and models to succeed in this competitive industry [2][4][6].

Group 1: Industry Trends
- The shift from modular approaches to end-to-end systems in autonomous driving aims to eliminate cumulative errors between modules, marking a significant technological leap [2].
- The emergence of various algorithms and models, such as UniAD and BEV perception, indicates a growing focus on integrating multiple tasks into a unified framework [4][9].
- The demand for knowledge in multi-modal large models, reinforcement learning, and diffusion models is increasing, reflecting the industry's need for versatile skill sets [5][20].

Group 2: Learning Challenges
- New entrants face difficulties due to the fragmented nature of knowledge and the overwhelming volume of research papers in the field, often leading to early abandonment of learning [5][6].
- The lack of high-quality documentation and practical guidance further complicates the transition from theory to practice in end-to-end autonomous driving research [5][6].

Group 3: Course Offerings
- A new course titled "End-to-End and VLA Autonomous Driving" has been developed to address the learning challenges, focusing on practical applications and theoretical foundations [6][24].
- The course is structured to provide a comprehensive understanding of end-to-end algorithms, including their historical development and current trends [11][12].
- Practical components, such as real-world projects and assignments, are included to ensure that participants can apply their knowledge effectively [8][21].

Group 4: Course Content Overview
- The course covers various topics, including the introduction to end-to-end algorithms, background knowledge on relevant technologies, and detailed explorations of both one-stage and two-stage end-to-end methods [11][12][13].
- Specific chapters focus on advanced topics like world models and diffusion models, which are crucial for understanding the latest advancements in autonomous driving [15][17][20].
- The final project involves practical applications of reinforcement learning from human feedback (RLHF), allowing participants to gain hands-on experience [21].
The starting point of end-to-end VLA: a look at large language models and CLIP~
自动驾驶之心· 2025-08-19 07:20
Core Viewpoint
- The article discusses the development and significance of end-to-end (E2E) algorithms in autonomous driving, emphasizing the integration of various advanced technologies such as large language models (LLMs), diffusion models, and reinforcement learning (RL) in enhancing the capabilities of autonomous systems [21][31].

Summary by Sections

Section 1: Overview of End-to-End Autonomous Driving
- The first chapter provides a comprehensive overview of the evolution of end-to-end algorithms, explaining the transition from modular approaches to end-to-end solutions, and discussing the advantages and challenges of different paradigms [40].

Section 2: Background Knowledge
- The second chapter focuses on the technical stack associated with end-to-end systems, detailing the importance of LLMs, diffusion models, and reinforcement learning, which are crucial for understanding the future job market in this field [41][42].

Section 3: Two-Stage End-to-End Systems
- The third chapter delves into two-stage end-to-end systems, exploring their emergence, advantages, and disadvantages, while also reviewing notable works in the field such as PLUTO and CarPlanner [42][43].

Section 4: One-Stage End-to-End and VLA
- The fourth chapter highlights one-stage end-to-end systems, discussing various subfields including perception-based methods and the latest advancements in VLA (Vision-Language-Action) models, which are pivotal for achieving the ultimate goals of autonomous driving [44][50].

Section 5: Practical Application and RLHF Fine-Tuning
- The fifth chapter includes a major project focused on RLHF (Reinforcement Learning from Human Feedback) fine-tuning, providing practical insights into building pre-training and reinforcement learning modules, which are applicable to VLA-related algorithms [52].

Course Structure and Learning Outcomes
- The course aims to equip participants with a solid understanding of end-to-end autonomous driving technologies, covering essential frameworks and methodologies, and preparing them for roles in the industry [56][57].
Surpassing DiffusionDrive across the board, GMF-Drive: the world's first Mamba-based end-to-end SOTA solution
理想TOP2· 2025-08-18 12:43
Core Insights
- The article discusses the advancements in end-to-end autonomous driving, emphasizing the importance of multi-modal fusion architectures and the introduction of GMF-Drive as a new framework that improves upon existing methods [3][4][44].

Group 1: End-to-End Autonomous Driving
- End-to-end autonomous driving has gained widespread acceptance as it directly maps raw sensor inputs to driving actions, reducing reliance on intermediate representations and information loss [3].
- Recent models like DiffusionDrive and GoalFlow demonstrate strong capabilities in generating diverse and high-quality driving trajectories [3].

Group 2: Multi-Modal Fusion Challenges
- A key bottleneck in current systems is the integration of heterogeneous inputs from different sensors, with existing methods often relying on simple feature concatenation rather than structured information integration [4][6].
- The article highlights that current multi-modal fusion architectures, such as TransFuser, show limited performance improvements compared to single-modal architectures, indicating a need for more sophisticated integration methods [6].

Group 3: GMF-Drive Overview
- GMF-Drive, developed by teams from the University of Science and Technology of China and the China University of Mining and Technology, includes three modules aimed at enhancing multi-modal fusion for autonomous driving [7].
- The framework combines a gated Mamba fusion approach with spatial-aware BEV representation, addressing the limitations of traditional transformer-based methods [7][44].

Group 4: Innovations in Data Representation
- The article introduces a 14-dimensional pillar representation that retains critical 3D geometric features, enhancing the model's perception capabilities [16][19].
- This representation captures local surface geometry and height variations, allowing the model to differentiate between objects with similar point densities but different structures [19].

Group 5: GM-Fusion Module
- The GM-Fusion module integrates multi-modal features through gated channel attention, BEV-SSM, and hierarchical deformable cross-attention, achieving linear complexity while maintaining long-range dependency modeling [19][20].
- The module's design allows for effective spatial dependency modeling and improved feature alignment between camera and LiDAR data [19][40].

Group 6: Experimental Results
- GMF-Drive achieved a PDMS score of 88.9 on the NAVSIM benchmark, outperforming the previous best model, DiffusionDrive, by 0.8 points, demonstrating the effectiveness of the GM-Fusion architecture [29][30].
- The framework also showed significant improvements in key sub-metrics, such as drivable area compliance (DAC) and ego progress (EP), indicating enhanced safety and efficiency [30][31].

Group 7: Conclusion
- The article concludes that GMF-Drive represents a significant advancement in autonomous driving frameworks by effectively combining geometric representations with spatially aware fusion techniques, achieving new performance benchmarks [44].
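To make the "gated" part of gated channel fusion concrete, here is a minimal, generic sketch of per-channel gating between a camera feature vector and a LiDAR feature vector. It is an illustration under stated assumptions, not the GM-Fusion module itself: the per-channel linear gate, the function name `gated_channel_fusion`, and the parameters `w_cam`, `w_lidar`, and `bias` are all hypothetical, and BEV-SSM and deformable cross-attention are not modeled.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_channel_fusion(cam: Sequence[float],
                         lidar: Sequence[float],
                         w_cam: Sequence[float],
                         w_lidar: Sequence[float],
                         bias: Sequence[float]) -> List[float]:
    """Blend two feature vectors channel by channel with a learned gate.

    For each channel, gate = sigmoid(w_cam * cam + w_lidar * lidar + bias),
    and the output is gate * cam + (1 - gate) * lidar.
    """
    if not (len(cam) == len(lidar) == len(w_cam) == len(w_lidar) == len(bias)):
        raise ValueError("all inputs must have the same number of channels")
    fused = []
    for c, l, wc, wl, b in zip(cam, lidar, w_cam, w_lidar, bias):
        gate = sigmoid(wc * c + wl * l + b)  # hypothetical per-channel gate
        fused.append(gate * c + (1.0 - gate) * l)
    return fused


# With zero weights and bias the gate is 0.5, so the result is a plain average.
averaged = gated_channel_fusion([1.0, 2.0], [3.0, 6.0],
                                [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

The contrast with simple feature concatenation is that the gate lets the model decide, per channel, how much to trust each sensor instead of passing both through unchanged.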
Which technical directions does autonomous driving focus on now? How should you get started?
自动驾驶之心· 2025-08-14 23:33
Core Viewpoint
- The article emphasizes the establishment of a comprehensive community for autonomous driving, aiming to bridge communication between enterprises and academic institutions, while providing resources and support for individuals interested in the field [1][12].

Group 1: Community and Resources
- The community has organized over 40 technical routes, offering resources for both beginners and advanced researchers in autonomous driving [1][13].
- Members include individuals from renowned universities and leading companies in the autonomous driving sector, fostering a collaborative environment for knowledge sharing [13][21].
- The community provides a complete entry-level technical stack and roadmap for newcomers, as well as valuable industry frameworks and project proposals for those already engaged in research [7][9].

Group 2: Learning and Development
- The community offers a variety of learning routes, including perception, simulation, and planning control, to facilitate quick onboarding for newcomers and further development for those already familiar with the field [13][31].
- There are numerous open-source projects and datasets available, covering areas such as 3D object detection, BEV perception, and world models, which are essential for practical applications in autonomous driving [27][29][35].

Group 3: Job Opportunities and Networking
- The community actively shares job postings and career opportunities, helping members connect with potential employers in the autonomous driving industry [11][18].
- Members can engage in discussions about career choices and research directions, receiving guidance from experienced professionals in the field [77][80].

Group 4: Technical Discussions and Innovations
- The community hosts discussions on cutting-edge topics such as end-to-end driving, multi-modal models, and the integration of various technologies in autonomous systems [20][39][42].
- Regular live sessions with industry leaders are conducted, allowing members to gain insights into the latest advancements and practical applications in autonomous driving [76][80].