End-to-End Autonomous Driving
QCraft's Latest, GuideFlow: A New Approach to End-to-End Trajectory Planning
自动驾驶之心· 2025-11-30 02:02
Core Insights
- The article discusses GuideFlow, a new planning framework that addresses the challenges of trajectory generation in end-to-end autonomous driving by incorporating explicit constraints and strengthening the model's optimization capability [3][11][49]
- GuideFlow integrates multiple conditional signals to guide the generation process, improving the robustness and safety of autonomous driving systems [11][49]

Summary by Sections

Background Review
- End-to-end autonomous driving (E2E-AD) has emerged as an attractive alternative to traditional modular approaches because the whole system can be trained jointly from data [9]
- Recent work has shifted from single-modal to multi-modal trajectory generation to better reflect the inherent uncertainty of real driving scenarios [9][10]

GuideFlow Framework
- GuideFlow explicitly models the flow matching process, which alleviates mode collapse and allows multiple guiding signals to be integrated flexibly [3][11]
- The framework combines flow matching with Energy-Based Model (EBM) training to improve the model's ability to satisfy physical constraints [3][11]

Experimental Results
- GuideFlow achieved state-of-the-art (SOTA) results on several benchmark datasets, most notably an Extended PDM Score (EPDMS) of 43.0 on the challenging NAVSIM dataset [3][34][37]
- The framework's collision rate was low, averaging 0.07% on the nuScenes dataset, which supports its safety claims [40][41]

Contributions and Innovations
- The article highlights three core strategies within GuideFlow: velocity field constraints, flow state constraints, and EBM flow optimization, which together improve trajectory feasibility and safety [11][28][31]
- A driving aggressiveness score allows the trajectory style to be adjusted dynamically at inference time, further improving the model's adaptability [33][49]

Conclusion
- GuideFlow represents a significant advance in trajectory planning for autonomous driving, embedding safety constraints into the generation process and performing robustly across several datasets [49]
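To make the idea of constraining the generation process itself more concrete, here is a minimal sketch of flow-matching-style sampling with an explicit bound applied to the velocity at every integration step. It is only an illustration: `toy_velocity` is a hypothetical closed-form stand-in for a learned network, and the per-component clamp with `max_speed` is an assumed constraint, not GuideFlow's actual formulation.

```python
import random


def toy_velocity(x, t, target):
    # Hypothetical stand-in for a learned velocity field: on the straight-line
    # path x_t = (1 - t) * x0 + t * x1, this velocity reaches `target` at t = 1.
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def clamp(v, limit):
    # Explicit constraint (assumed for illustration): bound each component.
    return [max(-limit, min(limit, vi)) for vi in v]


def sample(target, steps=20, max_speed=None, seed=0):
    """Euler-integrate the velocity field from noise (t = 0) toward t = 1."""
    rng = random.Random(seed)
    x0 = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = toy_velocity(x, k * dt, target)
        if max_speed is not None:
            v = clamp(v, max_speed)  # constraint applied inside the sampler
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x0, x


if __name__ == "__main__":
    print(sample([5.0, -3.0]))                 # unconstrained: ends at the target
    print(sample([5.0, -3.0], max_speed=1.0))  # constrained: bounded displacement
```

Without the clamp the sampler lands on the target; with it, each coordinate can move at most `max_speed` over the whole unit time interval, so the constraint holds by construction rather than being hoped for from training.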
QCraft's Latest! GuideFlow: A New End-to-End Trajectory Planning Approach That Surpasses a Range of SOTA Methods......
自动驾驶之心· 2025-11-26 00:04
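Paper authors | Lin Liu et al. Editor | 自动驾驶之心 This year, both academia and industry have put a great deal of effort into modeling the Action, that is, the output of the ego vehicle's trajectory. Earlier MLP heads could output only a single-mode trajectory, which in practice cannot meet downstream needs for handling uncertainty. So since last year we have seen many generative algorithms appear. After a year of development, generative approaches have further converged on two directions: Diffusion and Flow matching. 自动驾驶之心 has learned that in the first half of the year quite a few companies tried to bring these two methods into mass production, and the difficulties along the way need no elaboration. Today we share recent work from teams at Beijing Jiaotong University, QCraft, and others, which proposes GuideFlow, a new planning framework based on Constrained Flow Matching, with good overall results. Specifically, GuideFlow explicitly models the flow matching process, which inherently mitigates mode collapse and allows flexible guidance from multiple conditional signals. The core contribution is that explicit constraints are embedded directly into the flow matching generation process rather than relying on implicit constraint encoding. The key innovation is that GuideFlow combines flow matching with Ene ...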
Breton (博雷顿) Chairman Chen Fangming: Building Around "Intelligence" to Open the Door to "System-Level Intelligence" in Mines
Zheng Quan Ri Bao Wang· 2025-11-25 03:28
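Bai Furu, CTO of autonomous driving at Breton (博雷顿), told a reporter from Securities Daily (《证券日报》) that once mining trucks are electrified they will inevitably move toward autonomous driving, and that companies need to develop the autonomous driving technology in-house in order to integrate software and hardware deeply. The newly launched "9M145E dedicated driverless mining truck" includes a number of industry firsts in components, sensors, and communication methods, and was developed to improve vehicle reliability and availability. Taking the 9M145E as an example, the vehicle's operating, maintenance, energy-consumption, and anomaly data all flow into the intelligent dispatch system in real time, making operational risk, equipment load, and working conditions easier to monitor and anticipate. This "digital transparency" is becoming a new foundation for mine governance: it reduces human uncertainty, shrinks management blind spots, and gives mining companies a reliable basis for external safety and ESG disclosures. Judging from early practice, the technology performs well in terms of efficiency relative to human drivers. Chen Fangming said: "The overall efficiency of end-to-end autonomous driving is close to that of human drivers, and in some scenarios even exceeds it. At the same time, the technology can greatly reduce the number of mine truck drivers. The goal is for a single-digit team to manage autonomous fleets on the scale of a hundred vehicles, significantly reducing mine headcount and greatly improving overall operating efficiency." Reshaping the industry chain structure Since listing on the capital market, Breton Technology Co., Ltd. (博雷顿科技股份公司, hereafter "Breton") has clearly accelerated its development. Recently the company launched a new product, the "9M145E dedicated driverless mining truck", which takes driverless operation as the starting point to rebuild ...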
There Probably Isn't Much Time Left to Switch into End-to-End and VLA......
自动驾驶之心· 2025-11-25 00:03
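Over the past few months, many people have contacted 柱哥 for advice about their future, including people with two or three years of work experience as well as master's and even undergraduate students. When first entering the field they tend to run into many problems. The field has moved from modular production algorithms to end-to-end, and now to VLA. The core algorithms involve BEV perception, vision-language models (VLM), diffusion models, reinforcement learning, world models, and more. Studying end-to-end and VLA autonomous driving lets you master the most cutting-edge technical directions in academia and industry. Judging by current industry trends, end-to-end and VLA positions are close to saturation, and the remaining window is short...... Many students have asked how to get started with end-to-end and VLA quickly and efficiently. 自动驾驶之心 has therefore teamed up with experts from industry and academia to launch two courses: "End-to-End and VLA Autonomous Driving Small-Group Course" and "Autonomous Driving VLA and Large Models Hands-On Course". Course outline The Autonomous Driving VLA and Large Models Hands-On Course is led by experts from academia. It focuses on VLA, starting from VLM as an autonomous driving interpreter, moving through modular VLA and integrated VLA, to today's mainstream reasoning-enhanced VLA. It surveys these three major areas of autonomous driving VLA comprehensively and is well suited to those new to large models and VLA. The course also comes with a detailed review of theoretical foundations, including the three Vision/Language/Action modules, reinforcement learning, diffusion models, and other fundamentals, ...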
DiffRefiner, a Zhejiang University Paper Accepted to AAAI'26: A Two-Stage Trajectory Prediction Framework That Sets a New NAVSIM Record!
自动驾驶之心· 2025-11-25 00:03
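Editor | 自动驾驶之心 Paper authors | Liuhan Yin et al. Unlike discriminative methods in autonomous driving, which predict over a fixed set of candidate ego trajectories, generative methods such as diffusion models can learn the underlying distribution of future motion and so predict trajectories more flexibly. However, because these methods usually rely on denoising hand-designed trajectory anchors or random noise, their performance still has considerable room for improvement. A team from Zhejiang University and Nullmax proposes DiffRefiner, a new two-stage trajectory prediction framework. The first stage uses a Transformer-based proposal decoder that regresses from sensor inputs and uses predefined trajectory anchors to produce a coarse trajectory prediction. The second stage introduces a diffusion Refiner that iteratively denoises and refines the initial prediction. By incorporating a discriminative trajectory proposal module, the work gives the generative refinement process strong guidance and significantly improves diffusion-based planning performance. In addition, it designs a fine-grained denoising decoder to improve scene adaptability, achieving more accurate trajectory prediction through stronger alignment with the surrounding environment. Experiments show that DiffRefiner reaches state-of-the-art performance: on the NAVSIM v2 dataset it reaches 87.4 ...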
Three Major Technical Routes in Autonomous Driving: End-to-End, VLA, and World Models
自动驾驶之心· 2025-11-21 00:04
Overview
- The article discusses the ongoing technological competition in the autonomous driving industry, focusing on different approaches to handling corner cases and improving the safety and efficiency of driving systems [1][3].

Technological Approaches
- There is a debate between two main technological routes: single-vehicle intelligence (VLA) and intelligent networking (VLM) [1].
- Major companies such as Waymo use VLM, in which AI handles environmental understanding and reasoning while traditional modules retain decision-making control for safety [1].
- Companies such as Tesla, Geely, and XPeng are exploring VLA, aiming for AI to learn all driving skills through large-scale data training and make decisions end to end [1].

Sensor and Algorithm Developments
- The article traces the evolution of perception technologies: BEV (Bird's-Eye View) perception became mainstream by 2022, and OCC (Occupancy) perception gained traction in 2023 [3][5].
- BEV integrates data from multiple sensors into a unified spatial representation, which supports path planning and the fusion of dynamic information [8][14].
- OCC perception provides detailed occupancy data, estimating the probability that a region of space is occupied over time, which improves the modeling of dynamic interactions [6][14].

Modular and End-to-End Systems
- Before multimodal large models and end-to-end autonomous driving, perception and prediction tasks were typically handled by separate modules [5].
- The article outlines a phased, modular approach in which perception, prediction, decision-making, and control are distinct yet interconnected [4][31].
- End-to-end systems aim to streamline this pipeline by mapping raw sensor inputs directly to actionable outputs, improving efficiency and reducing bottlenecks [20][25].

VLA and VLM Frameworks
- The article discusses the VLA (Vision-Language-Action) and VLM (Vision-Language Model) frameworks, with VLA focusing on understanding complex scenes and making autonomous decisions from visual and language inputs [32][39].
- It emphasizes the role of language models in improving the interpretability and safety of autonomous driving systems, enabling better cross-scenario knowledge transfer and decision-making [57].

Future Directions
- The article highlights the competition between VLA and WA (World Action) architectures, with WA emphasizing direct visual-to-action mapping without language mediation [55][56].
- It suggests that the future of autonomous driving will involve integrating world models that understand physical laws and temporal dynamics, addressing the limitations of current language models [34][54].
Some Takeaways After Talking with an Autonomous Driving PhD from a Hong Kong University......
自动驾驶之心· 2025-11-20 00:05
Core Viewpoint
- The article emphasizes the importance of building a comprehensive community for autonomous driving that provides resources, networking opportunities, and guidance for both newcomers and experienced professionals [6][16][19].

Group 1: Community and Networking
- The "Autonomous Driving Heart Knowledge Planet" community aims to be a platform for technical exchange and collaboration among members from well-known universities and leading companies in the autonomous driving sector [16][19].
- The community has grown to over 4,000 members and aims to reach nearly 10,000 within two years, hosting discussions on technology trends and industry developments [6][7].
- Members can freely ask questions about career choices and research directions and receive answers from industry experts [89][92].

Group 2: Learning Resources
- The community offers learning materials including video tutorials, technical roadmaps, and Q&A sessions covering over 40 technical directions in autonomous driving [9][11][16].
- Dedicated learning paths are provided for newcomers, from foundational courses to advanced topics such as end-to-end driving, multi-sensor fusion, and 3D object detection [11][17][36].
- The community has compiled a list of open-source projects and datasets relevant to autonomous driving to support members' research and development [32][34][36].

Group 3: Career Development
- The community facilitates job referrals and connections with autonomous driving companies, improving members' employment opportunities [11][19].
- Regular discussions with industry leaders cover career paths, job openings, and the latest trends in the field [8][19][92].
- Members are encouraged to take part in research collaborations and internships, particularly those pursuing advanced degrees in related fields [3][6][16].
Beyond Imitation Learning, How Can End-to-End Trajectories Be Optimized? A Leaderboard-Topping Work from QCraft......
自动驾驶之心· 2025-11-10 03:36
Core Insights
- The article discusses the development of CATG, a new trajectory generation framework based on flow matching, which addresses limitations in existing end-to-end autonomous driving systems [1][4][22]
- CATG achieved a score of 51.31 in the NAVSIM V2 challenge, demonstrating its effectiveness in trajectory planning and its robustness to out-of-distribution data [4][22]

Background Review
- End-to-end multimodal planning has become a key method in autonomous driving, significantly improving robustness and adaptability compared with single-trajectory prediction methods [3]
- Current multimodal methods often rely on imitation learning, which leads to a lack of behavioral diversity because real trajectories contain too little strategy diversity [3][6]
- Various alternative strategies have been proposed to capture a broader distribution of reasonable trajectories, but many still struggle to integrate safety constraints directly into the generation process [3][6]

Proposed Framework
- CATG completely abandons imitation learning and supports the flexible injection of explicit constraints during the generation process [4][22]
- The framework integrates feasibility and safety constraints into the generation process through a progressive mechanism, using prior perception anchor points [7][22]
- CATG allows a controllable trade-off between aggressive and conservative driving styles by using environmental reward signals as conditional inputs [7][13]

Experimental Results
- CATG was extensively evaluated in the NAVSIM V2 challenge, showing superior planning accuracy and robust generalization [4][14]
- Training had two phases: the first trained the flow matching process, and the second fine-tuned the energy matching process [18][22]
- The results indicated high compliance across metrics, including 100% drivable area compliance and 98.21% no-at-fault collisions in stage one [19]

Limitations
- The computational cost of generating trajectories through 100-step sampling remains high, and accelerating the sampling process may compromise trajectory quality [21]

Conclusion
- The article concludes that CATG represents a significant advance in end-to-end planning for autonomous driving, incorporating flexible conditional signals and explicit constraints during trajectory generation [22]
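The trade-off between sampling steps and quality mentioned under Limitations can be illustrated with a toy ODE. The sketch below is not CATG's sampler: the field `velocity(x, t) = x` is an assumption chosen only because its exact solution at t = 1 is known (x0 times e), which makes the discretization error of Euler integration easy to measure against the number of velocity evaluations.

```python
import math


def euler_integrate(velocity, x0, steps):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps.

    Returns the final state and the number of velocity evaluations, which is
    the dominant cost when the velocity is a neural network.
    """
    x, dt, calls = x0, 1.0 / steps, 0
    for k in range(steps):
        x += dt * velocity(x, k * dt)
        calls += 1
    return x, calls


def velocity(x, t):
    # Toy field (assumed for illustration); the exact solution is x(1) = x0 * e.
    return x


def error_table(step_counts=(5, 20, 100)):
    rows = []
    for steps in step_counts:
        x1, calls = euler_integrate(velocity, 1.0, steps)
        rows.append((steps, calls, abs(x1 - math.e)))
    return rows


if __name__ == "__main__":
    for steps, calls, err in error_table():
        print(f"steps={steps:4d} evaluations={calls:4d} abs_error={err:.5f}")
```

Cost grows linearly with the step count while the error shrinks, so cutting steps to speed up sampling buys latency at the price of accuracy unless a better integrator or a straighter learned flow compensates.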
Two of the "Three Top Chinese AI Venues" Have Now Reported on Li Auto's Recent AI Progress
理想TOP2· 2025-11-09 14:59
Core Insights
- The article discusses the rising prominence of Li Auto in the autonomous driving sector, particularly the work it presented at the ICCV 2025 conference, where it introduced a new paradigm for autonomous driving that integrates world models with reinforcement learning [1][2][4].

Group 1: Company Developments
- Li Auto's research and development in autonomous driving began in 2021, evolving from initial BEV solutions to more advanced systems [5].
- The company has invested heavily in AI, allocating nearly half of its R&D budget to this area, which indicates a strong commitment to integrating AI into its vehicle technology [2].
- Li Auto's presentation at ICCV 2025 highlighted an approach that uses synthetic data to cover rare scenarios, leading to a notable improvement in miles per intervention (MPI), the distance driven between human takeovers [2][4].

Group 2: Industry Reception
- The reception of Li Auto's work has been overwhelmingly positive, with many industry observers praising its research and development efforts and describing it as a model for Chinese automotive companies [2][4].
- Articles from major Chinese AI platforms such as Quantum Bit and Machine Heart have drawn significant attention, with one article exceeding 39,000 reads, reflecting growing interest in Li Auto's developments [1][2].

Group 3: Competitive Landscape
- Li Auto is recognized as a leading player in the Chinese autonomous driving space, with a notable presence in discussions of AI and autonomous vehicle technology [22].
- The company aims to position itself not just as an automotive manufacturer but as a competitive AI company, aligning its goals with broader AI advances and the five stages of AI development as defined by OpenAI [18][19].
Horizon Robotics' ResAD: Residual Learning Brings Autonomous Driving Decisions Closer to Human Logic
自动驾驶之心· 2025-11-07 16:04
Core Insights
- The article discusses the limitations of traditional modular approaches in autonomous driving and introduces the ResAD framework, an end-to-end model that aims to improve efficiency and safety by learning the necessary adjustments to a baseline trajectory [2][50].

Group 1: Framework Overview
- ResAD shifts from directly predicting future trajectories to learning the necessary adjustments relative to a physical baseline trajectory, termed the "inertial reference line" [2][50].
- The model focuses on the reasons for trajectory adjustments, such as obstacles and traffic rules, rather than memorizing correlations in the data [50].

Group 2: Methodology
- ResAD uses "normalized residual trajectory modeling", which simplifies the learning problem by defining trajectory predictions as adjustments to a reference line [11][50].
- It applies "point-wise residual normalization" to balance the optimization weights of near and far trajectory points, so that critical adjustments are not overlooked [20][50].

Group 3: Testing and Results
- Real-world testing demonstrated the effectiveness of ResAD, showing that it handles complex driving scenarios and responds intelligently to dynamic obstacles [6].
- In benchmark evaluations, ResAD achieved state-of-the-art performance on NAVSIM v1 and v2, with a PDMS of 88.6 and an EPDMS of 85.5 respectively, indicating high safety and efficiency in route completion [38][39].

Group 4: Comparative Analysis
- ResAD outperformed existing models such as DiffusionDrive on several metrics, including lane adherence and route completion efficiency, highlighting its trajectory generation capability [41][39].
- The article emphasizes the importance of ResAD's trajectory modeling strategy, which generates contextually relevant and diverse trajectories without relying on a static trajectory library [10][41].
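The residual formulation described in the summary can be sketched in a few lines. This is an illustrative reading, not ResAD's implementation: the constant-velocity extrapolation used for the reference line and the per-waypoint `scales` (imagined here as dataset statistics that grow with the horizon) are both assumptions.

```python
def inertial_reference(pos, vel, horizon, dt):
    """Constant-velocity extrapolation of the ego state (assumed reference line)."""
    return [(pos[0] + vel[0] * dt * k, pos[1] + vel[1] * dt * k)
            for k in range(1, horizon + 1)]


def residuals(traj, ref):
    """Per-waypoint offset of a trajectory from the reference line."""
    return [(tx - rx, ty - ry) for (tx, ty), (rx, ry) in zip(traj, ref)]


def normalize(res, scales):
    """Point-wise normalization: divide each waypoint's residual by its own scale,
    so far waypoints (large raw residuals) do not dominate near ones."""
    return [(dx / s, dy / s) for (dx, dy), s in zip(res, scales)]


def reconstruct(ref, norm_res, scales):
    """Invert the normalization and add the reference line back."""
    return [(rx + nx * s, ry + ny * s)
            for (rx, ry), (nx, ny), s in zip(ref, norm_res, scales)]


if __name__ == "__main__":
    ref = inertial_reference(pos=(0.0, 0.0), vel=(2.0, 0.0), horizon=3, dt=0.5)
    target = [(1.0, 0.1), (2.0, 0.4), (3.0, 0.9)]  # gentle lateral drift
    scales = [0.1, 0.4, 0.9]                       # hypothetical per-waypoint scales
    norm = normalize(residuals(target, ref), scales)
    print(norm)
    print(reconstruct(ref, norm, scales))
```

In this setup a model would predict the normalized residuals, and `reconstruct` maps them back to a drivable trajectory; a trajectory that simply continues at constant velocity has an all-zero residual.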