Autonomous Driving
A Brand-New End-to-End Paradigm! Fudan's VeteranAD: "Perception as Planning" Sets New Open- and Closed-Loop SOTA, Surpassing DiffusionDrive
自动驾驶之心· 2025-08-21 23:34
Today 自动驾驶之心 shares the latest work from Fudan University and the Shanghai Innovation Institute: VeteranAD, a brand-new end-to-end paradigm that moves from "perception, then planning" to "perception as planning."

Paper authors | Bozhou Zhang et al. Editor | 自动驾驶之心

End-to-end autonomous driving has made remarkable progress in recent years. It unifies multiple tasks into a single framework in order to avoid the information loss caused by multi-stage pipelines. In this way, an end-to-end driving framework also constitutes a fully differentiable learning system capable of planning-oriented optimization. This design has delivered strong results on both open-loop and closed-loop planning tasks.

Mainstream end-to-end methods typically adopt a sequential paradigm: perception first, then planning, as shown in Figure 1(a). A common practice is to introduce a Transformer architecture so that the whole pipeline remains differentiable. However, differentiability alone is not enough to fully exploit the advantages of end-to-end planning optimization. ...
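To make the contrast with the sequential "perception, then planning" pipeline concrete, the sketch below shows a planning head in which learnable planning queries attend directly to raw BEV features, so perception happens implicitly in service of planning rather than as a separate stage. All module names, dimensions, and the single-attention design are hypothetical illustrations, not VeteranAD's actual architecture.

```python
import torch
import torch.nn as nn

class PerceptionAsPlanning(nn.Module):
    """Hypothetical 'perception as planning' head: planning queries attend
    directly to BEV sensor features instead of consuming the outputs of a
    separate perception stage. Names and sizes are illustrative only."""

    def __init__(self, num_modes=6, horizon=8, d_model=256):
        super().__init__()
        # one learnable query per candidate plan (trajectory mode)
        self.plan_queries = nn.Parameter(torch.randn(num_modes, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.traj_head = nn.Linear(d_model, horizon * 2)  # (x, y) per step
        self.score_head = nn.Linear(d_model, 1)           # mode confidence
        self.horizon = horizon

    def forward(self, bev_feats):  # bev_feats: (B, H*W, d_model)
        B = bev_feats.size(0)
        q = self.plan_queries.unsqueeze(0).expand(B, -1, -1)
        # perception is carried out inside planning: queries read the scene
        q, _ = self.attn(q, bev_feats, bev_feats)
        trajs = self.traj_head(q).view(B, -1, self.horizon, 2)
        scores = self.score_head(q).squeeze(-1)
        return trajs, scores

model = PerceptionAsPlanning()
trajs, scores = model(torch.randn(2, 100, 256))
print(trajs.shape, scores.shape)  # torch.Size([2, 6, 8, 2]) torch.Size([2, 6])
```

Because gradients from a planning loss flow straight into the attention over sensor features, the whole system optimizes perception only insofar as it helps planning, which is the point the article makes about planning-oriented optimization.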
WeRide Unveils WePilot AiDrive, A One-Stage End-to-End ADAS Targeted for Mass Production in 2025
Globenewswire· 2025-08-21 09:00
Core Insights
- WeRide has launched WePilot AiDrive, a one-stage end-to-end ADAS solution, in collaboration with Bosch, marking a significant advancement in autonomous driving technology [1][5]
- The new system integrates sensing and decision-making into a single architecture, enhancing response times and operational efficiency [2][5]

Product Features
- WePilot AiDrive is designed to handle complex driving scenarios, including lane changes in heavy traffic, detours around construction, and interactions with pedestrians [4]
- The system offers three main advantages: scalable computing power, adaptability across sensor setups, and rapid daily iteration using extensive driving data [4]

Market Position
- WeRide is recognized as a leader in the autonomous driving industry, being the first publicly traded Robotaxi company and having tested vehicles in over 30 cities across 10 countries [6]
- The company has received autonomous driving permits in six markets, including China and the US, showcasing its regulatory compliance and market reach [6]
VisionTrap: VLM+LLM Teach the Model to Exploit Visual Features for Better Trajectory Prediction
自动驾驶之心· 2025-08-20 23:33
Author | Sakura  Editor | 自动驾驶之心  Original link: https://zhuanlan.zhihu.com/p/716867464

This article is shared for academic purposes only; please contact us for removal in case of infringement.

VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions (ECCV 2024, with an open-source dataset)

In this work, we propose a new method that additionally incorporates visual input from surround-view cameras, enabling the model to exploit visual cues such as human gaze and gestures, road conditions, and vehicle turn signals, cues that are typically hidden from the model in existing methods. In addition, we use textual descriptions generated by a vision-language model (VLM) and refined by a large language model (LLM) as supervision during training, guiding the model to learn features from the input data. Despite these additional inputs, our method achieves a latency of 53 ms, making it suitable for real-time processing and much faster than previous single-agent prediction methods of comparable performance. Our experiments show that both the visual inputs and the textual descriptions contribute to improving ...
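The training-time text supervision described above can be sketched as a CLIP-style alignment between per-agent features and embeddings of the VLM/LLM-generated descriptions. The contrastive form, temperature, and all names below are assumptions for illustration; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def text_supervision_loss(agent_feats, text_embeds, temperature=0.07):
    """Illustrative sketch: pull each agent's predictor feature toward the
    embedding of its own textual description (contrastive alignment).
    This is an assumed formulation, not VisionTrap's published loss."""
    a = F.normalize(agent_feats, dim=-1)   # (N, D) features from the predictor
    t = F.normalize(text_embeds, dim=-1)   # (N, D) frozen text embeddings
    logits = a @ t.T / temperature         # similarity of every agent/text pair
    labels = torch.arange(a.size(0))       # i-th agent matches i-th description
    return F.cross_entropy(logits, labels)

loss = text_supervision_loss(torch.randn(4, 16), torch.randn(4, 16))
```

Because the text branch is used only as a training signal, it can be dropped at inference time, which is consistent with the low 53 ms latency the article reports.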
VLM or VLA? Development Trends of Multimodal Large Models for Autonomous Driving, as Seen in Existing Work
自动驾驶之心· 2025-08-20 23:33
In recent years, foundation models represented by LLMs, VLMs, and VLAs have played an increasingly important role in autonomous driving decision-making, attracting growing attention from both academia and industry. Many readers have asked whether a systematic taxonomy exists. This article organizes the foundation models for decision-making by model category; follow-up posts will further survey the related algorithms. Everyone is welcome to study and discuss together.

LLM-based methods

LLM-based methods mainly exploit the reasoning ability of large models to describe autonomous driving. They belong to the early stage of combining autonomous driving with large models, but remain worth studying.

Distilling Multi-modal Large Language Models for Autonomous Driving
LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain ...
New Achievements on Red Soil | Jin-Cha-Ji Anti-Japanese Base Area · Yangquan, Shanxi: Digital Empowerment Transforms the "Coal City" into a "Smart Digital City"
Yang Shi Wang· 2025-08-20 03:49
Group 1
- Yangquan City, located in Shanxi Province, has transformed from a coal-centric economy to a digital and intelligent mining hub, with 95.84% of its coal production now coming from advanced capacity [2][3]
- The city has established 12 smart mines, utilizing 5G technology to enhance operational efficiency, resulting in a 50% reduction in underground personnel and a 50% increase in efficiency [3]
- Yangquan has become the first city in China to fully open up for autonomous driving, implementing smart traffic management systems that have reduced average vehicle delay rates by 45% and parking frequency by 70% [5]

Group 2
- The local government has prioritized the development of the digital economy, establishing platforms such as the China Electric Digital Economy Industrial Park and "Jinchuan Valley·Yangquan," which have accelerated the growth of industries like smart terminals, data security, and big data [7]
- In 2024, the core revenue of Yangquan's digital economy is projected to grow by 13.3%, and the city has been recognized as one of the "Top 100 New Smart Cities in China" for 2023-2024 [7]
This Week's Selected Autonomous Driving Papers! End-to-End, VLA, Perception, Decision-Making, and More
自动驾驶之心· 2025-08-20 03:28
Core Viewpoint
- The article emphasizes the recent advancements in autonomous driving research, highlighting various innovative approaches and frameworks that enhance the capabilities of autonomous systems in dynamic environments [2][4].

Group 1: End-to-End Autonomous Driving
- Several notable papers focus on end-to-end autonomous driving, including GMF-Drive, ME³-BEV, SpaRC-AD, IRL-VLA, and EvaDrive, which utilize advanced techniques such as gated fusion, deep reinforcement learning, and evolutionary adversarial strategies [8][10].

Group 2: Perception and VLM
- The VISTA paper introduces a vision-language model for predicting driver attention in dynamic environments, showcasing the integration of visual and language processing for improved situational awareness [7].
- Safety-critical perception technologies are also covered, such as the progressive BEV perception survey and the CBDES MoE model for functional module decoupling [10].

Group 3: Simulation Testing
- The ReconDreamer-RL framework enhances reinforcement learning through diffusion-based scene reconstruction, indicating a trend toward more sophisticated simulation testing methodologies [11].

Group 4: Datasets
- The STRIDE-QA dataset is introduced as a large-scale visual question answering resource aimed at spatiotemporal reasoning in urban driving scenarios, reflecting the growing need for comprehensive datasets in autonomous driving research [12].
Now That Everyone Is Doing End-to-End, Is There Still a Path Forward for Trajectory Prediction?
自动驾驶之心· 2025-08-19 03:35
Core Viewpoint
- The article emphasizes the importance of trajectory prediction in the context of autonomous driving and highlights the ongoing relevance of traditional two-stage and modular methods despite the rise of end-to-end approaches. It discusses the integration of trajectory prediction models with perception models as a form of end-to-end training, indicating a significant area of research and application in the industry [1][2].

Group 1: Trajectory Prediction Methods
- The article introduces multi-agent trajectory prediction, which forecasts future movements from the historical trajectories of multiple interacting agents; this is crucial for autonomous driving, intelligent monitoring, and robotic navigation [1].
- It discusses the challenges of predicting human behavior due to its uncertainty and multimodality, noting that traditional methods often rely on recurrent neural networks, convolutional networks, or graph neural networks for social interaction modeling [1].
- It highlights advances in diffusion models for trajectory prediction, showcasing models like the Leapfrog Diffusion Model (LED) and Mixed Gaussian Flow (MGF) that have significantly improved accuracy and efficiency across various datasets [2].

Group 2: Course Objectives and Structure
- The course aims to provide a systematic understanding of trajectory prediction and diffusion models, helping participants integrate theoretical knowledge with practical coding skills, ultimately leading to the development of new models and research papers [6][8].
- It is designed for individuals at various academic levels who are interested in trajectory prediction and autonomous driving, offering insights into cutting-edge research and algorithm design [8].
- Participants will gain access to classic and cutting-edge papers, coding implementations, and methodologies for writing and submitting research papers [8][9].

Group 3: Course Highlights and Requirements
- The course features a "2+1" teaching model with experienced instructors and dedicated support staff to enhance the learning experience [16][17].
- It requires participants to have a foundational understanding of deep learning and proficiency in Python and PyTorch, ensuring they can engage with the course material effectively [10].
- The course structure includes a comprehensive curriculum covering datasets, baseline code, and essential research papers, facilitating a thorough understanding of trajectory prediction techniques [20][21][23].
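The diffusion-based predictors highlighted above (LED, MGF) build on a standard denoising objective: corrupt the future trajectory with noise at a random timestep, then train a network to recover that noise conditioned on the observed history. The sketch below illustrates one such training step; the MLP denoiser, the toy linear noise schedule, and all dimensions are assumptions for illustration, not the published architectures.

```python
import torch
import torch.nn as nn

class TrajectoryDenoiser(nn.Module):
    """Minimal diffusion-style trajectory predictor: predicts the noise
    added to the future trajectory, conditioned on the agent's history.
    The MLP and sizes are illustrative, not LED's or MGF's networks."""
    def __init__(self, hist_len=8, fut_len=12, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hist_len * 2 + fut_len * 2 + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, fut_len * 2),
        )
    def forward(self, history, noisy_future, t):
        # history: (B, hist_len, 2), noisy_future: (B, fut_len, 2), t: (B, 1)
        x = torch.cat([history.flatten(1), noisy_future.flatten(1), t], dim=1)
        return self.net(x).view_as(noisy_future)

def diffusion_loss(model, history, future, T=100):
    """One training step: add noise at a random timestep, regress the noise."""
    t = torch.randint(1, T, (history.size(0), 1)).float() / T
    noise = torch.randn_like(future)
    alpha = (1 - t).view(-1, 1, 1)  # toy linear schedule, for illustration
    noisy = alpha.sqrt() * future + (1 - alpha).sqrt() * noise
    return ((model(history, noisy, t) - noise) ** 2).mean()

model = TrajectoryDenoiser()
loss = diffusion_loss(model, torch.randn(3, 8, 2), torch.randn(3, 12, 2))
```

At inference, the reverse process starts from pure noise and denoises step by step; LED's contribution, per the article, is skipping most of those steps to cut latency, while MGF changes the initial distribution to a Gaussian mixture for better multimodality.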
The Autonomous Driving Fall Recruitment Group Has Been Created!
自动驾驶之心· 2025-08-18 23:32
Core Viewpoint - The article emphasizes the convergence of autonomous driving technology, indicating a shift from numerous diverse approaches to a more unified model, which raises the technical barriers in the industry [1] Group 1 - The industry is witnessing a trend where previously many directions requiring algorithm engineers are now consolidating into unified models such as one model, VLM, and VLA [1] - The article encourages the establishment of a large community to support individuals in the industry, highlighting the limitations of individual efforts [1] - A new job and industry-related community is being launched to facilitate discussions on industry trends, company developments, product research, and job opportunities [1]
Performance Up 4%! CBDES MoE: Mixture-of-Experts Brings a Second Spring to BEV Perception, Straight to SOTA (Tsinghua & Imperial College)
自动驾驶之心· 2025-08-18 23:32
Core Viewpoint
- The article discusses the CBDES MoE framework, a novel modular mixture-of-experts architecture designed for BEV perception in autonomous driving, addressing challenges in adaptability, modeling capacity, and generalization in existing methods [2][5][48].

Group 1: Introduction and Background
- The rapid development of autonomous driving technology has made 3D perception essential for building safe and reliable driving systems [5].
- Existing solutions often use fixed single-backbone feature extractors, limiting adaptability to diverse driving environments [5][6].
- The MoE paradigm offers a new solution by enabling dynamic expert selection based on learned routing mechanisms, balancing computational efficiency and representational richness [6][9].

Group 2: CBDES MoE Framework
- CBDES MoE integrates multiple structurally heterogeneous expert networks and employs a lightweight self-attention router (SAR) for dynamic expert path selection [3][12].
- The framework includes a multi-stage heterogeneous backbone design pool, enhancing scene adaptability and feature representation [14][17].
- The architecture allows for efficient, adaptive, and scalable 3D perception, outperforming strong single-backbone baseline models in complex driving scenarios [12][14].

Group 3: Experimental Results
- On the nuScenes dataset, CBDES MoE achieved a mean Average Precision (mAP) of 65.6 and a NuScenes Detection Score (NDS) of 69.8, surpassing all single-expert baselines [37][39].
- The model demonstrated faster convergence and lower loss throughout training, indicating higher optimization stability and learning efficiency [39][40].
- Load-balancing regularization significantly improved performance, raising mAP from 63.4 to 65.6 [42][46].

Group 4: Future Work and Limitations
- Future research may explore patch-wise or region-aware routing for finer-grained adaptability, as well as extending the method to multi-task scenarios [48].
- The current routing mechanism operates at the image level, which may limit its effectiveness in more complex environments [48].
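The dynamic expert selection and load-balancing regularization described above can be sketched as below: an attention layer summarizes the input, a gate produces per-expert weights, and a KL-to-uniform term penalizes routing that overuses a few experts. The gating form, the balancing term, and all dimensions are assumptions for illustration, not the paper's exact SAR design.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionRouter(nn.Module):
    """Sketch of attention-based MoE routing with load balancing, in the
    spirit of CBDES MoE's self-attention router (SAR). The specific gating
    and regularizer here are illustrative assumptions."""
    def __init__(self, d_model=64, num_experts=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, tokens):  # tokens: (B, N, d_model)
        ctx, _ = self.attn(tokens, tokens, tokens)           # summarize input
        weights = F.softmax(self.gate(ctx.mean(dim=1)), dim=-1)  # (B, E)
        # load-balancing regularizer: KL(mean usage || uniform), >= 0,
        # pushes the router to spread traffic across all experts
        usage = weights.mean(dim=0)
        balance_loss = (usage * usage.log()).sum() + math.log(weights.size(1))
        return weights, balance_loss

router = SelfAttentionRouter()
weights, balance_loss = router(torch.randn(5, 10, 64))
```

The per-sample `weights` would then mix the outputs of the heterogeneous expert backbones; adding `balance_loss` (scaled by a small coefficient) to the detection loss plays the role of the load-balancing regularization that the article credits with lifting mAP from 63.4 to 65.6.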
Pony.ai Attracts Premium Capital as Funds Chase the Next Tech Transformation
Prnewswire· 2025-08-18 13:53
Core Insights
- Leading investment management firms, including ARK Invest, have invested significantly in Pony.ai, marking notable interest in the Chinese autonomous driving sector [1][2]
- Pony.ai has reported substantial growth in robotaxi revenues and is on a clear path to profitability, attracting attention from major institutional investors [4][8]

Investment Activity
- ARK Invest invested approximately US$12.9 million in Pony.ai, its first investment in a Chinese firm focused on Level 4 autonomous driving technology [1]
- At least 14 major global institutional investors backed Pony.ai in Q2, including Baillie Gifford and Nikko Asset Management, despite a general trend of U.S. investors moving away from Chinese assets [2]

Market Potential
- ARK's "Big Ideas 2025" report projects the ride-hailing market could reach US$10 trillion by 2030, with global robotaxi fleets potentially hitting around 50 million vehicles [3]
- UBS analysts expect the robotaxi market value to reach US$183 billion in China and US$394 billion internationally by the late 2030s [9]

Company Performance
- Pony.ai reported a 158% year-on-year increase in robotaxi revenues in Q2, driven by the production of its seventh-generation robotaxi models [4]
- The company aims to scale its fleet to 1,000 robotaxis by year-end, which is expected to achieve positive unit economics [5]

Operational Efficiency
- The Gen-7 vehicle has a 70% lower cost than its predecessor, with significant reductions in operational costs, including an 18% decrease in insurance costs [5]
- Pony.ai has received commercial permits for fare-charging services in Shanghai and operates 24/7 in Guangzhou and Shenzhen [6][7]

Analyst Sentiment
- Following the Q2 earnings release, major institutions like Goldman Sachs and UBS rated Pony.ai's stock as "buy," with Goldman setting a price target of US$24.5, indicating a 54.5% upside [8]