End-to-End Autonomous Driving

What do the 2025 top-conference papers tell us about upcoming research hotspots?
自动驾驶之心· 2025-07-06 08:44
Core Insights - The article highlights the key research directions in computer vision and autonomous driving presented at the major conferences CVPR and ICCV, focusing on four areas: general computer vision, autonomous driving, embodied intelligence, and 3D vision [2][3]. Group 1: Research Directions - In computer vision and image processing, the main research topics include diffusion models, image quality assessment, semi-supervised learning, zero-shot learning, and open-world detection [3]. - Autonomous driving research concentrates on end-to-end systems, closed-loop simulation, 3D Gaussian Splatting (3DGS), multimodal large models, diffusion models, world models, and trajectory prediction [3]. - Embodied intelligence focuses on vision-language-action (VLA) models, zero-shot learning, robotic manipulation, end-to-end systems, sim-to-real transfer, and dexterous grasping [3]. - The 3D vision domain emphasizes point cloud completion, single-view reconstruction, 3D Gaussian Splatting (3DGS), 3D matching, video compression, and Neural Radiance Fields (NeRF) [3]. Group 2: Research Support and Collaboration - The article offers support for various research needs in autonomous driving, including large models, VLA, end-to-end autonomous driving, 3DGS, BEV perception, target tracking, and multi-sensor fusion [4]. - In embodied intelligence, support is provided for VLA, vision-language navigation, end-to-end systems, reinforcement learning, diffusion policy, sim-to-real, embodied interaction, and robotic decision-making [4]. - For 3D vision, the focus is on point cloud processing, 3DGS, and SLAM [4]. - General computer vision support includes diffusion models, image quality assessment, semi-supervised learning, and zero-shot learning [4].
I had decided to move into embodied intelligence, but now I'm having second thoughts...
自动驾驶之心· 2025-07-05 09:12
Core Insights - The article discusses the evolving landscape of embodied intelligence, highlighting its shift from a period of hype to a more measured outlook: the technology is maturing but has not yet reached the stage of delivering real productivity [2]. Group 1: Industry Trends - Embodied intelligence has gained significant attention over the past few years, but the industry now recognizes that it is still in the early stages of development [2]. - There is a growing demand for skills in multi-sensor fusion and robotics, particularly in areas like SLAM and ROS, which are crucial for working in embodied intelligence [3][4]. - Many companies in the robotics sector are developing rapidly, with numerous startups receiving substantial funding, indicating a positive outlook for the industry in the coming years [3][4]. Group 2: Job Market and Skills Development - The job market for algorithm positions is competitive, with a focus on cutting-edge technologies such as end-to-end models, VLA, and reinforcement learning [3]. - Candidates with a background in robotics and a solid understanding of the latest technologies are likely to find opportunities, especially as traditional robotics remains a primary product line [4]. - The article encourages individuals to strengthen their technical skills in robotics and embodied intelligence to remain competitive in the job market [3][4]. Group 3: Community and Resources - The article promotes a community platform that offers resources for learning about autonomous driving and embodied intelligence, including video courses and job postings [5]. - The community aims to gather a large number of professionals and students interested in smart driving and embodied intelligence, fostering collaboration and knowledge sharing [5]. - The platform provides access to the latest industry trends, technical discussions, and job opportunities, making it a useful resource for those looking to enter or advance in the field [5].
How do you find a job in traditional planning and control this year?
自动驾驶之心· 2025-07-02 13:54
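Planning and control in autonomous driving is no longer just a rule-based fallback role. With end-to-end and VLA systems heading toward mass production, the room left for traditional planning and control is slowly being eroded... Many readers also report that interviews now put more weight on combining rule-based algorithms with end-to-end methods; neither can be missing. Below is a recent and highly representative question from a reader planning a career change: "I want to move into an autonomous driving planning and control role. During my master's I worked on UAV path planning with particle swarm optimization, and the work I have done since graduating has been fairly traditional. After three years on the job I still want to switch to the autonomous driving industry. I have given myself 3-4 months to prepare: I have gone through a motion planning course, and I am working through C++ Primer and LeetCode. Beyond that, could those already working in autonomous driving planning and control advise me on what else I need to learn and what to focus on? Many thanks!" 自动驾驶之心 invited planning and control algorithm expert Ning Yuan to answer: This is a very good question with real practical significance. In my day-to-day work, rule-based algorithms remain very important as a fallback. But frankly, these are now baseline knowledge for practitioners. Decision and planning frameworks such as joint lateral-longitudinal and decoupled lateral-longitudinal planning, along with basic planning algorithms (search-based, sampling-based, kinematics-based, and so on), are already standard interview requirements, and being asked about them in an interview and failing to answer would be very ...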
No need to rate Li Auto highly for getting into ICCV: what is impressive is Li Auto's work, not ICCV
理想TOP2· 2025-06-29 15:06
Core Viewpoint - The article discusses the unique characteristics of the AI academic community compared to other disciplines, highlighting the rapid growth and the implications for the quality and significance of research papers submitted to top conferences [5][7][8]. Group 1: Characteristics of AI Academic Community - AI conferences are more important than journals due to the fast-paced development of AI, which makes the lengthy journal review process inadequate [5]. - The number of submissions and acceptances to top AI conferences has significantly increased over the past decade, with acceptance rates declining, indicating a surge in competition [5][7]. - The rapid increase in submissions has led to a shortage of qualified reviewers, resulting in a decline in the quality of accepted papers [8]. Group 2: Implications for Research Quality - The increase in accepted papers does not guarantee high-quality research, as many accepted papers may lack substantial contributions [8]. - The job market for AI researchers is becoming increasingly competitive, with the demand for high-quality publications rising faster than the availability of quality positions [8]. Group 3: Company-Specific Insights - Li Auto's recent achievement of having multiple papers accepted at ICCV is used as a promotional tool to showcase its advancements in assisted driving technology [9]. - The original innovation level of Li Auto's VLA is compared to DeepSeek's MoE level, indicating that few Chinese companies can achieve such a high level of innovation [11][12]. - Li Auto's approach to autonomous driving has evolved from following Tesla to developing its unique systems, particularly in the integration of fast and slow systems in its VLM [12][13].
Huawei Car BU is hiring (end-to-end / perception models / model optimization, etc.)! Plenty of openings
自动驾驶之心· 2025-06-24 07:21
Core Viewpoint - The article emphasizes the rapid evolution and commercialization of autonomous driving technologies, highlighting the importance of community engagement and knowledge sharing in this field [9][14][19]. Group 1: Job Opportunities and Community Engagement - Huawei is actively recruiting for various positions in its autonomous driving division, including roles focused on end-to-end model algorithms, perception models, and efficiency optimization [1][2]. - The "Autonomous Driving Heart Knowledge Planet" serves as a platform for technical exchange, targeting students and professionals in the autonomous driving and AI sectors, and has established connections with numerous industry companies for job referrals [7][14][15]. Group 2: Technological Trends and Future Directions - The article outlines that by 2025, the focus will be on advanced technologies such as visual large language models (VLM), end-to-end trajectory prediction, and 3D generative simulations, indicating a shift towards more integrated and intelligent systems in autonomous driving [9][22]. - The community has developed over 30 learning pathways covering various subfields of autonomous driving, including perception, mapping, and AI model deployment, which are crucial for industry professionals [19][21]. Group 3: Educational Resources and Content - The knowledge platform offers exclusive rights to members, including access to academic advancements, professional Q&A sessions, and discounts on courses, fostering a comprehensive learning environment [17][19]. - Regular webinars featuring experts from top conferences and companies are organized to discuss practical applications and research in autonomous driving, enhancing the learning experience for participants [21][22].
End-to-end series! SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
自动驾驶之心· 2025-06-23 11:34
Core Viewpoint - The article discusses the limitations of existing end-to-end methods in autonomous driving, particularly the computational cost of BEV-based paradigms and the inefficiency of running prediction and planning sequentially. It presents a sparse paradigm in which prediction and planning run in parallel [2][5]. Group 1: SparseDrive Methodology - SparseDrive adopts the core ideas of Horizon's earlier Sparse series, focusing on sparse scene representation for autonomous driving [3]. - The method exploits the similarity between motion prediction and planning, and introduces a hierarchical planning selection strategy [5]. - The architecture includes symmetric sparse perception and a parallel motion planner [5]. Group 2: Training and Performance - The training loss for SparseDrive is defined as a combination of detection, mapping, motion, planning, and depth losses [9]. - Performance comparisons show that SparseDrive-S achieves a mean Average Precision (mAP) of 0.418, while SparseDrive-B reaches 0.496, outperforming other methods like UniAD [11]. - In motion prediction and planning, SparseDrive-S and SparseDrive-B show significant improvements in metrics such as minADE and minFDE compared to traditional methods [18]. Group 3: Efficiency Comparison - SparseDrive exhibits superior training and inference efficiency, requiring only 15.2 GB of GPU memory and achieving 9.0 FPS during inference, compared to UniAD's 50.0 GB and 1.8 FPS [20]. - The method's reduced computational requirements make it more accessible for real-time applications in autonomous driving [20]. Group 4: Course and Learning Opportunities - The article promotes a course focused on end-to-end autonomous driving algorithms, covering foundational knowledge, practical implementations, and various algorithmic approaches [29][41]. - The course aims to equip participants with the skills necessary to understand and implement end-to-end solutions in the autonomous driving industry [54][56].
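The summary above cites minADE and minFDE without defining them. As a rough illustration only, the sketch below computes both for a set of candidate trajectories against one ground-truth trajectory. It assumes 2D waypoints, equal-length trajectories, and that each minimum is taken independently over the candidates; benchmarks differ on these details, and the function names and toy data are hypothetical, not taken from the SparseDrive code.

```python
import math

def displacement_errors(pred, gt):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and gt are equal-length lists of (x, y) waypoints.
    ADE is the mean point-wise Euclidean distance; FDE is the
    distance at the final waypoint.
    """
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]

def min_ade_fde(candidates, gt):
    """Best ADE and best FDE over K candidate trajectories.

    Assumption: the two minima are taken independently, so they
    may come from different candidates.
    """
    errors = [displacement_errors(c, gt) for c in candidates]
    return min(e[0] for e in errors), min(e[1] for e in errors)

# Toy data (hypothetical): the ground truth drives straight along x.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # drifts steadily sideways
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],  # accurate until the last step
]
min_ade, min_fde = min_ade_fde(candidates, gt)
print(min_ade, min_fde)
```

Lower is better for both metrics. A multi-mode predictor is rewarded if any one of its K modes is close to what actually happened.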
As end-to-end VLA reaches deployment in autonomous driving, how should the algorithms be designed?
自动驾驶之心· 2025-06-22 14:09
Core Insights - The article discusses the rapid advancements in end-to-end autonomous driving, particularly focusing on Vision-Language-Action (VLA) models and their applications in the industry [2][3]. Group 1: VLA Model Developments - The introduction of AutoVLA, a new VLA model that integrates reasoning and action generation for end-to-end autonomous driving, shows promising results in semantic reasoning and trajectory planning [3][4]. - ReCogDrive, another VLA model, addresses performance issues in rare and long-tail scenarios by utilizing a three-stage training framework that combines visual language models with diffusion planners [7][9]. - Impromptu VLA introduces a dataset aimed at improving VLA models' performance in unstructured extreme conditions, demonstrating significant performance improvements in established benchmarks [14][24]. Group 2: Experimental Results - AutoVLA achieved competitive performance metrics in various scenarios, with the best-of-N method reaching a PDMS score of 92.12, indicating its effectiveness in planning and execution [5]. - ReCogDrive set a new state-of-the-art PDMS score of 89.6 on the NAVSIM benchmark, showcasing its robustness and safety in driving trajectories [9][10]. - The OpenDriveVLA model demonstrated superior results in open-loop trajectory planning and driving-related question-answering tasks, outperforming previous methods on the nuScenes dataset [28][32]. Group 3: Industry Trends - The article highlights a trend among major automotive manufacturers, such as Li Auto, Xiaomi, and XPeng, to invest heavily in VLA model research and development, indicating a competitive landscape in autonomous driving technology [2][3]. - The integration of large language models (LLMs) with VLA frameworks is becoming a focal point for enhancing decision-making capabilities in autonomous vehicles, as seen in models like ORION and VLM-RL [33][39].
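The "best-of-N" result mentioned above refers to a general selection pattern: sample several candidate outputs, score each one, and keep the highest-scoring candidate. The sketch below shows that pattern in isolation. It is not AutoVLA's procedure and the scorer is not PDMS; `toy_sampler` and `toy_score` are hypothetical stand-ins, with a candidate reduced to a single lateral offset for brevity.

```python
import random

def best_of_n(sample_fn, score_fn, n, seed=0):
    """Draw n candidates and return (best_candidate, best_score).

    sample_fn(rng) produces one candidate; score_fn(candidate)
    returns a number where higher is better.
    """
    rng = random.Random(seed)
    candidates = [sample_fn(rng) for _ in range(n)]
    scores = [score_fn(c) for c in candidates]
    best = max(range(n), key=scores.__getitem__)
    return candidates[best], scores[best]

# Hypothetical stand-ins: a "trajectory" is just a lateral offset
# in metres, and the scorer prefers offsets close to the lane centre.
def toy_sampler(rng):
    return rng.uniform(-1.0, 1.0)

def toy_score(offset):
    return 1.0 - abs(offset)

best_1, score_1 = best_of_n(toy_sampler, toy_score, n=1)
best_8, score_8 = best_of_n(toy_sampler, toy_score, n=8)
print(score_1, score_8)
```

With a fixed seed, the first candidate drawn for n=8 is the same one drawn for n=1, so the selected score can only stay equal or improve as N grows. The cost is N forward passes plus a scorer that must be available at selection time.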
The head of world models at SenseTime's SenseAuto (商汤绝影) has left...
自动驾驶之心· 2025-06-21 13:15
Core Viewpoint - The article discusses the challenges and opportunities faced by SenseTime's autonomous driving division, focusing on the competitive landscape and the importance of technological advancement in the industry. Group 1: Company Developments - The head of world model development for SenseTime's autonomous driving division has left the company, which raises concerns about the future of its cloud technology system and the R-UniAD generative driving solution [2][3]. - SenseTime's autonomous driving division has successfully delivered a mid-tier solution based on the J6M chip to GAC Trumpchi, but the mid-tier market is expected to undergo significant upgrades this year [4]. Group 2: Market Dynamics - The mid-tier market will shift from highway NOA (Navigation on Autopilot) to full urban NOA, a major change in the competitive landscape [4]. - Leading companies are introducing lightweight urban NOA solutions based on high-tier algorithms, targeting chips with around 100 TOPS of computing power, and are already demonstrating them to OEM clients [4]. Group 3: High-Tier Strategy - SenseTime's key focus this year is the one-stage end-to-end solution, which has shown impressive performance and is a requirement in OEMs' high-tier project tenders [5]. - Its collaboration with Dongfeng Motor targets mass production and delivery of the UniAD one-stage end-to-end solution by Q4 2025, a critical opportunity for SenseTime to establish a foothold in the high-tier market [5][6]. Group 4: Competitive Landscape - SenseTime's ability to deliver a benchmark project in the high-tier segment is crucial for gaining credibility with OEMs and securing additional projects [6][7]. - The current window of opportunity for SenseTime in the high-tier market is limited, as many vehicle models capable of absorbing high-tier software and hardware costs are being released this year [6][8].
CVPR'25 end-to-end champion solution! GTRS: generalizable multimodal end-to-end trajectory planning (NVIDIA & Fudan)
自动驾驶之心· 2025-06-19 10:47
Today 自动驾驶之心 shares the latest work from NVIDIA and Fudan University: GTRS, generalizable multimodal end-to-end trajectory planning! Paper authors | Zhenxin Li et al. Editor | 自动驾驶之心 Paper link: https://arxiv.org/abs/2506.06664 GitHub: https://github.com/NVlabs/GTRS NVIDIA technical blog: https://blogs.nvidia.com/blog/auto-research-cvpr-2025/?ncid=so-nvsh-677066 CVPR 2025 Autonomous Grand Challenge: https://opendrivelab.com/legacy/challenge2025/index.html Background of the end-to-end autonomous driving challenge: NAVSIM v2 ...
A Li Auto paper makes the list of the 10 most-recommended end-to-end autonomous driving papers of the past six months
理想TOP2· 2025-06-18 11:43
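The following article comes from 深蓝AI, written by 深蓝学院. The list of the 10 most-recommended end-to-end autonomous driving papers of the past six months was compiled by 深蓝AI after surveying dozens of front-line autonomous driving researchers. 深蓝AI positions itself as a learning platform for artificial intelligence, robotics, and autonomous driving, and its audience is technical practitioners in those fields. The original title was "Roundup | The 10 most-recommended 'end-to-end autonomous driving' papers of the past six months". It does not mention Li Auto, and the 10 papers are presented without any ranking, so there is no Li Auto PR involved. TOP2 has clearly sensed that over the past year Li Auto's presence on accounts aimed at autonomous driving practitioners has kept growing. One could even say that an account aimed at native Chinese-speaking autonomous driving practitioners could hardly have gone the past year without publishing several pieces on Li Auto. Li Auto's presence on accounts aimed at AI practitioners is still not especially strong, and quite a few AI account owners have little awareness that Li Auto does AI. Three additional points for readers to note: 1. On the Q4 2024 earnings call, Li Xiang likened end-to-end to a monkey driving the car, VLM to a human in the passenger seat giving the monkey instructions, and VLA to a human in the driver's seat doing the driving. We can therefore reasonably expect VLA to feel clearly more human-like than VLM. In terms of technical architecture, VLM consists of two systems. System 1 essentially outputs trajectories end-to-end through imitation learning and has no semantic understanding whatsoever. (This corresponds to the monkey ...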