End-to-End Autonomous Driving Technology
Mid-Tier Intelligent Driving Vendors Are Rapidly Snapping Up End-to-End Talent...
自动驾驶之心· 2025-12-15 00:04
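Technical anxiety about intelligent driving is spreading fast among mid-tier vendors. Over the weekend I had the chance to talk in person with someone who leads L2 mass-production delivery at an automaker. In their view, next year is when end-to-end and other frontier technologies start large-scale mass production. Progress at the frontier of intelligent driving has slowed, production solutions across the industry are converging, and L2 as a whole is moving down-market. Passenger cars priced above 200,000 yuan sell around 7 million units, yet the leading new-force automakers account for less than one third of that, to say nothing of the share of models with end-to-end in production. Judging by deployment trends, the maturing of end-to-end technology is what opens the door to larger-scale mass production. With L3 regulation set to advance further next year, technology upgrades at mid-tier vendors are also urgent.
That is why, over the past two months, algorithm leads at many companies have contacted 自动驾驶之心, eager to learn about frontier technologies: end-to-end, world models, VLA, 3DGS, and so on. End-to-end is more than an algorithm. It needs mature cloud and on-vehicle infrastructure: data closed loop, engineering deployment, closed-loop testing, model optimization, platform development, and more. Demand for mid-level intelligent driving roles can be expected to grow. At yesterday's 2025 地平线 (Horizon) technology ecosystem conference, the company's CEO also said it will push into the 100,000-yuan market, and high-end intelligent driving is moving rapidly down to more mass-market models. Next year, the intelligent driving story will get more interesting.
That is all. It is fairly safe to conclude that hiring demand for end-to-end and VLA will grow. In recent months, ...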
New from Shanghai Jiao Tong University! A Survey of End-to-End & VLA: A Unified Perspective Under a General Paradigm
自动驾驶之心· 2025-12-11 00:05
Core Viewpoint - The article discusses the evolution of autonomous driving technology, emphasizing the need for a unified perspective on various paradigms, including end-to-end (E2E), VLM-centric, and hybrid approaches, to enhance understanding and performance in complex driving scenarios [2][4][14].
Group 1: Introduction and Background
- Traditional modular approaches in autonomous driving have led to information loss and error accumulation due to task fragmentation, prompting a shift towards data-driven end-to-end architectures [5][10].
- The article introduces a comprehensive review titled "Survey of General End-to-End Autonomous Driving: A Unified Perspective," which aims to bridge the gap in understanding between different paradigms [3][4].
Group 2: Paradigms of Autonomous Driving
- General End-to-End (GE2E) is defined as any model that processes raw sensor inputs into planning trajectories or control actions, regardless of whether it includes visual-language models (VLM) [4][14].
- The three main paradigms unified under GE2E are:
  - Traditional End-to-End (Conventional E2E), which relies on structured scene representation for precise trajectory planning [9][17].
  - VLM-centric End-to-End, which utilizes pre-trained visual-language models to enhance generalization and reasoning capabilities in complex scenarios [11][33].
  - Hybrid End-to-End, which combines the strengths of both traditional and VLM-centric approaches to balance high-level semantic understanding with low-level control precision [12][39].
Group 3: Performance Comparison
- In open-loop performance tests, the hybrid paradigm outperformed others, demonstrating the importance of world knowledge in handling long-tail scenarios [54].
- Traditional E2E methods still dominate in numerical trajectory prediction accuracy, indicating their robustness in structured environments [54].
- In closed-loop performance, traditional methods maintain a stronghold, particularly in complex driving tasks, while VLA methods show potential but require further refinement in fine-grained trajectory control [55][56].
Group 4: Data and Learning Strategies
- The evolution of datasets from geometric annotations to semantic-rich datasets is crucial for training models capable of logical reasoning and understanding complex traffic contexts [46][48].
- The introduction of Chain of Thought (CoT) annotations in datasets supports advanced reasoning tasks, moving beyond simple input-output mappings [47].
Group 5: Model Architecture and Details
- The article provides a detailed comparison of mainstream model architectures, including their inputs, backbone networks, intermediate tasks, and output forms, to clarify the distinctions among different paradigms [57].
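The GE2E definition in this summary can be read as a shared interface: anything that maps raw sensor input to a trajectory or control command qualifies, whatever sits inside. A minimal Python sketch of that reading follows. All names (`SensorFrame`, `GE2EModel`, `ConventionalE2E`, `VLMCentricE2E`, `HybridE2E`) are hypothetical illustrations rather than names from the survey, and the planners return placeholder straight-line trajectories instead of real model outputs.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple, runtime_checkable

Waypoint = Tuple[float, float]  # (x, y) in the ego frame, metres


@dataclass
class SensorFrame:
    """Hypothetical container for raw sensor input at one timestep."""
    camera_images: List[bytes]
    ego_speed_mps: float
    instruction: str = ""  # optional language or navigation prompt


@runtime_checkable
class GE2EModel(Protocol):
    """Anything that maps raw sensor input to a planned trajectory."""
    def plan(self, frame: SensorFrame) -> List[Waypoint]: ...


def _straight_line(speed: float, steps: int = 3, dt: float = 0.5) -> List[Waypoint]:
    # Placeholder planner: constant-speed straight-line rollout.
    return [(speed * dt * (i + 1), 0.0) for i in range(steps)]


class ConventionalE2E:
    """Stands in for a planner built on a structured scene representation."""
    def plan(self, frame: SensorFrame) -> List[Waypoint]:
        return _straight_line(frame.ego_speed_mps)


class VLMCentricE2E:
    """Stands in for a planner built around a pre-trained VLM."""
    def plan(self, frame: SensorFrame) -> List[Waypoint]:
        # Toy use of the language input: halve speed when told to slow down.
        speed = frame.ego_speed_mps
        if "slow" in frame.instruction.lower():
            speed *= 0.5
        return _straight_line(speed)


class HybridE2E:
    """Stands in for a hybrid: one branch reasons, the other refines."""
    def __init__(self) -> None:
        self.reasoner = VLMCentricE2E()
        self.refiner = ConventionalE2E()

    def plan(self, frame: SensorFrame) -> List[Waypoint]:
        coarse = self.reasoner.plan(frame)
        fine = self.refiner.plan(frame)
        # Toy fusion: keep the more cautious (smaller) x at each step.
        return [(min(c[0], f[0]), f[1]) for c, f in zip(coarse, fine)]


frame = SensorFrame(camera_images=[], ego_speed_mps=10.0, instruction="Slow down ahead")
models = [ConventionalE2E(), VLMCentricE2E(), HybridE2E()]
trajectories = [m.plan(frame) for m in models]
```

All three classes satisfy `GE2EModel` without inheriting from it, which mirrors the point that the paradigms differ internally but share one input-output contract.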
Mid-Tier Intelligent Driving Vendors Are Rapidly Snapping Up End-to-End Talent...
自动驾驶之心· 2025-12-09 00:03
Core Viewpoint - The article discusses the technological anxiety in intelligent driving, particularly among mid-tier manufacturers, and highlights the anticipated growth in demand for end-to-end (E2E) and VLA (Vision-Language-Action) technologies in the coming year [2].
Group 1: Industry Trends
- Mass production of cutting-edge technologies such as end-to-end systems is expected to begin next year, with L2 solutions converging and moving down-market [2].
- Total sales of passenger vehicles priced above 200,000 yuan are around 7 million, but leading new-force automakers account for less than one-third of this, and models with end-to-end in production make up an even smaller share, indicating slow adoption [2].
- The maturity of end-to-end technology is seen as a precursor to larger-scale production, with the advancement of L3 regulations prompting urgent upgrades among mid-tier manufacturers [2].
Group 2: Recruitment and Training
- There is a growing demand for positions related to end-to-end and VLA technologies, as many professionals are seeking to quickly learn these advanced skills [3].
- The article mentions the launch of specialized courses aimed at practical applications of end-to-end and VLA technologies, designed for individuals already working in the field [3][6].
- The courses will cover various modules, including navigation information application, reinforcement learning optimization, and production experience with diffusion and autoregressive models [3][6].
Group 3: Course Details
- The end-to-end production course will focus on practical implementation, including seven major practical applications, making it suitable for those looking to advance their careers [3][6].
- The VLA course will cover foundational algorithms and theories, including BEV perception and large language models, with practical projects based on diffusion models and VLA algorithms [6][11].
- The instructors for these courses are experienced professionals from top-tier companies and academic institutions [5][8][13].
Autonomous Driving Perception in the End-to-End Era
自动驾驶之心· 2025-12-05 00:03
Core Insights - The article discusses the resurgence of end-to-end (E2E) perception in the autonomous driving industry, highlighting its impact on the field and the shift from traditional modular approaches to more integrated solutions [4][5][9].
Group 1: End-to-End Revival
- End-to-end is not a new technology; early work hoped to have neural networks output trajectories directly from camera images, but stability and safety were issues [9].
- The traditional architecture of localization, perception, planning, and control has been the mainstream approach, but advancements in BEV perception and Transformer architectures have revived end-to-end methods [9].
- Companies are now exploring various one-stage and two-stage solutions, with a focus on neural network-based planning modules [9].
Group 2: Perception Benefits in End-to-End
- In traditional frameworks, perception aimed to gather as much accurate scene information as possible for planning, but this modular design limited how well perception could serve planning needs [11].
- Current mainstream end-to-end solutions continue to follow this approach, treating various perception tasks as auxiliary losses [13].
- The key advantage of end-to-end is the shift from exhaustive perception to "Planning-Oriented" perception, allowing for a more efficient and demand-driven approach [14][15].
Group 3: Navigation-Guided Perception
- The article introduces a Navigation-Guided Perception model, which suggests that perception should be guided by navigation information, similar to how human drivers focus on relevant scene elements based on driving intent [16][18].
- A Scene Token Learner (STL) module is proposed to efficiently extract scene features from BEV features, integrating navigation information to enhance perception [18][19].
- The SSR framework demonstrates that only 16 self-supervised queries can effectively represent the perception information needed for planning tasks, significantly reducing complexity compared to traditional methods [22].
Group 4: World Models and Implicit Supervision
- The article discusses the potential of world models to replace traditional perception tasks, providing implicit supervision for scene representation [23][21].
- The SSR framework aims to enhance scene understanding through self-supervised learning, predicting future BEV features to improve what the scene queries capture [20][21].
- The design allows for efficient trajectory planning while maintaining consistency for model convergence during training [20].
Group 5: Performance Metrics
- The SSR framework outperforms various state-of-the-art (SOTA) methods in both efficiency and performance, achieving significant improvements in metrics such as L2 distance and collision rates [24].
- The framework's design reduces the number of queries needed for effective scene representation, showcasing its scalability and efficiency [22][24].
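The L2 distance and collision rate cited in this summary are open-loop planning metrics. The sketch below shows one common way to compute them for a single sample. Exact conventions (how horizons are averaged, how obstacles are represented) differ between benchmarks and papers, and the circle-based collision test, the `ego_radius` default, and all function names here are simplifying assumptions, not the SSR evaluation code.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Circle = Tuple[float, float, float]  # (x, y, radius)


def l2_error(planned: Sequence[Point], expert: Sequence[Point]) -> float:
    """Mean Euclidean distance between planned and expert waypoints."""
    if len(planned) != len(expert) or not planned:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.dist(p, e) for p, e in zip(planned, expert)]
    return sum(dists) / len(dists)


def collides(planned: Sequence[Point],
             obstacles_per_step: Sequence[Sequence[Circle]],
             ego_radius: float = 1.0) -> bool:
    """True if any waypoint overlaps an obstacle circle at the same step."""
    for (px, py), obstacles in zip(planned, obstacles_per_step):
        for ox, oy, orad in obstacles:
            if math.hypot(px - ox, py - oy) < ego_radius + orad:
                return True
    return False


def collision_rate(flags: Sequence[bool]) -> float:
    """Fraction of evaluated samples whose plan collided."""
    return sum(flags) / len(flags) if flags else 0.0


planned = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
expert = [(1.0, 0.0), (2.0, 0.3), (3.0, 0.6)]
obstacles = [[], [], [(3.5, 0.0, 0.5)]]  # one obstacle appears at the last step

sample_l2 = l2_error(planned, expert)      # (0.0 + 0.3 + 0.6) / 3
sample_hit = collides(planned, obstacles)  # centre distance 0.5 m < 1.0 m + 0.5 m
rate = collision_rate([sample_hit, False, False, False])
```

Both numbers compare a plan against logged data without letting the plan change what happens next, which is why open-loop scores do not necessarily predict closed-loop behaviour.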
Course Complete! Industry Veterans Lead a Three-Month Path Through End-to-End Autonomous Driving
自动驾驶之心· 2025-10-27 00:03
Core Viewpoint - 2023 was the first year of end-to-end mass production, and 2024 is expected to be a major year for it in the automotive industry, as leading new-force automakers and manufacturers have already put end-to-end systems into production [1][3].
Group 1: End-to-End Production Development
- The automotive industry is witnessing rapid development in end-to-end methods, particularly the one-stage approach exemplified by UniAD, which directly models vehicle trajectories from sensor inputs [1][3].
- There are two main paradigms in the industry: one-stage and two-stage methods, with the one-stage approach gaining traction and leading to various derivatives based on perception, world models, diffusion models, and VLA [3][5].
Group 2: Course Overview
- A course titled "End-to-End and VLA Autonomous Driving" has been launched, focusing on cutting-edge algorithms in both one-stage and two-stage end-to-end methods, aimed at bridging academic and industrial advancements [5][15].
- The course is structured into several chapters, covering the history and evolution of end-to-end methods, background knowledge on VLA, and detailed discussions on both one-stage and two-stage approaches [9][10][12].
Group 3: Key Technologies
- The course emphasizes critical technologies such as BEV perception, visual language models (VLM), diffusion models, and reinforcement learning, which are essential for mastering the latest advancements in autonomous driving [5][11][19].
- The second chapter of the course is described as covering the technical keywords expected to come up most often in job interviews over the next two years [10].
Group 4: Practical Applications
- The course includes practical assignments, such as RLHF fine-tuning, allowing participants to apply their knowledge in real-world scenarios and understand how to build and experiment with pre-training and reinforcement learning modules [13][19].
- The curriculum also covers various subfields of one-stage end-to-end methods, including those based on perception, world models, diffusion models, and VLA, providing a comprehensive understanding of the current landscape in autonomous driving technology [14][19].
Can Imitation Learning Not Be Truly End-to-End?
自动驾驶之心· 2025-10-08 23:33
Core Viewpoint - The article emphasizes that in the autonomous driving industry, the training methods are more critical than model architectures like VLA or world models, highlighting the limitations of imitation learning in achieving true end-to-end autonomous driving [2][14].
Limitations of Imitation Learning
- Imitation learning assumes that expert data is optimal, but in the context of driving, there is no single perfect driving behavior due to the diverse styles and strategies of human drivers [3][4].
- The training data lacks consistency and optimality, leading to models that learn vague and imprecise driving patterns rather than clear and logical strategies [3][4].
- Imitation learning fails to distinguish between critical decision-making scenarios and ordinary ones, resulting in models that may make fatal errors in crucial moments [5][6].
Key Scene Identification
- The article discusses the importance of identifying key scenes in driving, where the model's output precision is critical, especially in complex scenarios [7][8].
- It introduces the concept of "advantage" from reinforcement learning, which helps define key states where optimal actions significantly outperform others [7].
Out-of-Distribution (OOD) Issues
- Open-loop imitation learning can lead to cumulative errors, causing the model to enter states that differ from the training data distribution, resulting in performance degradation [8][10][12].
- The article illustrates that models trained purely on imitation learning may struggle in critical situations, such as timely lane changes, due to their reliance on suboptimal behaviors learned from human data [13].
Conclusion
- The core of technological development lies in identifying key routes and bottlenecks rather than merely following trends, suggesting a need for new methods beyond imitation learning to address its limitations [14].
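The "advantage" idea mentioned in this summary has a standard form in reinforcement learning, A(s, a) = Q(s, a) - V(s). A minimal sketch of using it to flag key states follows. The Q-values are made-up numbers, V(s) is taken as the mean of Q over actions (that is, a uniform policy, an assumption made only for illustration), and the threshold and function names are hypothetical; the article does not specify how key states are computed.

```python
from typing import Dict, List

QTable = Dict[str, Dict[str, float]]  # state -> action -> Q(s, a)


def advantages(q_values: Dict[str, float]) -> Dict[str, float]:
    """A(s, a) = Q(s, a) - V(s), with V(s) as the mean Q under a uniform policy."""
    v = sum(q_values.values()) / len(q_values)
    return {action: q - v for action, q in q_values.items()}


def key_states(q_table: QTable, threshold: float) -> List[str]:
    """States where the best action's advantage exceeds the threshold."""
    flagged = []
    for state, q_values in q_table.items():
        if max(advantages(q_values).values()) > threshold:
            flagged.append(state)
    return flagged


# Made-up Q-values: on an empty highway every action is about equally good,
# while before an exit ramp only a timely lane change avoids a large penalty.
q_table: QTable = {
    "empty_highway": {"keep_lane": 1.0, "change_left": 0.9, "change_right": 0.9},
    "exit_ramp_200m": {"keep_lane": -5.0, "change_left": -5.0, "change_right": 1.0},
}

flagged = key_states(q_table, threshold=1.0)
exit_adv = advantages(q_table["exit_ramp_200m"])
```

In a flagged state a small imitation error costs far more than elsewhere, which is the sense in which a loss that weights all frames equally underweights exactly these moments.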
Three Years On: A Technology-Obsessed "Whampoa Academy" for Autonomous Driving~
自动驾驶之心· 2025-08-28 03:22
Core Viewpoint - The article emphasizes the establishment of a comprehensive community for autonomous driving enthusiasts, aiming to facilitate knowledge sharing, technical discussions, and job opportunities in the field of autonomous driving and AI [1][13].
Group 1: Community Development
- The "Autonomous Driving Heart Knowledge Planet" has grown to over 4,000 members, with a goal to reach nearly 10,000 in the next two years, providing a platform for exchange and technical sharing [1].
- The community offers a variety of resources, including video content, articles, learning paths, Q&A sessions, and job exchange opportunities [1][2].
Group 2: Learning Resources
- The community has organized nearly 40 technical routes for members, covering various aspects of autonomous driving, including end-to-end learning, multi-modal models, and data annotation practices [2][5].
- A complete learning stack and roadmap for beginners have been prepared, making it suitable for those with no prior experience [7][9].
Group 3: Industry Insights
- The community regularly invites industry leaders and experts to discuss trends in autonomous driving, technology directions, and production challenges [4][62].
- Members can engage in discussions about job opportunities, industry developments, and academic advancements, fostering a collaborative environment [59][64].
Group 4: Technical Focus Areas
- Key focus areas include end-to-end autonomous driving, multi-sensor fusion, 3DGS, and NeRF technologies, with detailed resources and discussions available for each topic [31][32][33].
- The community also provides insights into the latest advancements in visual language models (VLM) and their applications in autonomous driving [35][36].
My Company Just Told Me My Contract Won't Be Renewed...
自动驾驶之心· 2025-08-17 03:23
Core Insights
- The smart driving industry is currently in a critical phase of competing on technology and cost, with many companies struggling to survive in 2024, although the overall environment has improved slightly this year [2][6].
- Traditional planning and control (规控) has matured over the past decade, and professionals in this field need to continuously update their technical skills to remain competitive [7][8].
Group 1: Industry Trends
- The smart driving sector has faced significant challenges, with many companies unable to endure the tough conditions last year, but some, like XPeng, have found a way to thrive [6].
- The price war in the industry has been curtailed by government intervention, yet competition remains fierce [6].
Group 2: Career Guidance
- For professionals in traditional planning and control, it is advisable to continue in their current roles while also learning new technologies, particularly in emerging areas like end-to-end models and large models [7][8].
- There is a growing trend of professionals transitioning from traditional planning and control to end-to-end and large model applications, with many finding success in these new areas [8].
Group 3: Community and Resources
- The "Autonomous Driving Heart Knowledge Planet" community offers a platform for technical exchange, featuring members from renowned universities and leading companies in the smart driving field [21].
- The community provides access to a wealth of resources, including over 40 technical routes, open-source projects, and job opportunities in the autonomous driving sector [19][21].
Coming from Traditional Perception and Planning & Control, Planning to Switch to End-to-End VLA...
自动驾驶之心· 2025-07-28 03:15
Core Viewpoint - The article emphasizes the shift in research focus from traditional perception and planning methods to end-to-end Vision-Language-Action (VLA) models in the autonomous driving field, highlighting the emergence of various subfields and the need for researchers to adapt to these changes [2][3].
Group 1: VLA Research Directions
- End-to-end development has led to the emergence of multiple technical subfields, categorized into one-stage and two-stage end-to-end approaches, with examples like PLUTO and UniAD [2].
- Traditional fields such as BEV perception and multi-sensor fusion are maturing, while the academic community is increasingly focusing on large models and VLA [2].
Group 2: Research Guidance and Support
- The program offers structured guidance for students in VLA and autonomous driving, aiming to help them systematically grasp key theoretical knowledge and develop their own research ideas [7][10].
- The course includes a comprehensive curriculum covering classic and cutting-edge papers, coding implementation, and writing methodologies, so that students can produce a solid research paper [8][11].
Group 3: Enrollment and Requirements
- The program is open to a limited number of students (6 to 8 per session) who are pursuing degrees in VLA and autonomous driving [6].
- Students are expected to have a foundational understanding of deep learning, Python, and PyTorch, with additional support provided for those needing to strengthen their basics [12][14].
Group 4: Course Structure and Outcomes
- The course spans 12 weeks of online group research and 2 weeks of paper guidance, followed by a maintenance period for the research paper [11].
- Participants will produce a draft of a research paper, receive project completion certificates, and may obtain recommendation letters based on their performance [15].
The Tug-of-War Between Traditional Planning & Control and End-to-End Roles... (Job Openings Included)
自动驾驶之心· 2025-07-10 03:03
Core Viewpoint - The article discusses the impact of end-to-end autonomous driving technology on traditional rule-based planning and control (PNC) methods, highlighting the shift towards data-driven approaches and the complementary relationship between the two systems [2][6].
Summary by Sections
Differences Between Approaches
- Traditional PNC relies on manually coded rules and logic for vehicle planning and control, utilizing algorithms like PID, LQR, and various path planning methods. Its advantages include clear algorithms and strong interpretability, making it suitable for applications that need stable behavior [4].
- End-to-end algorithms aim to directly map raw sensor data to control commands, reducing system complexity and enabling the model to learn human driving behavior through large-scale data training. This approach allows for joint optimization of the entire driving process [4].
Advantages and Disadvantages
- **End-to-End Approach**:
  - Advantages include reduced system complexity, natural driving style emulation, and minimized information loss between modules [4].
  - Disadvantages involve decision processes that are hard to trace, high data scale requirements, and the need for rule-based fallback in extreme scenarios [4].
- **PNC Approach**:
  - Advantages include clear module functions, ease of debugging, and stable performance in known scenarios, making it suitable for high safety requirements [5].
  - Disadvantages consist of high development costs and potential difficulties in handling complex scenarios without suitable rules [5].
Complementary Relationship
- The analysis indicates that end-to-end systems require PNC for certain scenarios, while PNC can benefit from the efficiencies of end-to-end approaches. This suggests a complementary rather than adversarial relationship between the two methodologies [6].
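As an illustration of the rule-based side, here is a minimal sketch of a PID speed controller, one of the traditional PNC algorithms this summary names. The gains, time step, acceleration limit, and the toy point-mass vehicle model are arbitrary assumptions chosen for readability, not tuned production values.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID: u = kp*e + ki*sum(e*dt) + kd*de/dt."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0
    first_step: bool = True

    def step(self, target: float, measured: float, dt: float) -> float:
        error = target - measured
        self.integral += error * dt
        # Skip the derivative on the first call to avoid a spike from prev_error = 0.
        derivative = 0.0 if self.first_step else (error - self.prev_error) / dt
        self.prev_error = error
        self.first_step = False
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(target_speed: float, steps: int = 600, dt: float = 0.1) -> float:
    """Drive a toy point-mass vehicle (acceleration = command) toward target_speed."""
    controller = PID(kp=0.8, ki=0.1, kd=0.05)
    speed = 0.0
    for _ in range(steps):
        accel = controller.step(target_speed, speed, dt)
        # Clamp to an assumed acceleration limit in m/s^2. There is no anti-windup
        # here, so the integral keeps growing while clamped and causes overshoot.
        accel = max(-3.0, min(3.0, accel))
        speed += accel * dt
    return speed


final_speed = simulate(target_speed=15.0)
```

Every term in the command can be read off and tuned by hand, which is the interpretability and ease of debugging the summary credits to PNC; the cost is that each new scenario needs its own rules and tuning.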
Job Opportunities
- The article highlights job openings in both end-to-end and traditional PNC roles, indicating a demand for skilled professionals in these areas with competitive salaries ranging from 30k to 100k per month depending on the position and location [8][10][12][14].