自动驾驶之心
ICCV'25 | HKUST's “Reason First, Then Predict”: Reward-Driven Intent Reasoning Takes Trajectory Prediction Out of the Black Box!
自动驾驶之心· 2025-08-29 03:08
Core Insights
- The article emphasizes the importance of accurately predicting the motion of road agents for the safety of autonomous driving, introducing a reward-driven intent reasoning mechanism to enhance trajectory prediction reliability and interpretability [3][5][10].
Summary by Sections
Introduction
- Trajectory prediction is a critical component of advanced autonomous driving systems, linking upstream perception with downstream planning modules. Current data-driven models often lack sufficient consideration of driving behavior, limiting their interpretability and reliability [5][10].
Methodology
- The proposed method adopts a "reasoning first, then predict" strategy, where intent reasoning provides prior guidance for accurate and reliable multimodal motion prediction. The framework is structured as a Markov Decision Process (MDP) to model agent behavior [8][10][12].
- A reward-driven intent reasoning mechanism is introduced, utilizing Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL) to learn agent-specific reward distributions from demonstrations and relevant driving environments [8][9][10].
- A new query-centered IRL framework, QIRL, is developed to efficiently aggregate contextual features into a structured representation, enhancing the overall prediction performance [9][10][18].
Experiments and Results
- The proposed method, referred to as FiM, is evaluated on large-scale public datasets such as Argoverse and nuScenes, demonstrating competitive performance against state-of-the-art models [28][30][32].
- On the Argoverse 1 dataset, FiM achieved a minimum average displacement error (minADE) of 0.8296 and a minimum final displacement error (minFDE) of 1.2048, outperforming several leading models [32][33].
- The results indicate that the intent reasoning module significantly enhances prediction confidence and reliability, confirming the effectiveness of the proposed framework in addressing complex motion prediction challenges [34][36].
Conclusion
- The work redefines the trajectory prediction task from a planning perspective, highlighting the critical role of intent reasoning in motion prediction. The proposed framework establishes a promising baseline for future research in trajectory prediction [47].
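The minADE and minFDE figures quoted in the results can be made concrete with a small sketch. It assumes K candidate trajectories of T two-dimensional points each and one ground-truth trajectory; the function name `min_ade_fde` and the array shapes are illustrative choices, not taken from the paper. Benchmarks differ in detail (some pick the single best candidate by endpoint error and report its ADE), whereas this sketch simply takes each minimum independently.

```python
import numpy as np

def min_ade_fde(pred, gt):
    """Return (minADE, minFDE) over K candidate trajectories.

    pred: array of shape (K, T, 2), the K predicted trajectories.
    gt:   array of shape (T, 2), the ground-truth trajectory.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Euclidean error per candidate and per timestep, shape (K, T).
    dist = np.linalg.norm(pred - gt[None, :, :], axis=-1)
    ade = dist.mean(axis=1)  # average displacement error per candidate
    fde = dist[:, -1]        # final-step displacement error per candidate
    return float(ade.min()), float(fde.min())

# Toy data: candidate 0 is offset by 1 m throughout,
# candidate 1 is exact until the last step, where it is 2 m off.
gt = [[0, 0], [1, 0], [2, 0]]
pred = [
    [[0, 1], [1, 1], [2, 1]],
    [[0, 0], [1, 0], [2, 2]],
]
min_ade, min_fde = min_ade_fde(pred, gt)
```

In this toy case the lowest ADE and the lowest FDE come from different candidates, which is exactly where the independent-minimum convention and the best-by-endpoint convention diverge.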
This Handheld 3D Laser Scanner Is a Hit!
自动驾驶之心· 2025-08-29 03:08
Core Viewpoint
- The article introduces the GeoScan S1, a highly cost-effective 3D laser scanner designed for industrial and research applications, emphasizing its lightweight design, ease of use, and advanced features for real-time 3D scene reconstruction.
Group 1: Product Features
- GeoScan S1 offers centimeter-level precision in 3D scene reconstruction using a multi-modal sensor fusion algorithm, capable of generating point clouds at a rate of 200,000 points per second and covering distances up to 70 meters [1][29].
- The device supports scanning of large areas exceeding 200,000 square meters and can be equipped with a 3D Gaussian data collection module for high-fidelity scene restoration [1][30].
- It features a user-friendly interface with one-click operation, allowing for immediate export of scanning results without complex setup [5][27].
Group 2: Technical Specifications
- The GeoScan S1 integrates multiple sensors, including a high-precision IMU and RTK, and supports real-time mapping with an accuracy better than 3 cm [22][34].
- The device dimensions are 14.2 cm x 9.5 cm x 45 cm, weighing 1.3 kg without the battery and 1.9 kg with the battery, with a power consumption of 25 W [22][26].
- It operates on Ubuntu 20.04 and supports various data export formats such as PCD, LAS, and PLV [22][42].
Group 3: Market Positioning
- The GeoScan S1 is positioned as the most cost-effective handheld 3D laser scanner in the market, with a starting price of 19,800 yuan for the basic version [9][57].
- The product is backed by extensive research and validation from teams at Tongji University and Northwestern Polytechnical University, with over a hundred projects demonstrating its capabilities [9][38].
- The scanner is designed for various applications, including urban planning, construction monitoring, and environmental surveying, making it suitable for diverse operational environments [38][52].
Group 4: Additional Features
- The GeoScan S1 supports cross-platform integration, making it compatible with drones, unmanned vehicles, and robotic systems for automated operations [44][46].
- It includes a built-in Ubuntu system and various sensor devices, enhancing its versatility and ease of use in different scenarios [3][12].
- The device is equipped with a touch screen for easy operation and monitoring during scanning tasks [22][26].
Another Intelligent-Driving Tier 1 Is About to Be Acquired...
自动驾驶之心· 2025-08-28 23:32
Core Viewpoint
- A well-known Tier 1 automotive supplier is about to be acquired by the largest domestic map provider, with the acquisition process nearing completion [3][9].
Group 1: Acquisition Details
- The acquisition is seen as mutually beneficial, with both companies having complementary strengths, particularly in hardware platform compatibility [11].
- The acquiring company has been expanding its smart driving team and actively seeking acquisition targets since 2021, aiming to transform into a smart driving Tier 1 supplier [9][11].
- The acquisition is not expected to bring significant short-term changes for the approximately 400 employees of the acquired company [12].
Group 2: Company Performance and Strategy
- The Tier 1 supplier has made strides in securing orders for its 7V fisheye NOA solution, which utilizes the Horizon Journey 6E chip and aims for mass production by Q3 2025 [3].
- Despite initial success, the Tier 1 supplier has faced challenges, including employee dissatisfaction and high turnover among key technical staff [7].
- The company has historically focused on pure vision solutions but has struggled to bring projects to mass production, particularly in competitive bids [7].
Group 3: Market Position and Future Outlook
- The acquiring company has established a comprehensive product layout integrating chips, smart cockpits, big data, and high-precision positioning technologies [9].
- The acquiring company has achieved scale in basic driving products and cabin docking products, but its high-level smart driving solutions are still lagging in development [11].
- The acquisition is anticipated to enhance the acquiring company's capabilities in high-level smart driving solutions, addressing the need for improved R&D in this area [11].
NVIDIA Autonomous Driving Algorithm Engineer Interview
自动驾驶之心· 2025-08-28 23:32
Core Insights
- The article discusses the competitive landscape of the autonomous driving industry, highlighting the detailed job roles and recruitment processes at companies like NVIDIA (NV) [3][4][5][6][11][12][14].
Recruitment Process
- NVIDIA has a highly structured recruitment process with multiple interview rounds, including technical assessments and coding challenges [3][4][5][6][11][12].
- Candidates are evaluated on their project experiences, particularly in areas like Model Predictive Control (MPC) and Simultaneous Localization and Mapping (SLAM) [5][8][11][12].
Technical Skills
- The interviews focus on advanced technical skills, including knowledge of optimization algorithms, dynamic programming, and deep learning applications in autonomous driving [5][8][11][12].
- Coding challenges often involve data structures and algorithms, such as merging linked lists and dynamic programming problems related to grid navigation [6][8][11][12].
Industry Trends
- There is a noticeable trend towards standardization in the autonomous driving technology stack, with a shift from numerous specialized roles to more unified models [22][25].
- The article emphasizes the importance of community and collaboration among professionals in the autonomous driving sector to navigate the evolving landscape [22][25].
Community and Networking
- The establishment of a community platform for professionals in autonomous driving is highlighted, aiming to facilitate knowledge sharing and job opportunities [19][22][25].
- The community includes members from various companies and research institutions, fostering collaboration and support for job seekers [19][22][25].
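As an illustration of the kind of grid-navigation dynamic programming problem mentioned under Technical Skills, here is a minimal sketch. The exact interview questions are not given in the summary, so the problem statement (count the right/down paths through a grid with blocked cells) and the name `count_paths` are assumptions chosen for illustration only.

```python
def count_paths(grid):
    """Count paths from the top-left to the bottom-right cell.

    Moves are restricted to right or down; grid[r][c] == 1 marks a
    blocked cell. Uses a single rolling row, so memory is O(columns).
    """
    rows, cols = len(grid), len(grid[0])
    ways = [0] * cols
    ways[0] = 0 if grid[0][0] else 1
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                ways[c] = 0              # no path may pass through a blocked cell
            elif c > 0:
                ways[c] += ways[c - 1]   # paths from above plus paths from the left
    return ways[-1]

open_grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
blocked_center = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
open_count = count_paths(open_grid)
blocked_count = count_paths(blocked_center)
```

The rolling-row form is a common follow-up in interviews because it reduces the memory of the straightforward two-dimensional table without changing the recurrence.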
Trajectory Planning Based on Deep Reinforcement Learning
自动驾驶之心· 2025-08-28 23:32
Core Viewpoint
- The article discusses the advancements and potential of reinforcement learning (RL) in the field of autonomous driving, highlighting its evolution and comparison with other learning paradigms such as supervised learning and imitation learning [4][7][8].
Summary by Sections
Background
- The article notes the recent industry focus on new technological paradigms like VLA and reinforcement learning, emphasizing the growing interest in RL following significant milestones in AI, such as AlphaZero and ChatGPT [4].
Supervised Learning
- In autonomous driving, perception tasks like object detection are framed as supervised learning tasks, where a model is trained to map inputs to outputs using labeled data [5].
Imitation Learning
- Imitation learning involves training models to replicate actions based on observed behaviors, akin to how a child learns from adults. This is a primary learning objective in end-to-end autonomous driving [6].
Reinforcement Learning
- Reinforcement learning differs from imitation learning by focusing on learning through interaction with the environment, using feedback from task outcomes to optimize the model. It is particularly relevant for sequential decision-making tasks in autonomous driving [7].
Inverse Reinforcement Learning
- Inverse reinforcement learning addresses the challenge of defining reward functions in complex tasks by learning from user feedback to create a reward model, which can then guide the main model's training [8].
Basic Concepts of Reinforcement Learning
- Key concepts include policies, rewards, and value functions, which are essential for understanding how RL operates in autonomous driving contexts [14][15][16].
Markov Decision Process
- The article explains the Markov decision process as a framework for modeling sequential tasks, which is applicable to various autonomous driving scenarios [10].
Common Algorithms
- Various algorithms are discussed, including dynamic programming, Monte Carlo methods, and temporal difference learning, which are foundational to reinforcement learning [26][30].
Policy Optimization
- The article differentiates between on-policy and off-policy algorithms, highlighting their respective advantages and challenges in training stability and data utilization [27][28].
Advanced Reinforcement Learning Techniques
- Techniques such as DQN, TRPO, and PPO are introduced, showcasing their roles in enhancing training stability and efficiency in reinforcement learning applications [41][55].
Application in Autonomous Driving
- The article emphasizes the importance of reward design and closed-loop training in autonomous driving, where the vehicle's actions influence the environment, necessitating sophisticated modeling techniques [60][61].
Conclusion
- The rapid development of reinforcement learning algorithms and their application in autonomous driving is underscored, encouraging practical engagement with the technology [62].
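To make the Markov decision process, value function, and dynamic programming concepts above concrete, here is a minimal value-iteration sketch. The toy environment is an assumption made for illustration and does not come from the article: a deterministic chain of states where moving right eventually reaches a terminal goal that pays a reward of 1.

```python
import numpy as np

def value_iteration(n_states=5, gamma=0.9, tol=1e-8):
    """Value iteration on a toy deterministic chain MDP.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    Returns the converged state values and the greedy policy.
    """
    goal = n_states - 1
    values = np.zeros(n_states)
    while True:
        new_values = np.zeros(n_states)
        policy = np.zeros(n_states, dtype=int)
        for s in range(goal):  # the goal is terminal, so its value stays 0
            q = []
            for step in (-1, +1):
                nxt = min(max(s + step, 0), goal)  # walls clamp the move
                reward = 1.0 if nxt == goal else 0.0
                # Bellman backup: immediate reward plus discounted next value.
                q.append(reward + gamma * values[nxt])
            policy[s] = int(np.argmax(q))
            new_values[s] = max(q)
        if np.max(np.abs(new_values - values)) < tol:
            return new_values, policy
        values = new_values

values, policy = value_iteration()
```

Each sweep applies the Bellman optimality backup to every non-terminal state, so the value of a state ends up being the discount factor raised to the number of steps still needed before the rewarded transition. Monte Carlo and temporal difference methods estimate the same quantities from sampled experience instead of a known transition model.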
Goodbye to High Latency! SJTU's Prune2Drive: A Pruning Tool for Autonomous-Driving VLMs, 6x Faster with Performance Preserved
自动驾驶之心· 2025-08-28 23:32
Core Viewpoint
- The article discusses the Prune2Drive framework developed by Shanghai Jiao Tong University and Shanghai AI Lab, which achieves a 6.4x acceleration in visual token processing while only reducing performance by 3% through a pruning method that eliminates 90% of visual tokens [2][3][25].
Group 1: Research Background and Challenges
- Visual Language Models (VLMs) provide a unified framework for perception, reasoning, and decision-making in autonomous driving, enhancing scene understanding and reducing error propagation [2].
- The deployment of VLMs in real driving scenarios faces significant computational challenges due to the high-resolution images from multiple cameras, leading to increased inference latency and memory consumption [3].
- Existing token pruning methods are limited in adapting to multi-view scenarios, often neglecting spatial semantic diversity and the varying contributions of different camera views [4].
Group 2: Prune2Drive Framework
- Prune2Drive introduces the Token-wise Farthest Point Sampling (T-FPS) mechanism, which maximizes the semantic and spatial coverage of multi-view tokens rather than relying solely on individual token significance [6].
- The T-FPS method uses cosine distance to measure semantic similarity between tokens, ensuring that selected tokens are non-redundant and semantically rich [10][11].
- A view-adaptive pruning controller is designed to optimize the pruning ratio for different views, allowing for efficient resource allocation based on the contribution of each view to driving decisions [11][12].
Group 3: Experimental Design and Results
- Experiments were conducted on two multi-view VLM benchmark datasets (DriveLM, DriveLMM-o1) to validate the performance retention and efficiency improvement of Prune2Drive compared to baseline methods [16].
- The framework demonstrated that even with a 90% token reduction, it maintained a risk assessment accuracy of 68.34, outperforming several baseline models [22].
- The efficiency of Prune2Drive was highlighted by a significant speedup in processing, achieving a 6.4x acceleration in the DriveMM model and a 2.64x acceleration in the DriveLMM-o1 model [25].
Group 4: Key Findings and Advantages
- Prune2Drive effectively captures critical information in driving scenarios, outperforming other methods by accurately identifying key objects in various views [26].
- The framework is plug-and-play, requiring no retraining of VLMs and compatible with efficient implementations like Flash Attention [31].
- It balances performance and efficiency, achieving substantial reductions in computational load while preserving essential semantic information [31].
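The idea behind farthest point sampling under cosine distance, as described for T-FPS, can be sketched in a few lines. This is not the authors' implementation: the function name, the choice of the starting token, and the assumption that every token embedding is non-zero and that `keep` does not exceed the number of tokens are all illustrative.

```python
import numpy as np

def farthest_point_sampling(tokens, keep, start=0):
    """Greedy farthest point sampling under cosine distance.

    tokens: array of shape (N, D) holding non-zero token embeddings.
    keep:   number of tokens to retain (assumed <= N).
    start:  index of the first retained token (an arbitrary choice here).
    Returns the indices of the retained tokens in selection order.
    """
    tokens = np.asarray(tokens, dtype=float)
    unit = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    selected = [start]
    # Cosine distance from each token to its nearest already-selected token.
    min_dist = 1.0 - unit @ unit[start]
    for _ in range(keep - 1):
        nxt = int(np.argmax(min_dist))  # token least similar to the kept set
        selected.append(nxt)
        min_dist = np.minimum(min_dist, 1.0 - unit @ unit[nxt])
    return selected

# Tokens 0 and 1 point in nearly the same direction, so only one survives.
tokens = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [-1.0, 0.0]]
kept = farthest_point_sampling(tokens, keep=3)
```

Because each step picks the token farthest from everything already kept, near-duplicate tokens are dropped first, which matches the stated goal of retaining a non-redundant, semantically diverse subset. A per-view pruning ratio, as in the view-adaptive controller, would correspond to calling such a routine with a different `keep` for each camera view.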
Xiaomi Auto Is Hiring Cloud Large-Model Algorithm Engineers (BEV/3DGS/OCC and Related Directions)
自动驾驶之心· 2025-08-28 10:24
Group 1
- The article discusses a job position for a Cloud Large Model Algorithm Engineer at Xiaomi, focusing on the development and optimization of data-driven algorithms for autonomous driving [1][2].
- Responsibilities include developing generative algorithm technologies for scene and label generation, such as 4D ground truth automation labeling and multimodal large models [2].
- The role emphasizes the use of massive production data to develop unsupervised/self-supervised algorithms, enhancing the semantic understanding and spatial perception capabilities of large models [2].
Group 2
- Candidates with experience in autonomous driving projects are preferred, highlighting the importance of practical experience in the field [2].
- Required skills include solid knowledge of C++ or Python, data structures, and algorithms, along with in-depth research experience in various perception algorithms related to autonomous driving [2].
- Preferred qualifications include a background in computer science, mathematics, machine learning, robotics, or related fields, with additional experience in NeRF, 3D scene generation, and sensor simulation being advantageous [2].
自动驾驶之心 Is Recruiting Business Partners! Model Deployment / VLA / End-to-End Directions
自动驾驶之心· 2025-08-28 08:17
Core Viewpoint
- The article emphasizes the recruitment of business partners for the autonomous driving sector, highlighting the need for expertise in various advanced technologies and offering attractive incentives for potential candidates [2][3][5].
Group 1: Recruitment Details
- The company plans to recruit 10 outstanding partners for autonomous driving-related course development, research paper guidance, and hardware development [2].
- Candidates with expertise in large models, multimodal models, diffusion models, and other advanced technologies are particularly welcome [3].
- Preferred qualifications include a master's degree or higher from universities ranked within the QS200, with priority given to candidates with significant conference contributions [4].
Group 2: Incentives and Opportunities
- The company offers resource sharing related to autonomous driving, including job recommendations, PhD opportunities, and study abroad guidance [5].
- Attractive cash incentives are part of the compensation package for successful candidates [5].
- Opportunities for collaboration on entrepreneurial projects are also available [5].
The Technology-Obsessed “Whampoa Academy” of Autonomous Driving Turns Three
自动驾驶之心· 2025-08-28 03:22
Core Viewpoint
- The article emphasizes the establishment of a comprehensive community for autonomous driving enthusiasts, aiming to facilitate knowledge sharing, technical discussions, and job opportunities in the field of autonomous driving and AI [1][13].
Group 1: Community Development
- The "Autonomous Driving Heart Knowledge Planet" has grown to over 4,000 members, with a goal to reach nearly 10,000 in the next two years, providing a platform for exchange and technical sharing [1].
- The community offers a variety of resources, including video content, articles, learning paths, Q&A sessions, and job exchange opportunities [1][2].
Group 2: Learning Resources
- The community has organized nearly 40 technical routes for members, covering various aspects of autonomous driving, including end-to-end learning, multi-modal models, and data annotation practices [2][5].
- A complete learning stack and roadmap for beginners have been prepared, making it suitable for those with no prior experience [7][9].
Group 3: Industry Insights
- The community regularly invites industry leaders and experts to discuss trends in autonomous driving, technology directions, and production challenges [4][62].
- Members can engage in discussions about job opportunities, industry developments, and academic advancements, fostering a collaborative environment [59][64].
Group 4: Technical Focus Areas
- Key focus areas include end-to-end autonomous driving, multi-sensor fusion, 3DGS, and NeRF technologies, with detailed resources and discussions available for each topic [31][32][33].
- The community also provides insights into the latest advancements in visual language models (VLM) and their applications in autonomous driving [35][36].
End-to-End Without a Data Closed Loop Is Only Half-Finished! Authoritative Analysis of Nine Topics
自动驾驶之心· 2025-08-27 23:33
Core Viewpoint
- The forum focuses on the end-to-end data closed-loop ecosystem, emphasizing the new challenges and journeys in the era of data-driven upgrades in simulation testing [4][5].
Group 1: Event Details
- The "51Sim End-to-End Data Closed-Loop Ecosystem Forum" will take place on August 28, 2025, from 13:00 to 17:00 at the Shanghai World Expo Exhibition and Convention Center [1].
- The event will be held concurrently with the Testing Expo China 2025 - Automotive, and registration for free tickets is required [9].
Group 2: Key Presentations
- Zhang Xiaona, General Manager of 51Sim's Vehicle Division, will discuss the new challenges and journeys in the end-to-end era of data-driven closed-loop upgrades [5].
- Experts from Great Wall Motors, Dongfeng Motor Corporation, and Beijing Automotive Research Institute will share insights on simulation testing for autonomous driving systems [5].
- Chen Shuo, Chief Engineer at China Automotive Intelligent Technology (Tianjin) Co., will present research on credibility assessment technology for compliant autonomous driving simulation testing [7].
Group 3: Company Overview
- 51Sim is a leading AI synthetic data and simulation platform company established in 2017, aiming to overcome the challenges of data diversity and volume in the physical world [17].
- The company's core products include the SimOne platform for intelligent assisted driving and robotics simulation, and the DataOne platform for data closed-loop and synthetic data [17].
- 51Sim has provided comprehensive synthetic data and simulation training solutions to over a hundred global industry clients in fields such as intelligent assisted driving and robotics [17].