自动驾驶之心
Seven Years in Autonomous Driving: The Vast Majority of "Data Closed Loops" Are Pseudo Closed Loops
自动驾驶之心· 2025-12-29 09:17
Core Viewpoint - The concept of a "true data closed loop" in the autonomous driving industry is still far from realization. Most current implementations are small, internal loops within individual algorithm teams rather than the comprehensive systems envisioned in early presentations [1].

Group 1: Definition of a True Data Closed Loop
- A true data closed loop should automate problem discovery, allowing systems to identify anomalies from vast operational data without relying on manual feedback [4].
- The effectiveness of solutions must be quantifiable and reviewable, requiring a comprehensive trigger system that integrates real-time and historical data analysis [5].
- The system should continuously assess whether the investments in data, computing power, and development yield satisfactory results [5].

Group 2: Current Industry Practices
- Many companies currently operate under a "data-driven development process with some automation tools," which is often limited to the perspective of an individual algorithm team [8].
- Typical workflows are module-level, algorithmic closed loops rather than a holistic system-level approach [9].

Group 3: Challenges in Achieving True Data Closed Loops
- Many existing systems are reactive rather than proactive, relying on manual identification of issues rather than automated detection [10].
- Attribution of problems is often difficult, as multiple interrelated factors contribute to issues, making it hard to pinpoint the source of a problem [12].
- The transition from data to actionable solutions often halts at the model training stage, lacking a clear connection to real-world problems [16].
- The degree of "self-healing" in current systems is limited, with many platforms resembling automated production lines rather than self-correcting systems [17].
- Organizational structures often fragment the closed loop, leading to communication issues between teams [18].

Group 4: Practical Implementation of Data Closed Loops
- The company has developed a more aggressive approach to data closed loops, treating data as a product and metrics as first-class citizens [24].
- The methodology emphasizes quantifying real-world pain points and ensuring all critical incidents are recorded accurately [26].
- A micro log and mini log mechanism is employed to capture high-recall, low-overhead data from vehicles, focusing on significant driving events [30].
- The system allows for dynamic control of data mining tasks based on real-time needs, ensuring flexibility in data collection [59].

Group 5: Distinction Between World Labels and Algorithm Labels
- The company maintains two types of labels: world-level labels that describe the physical environment and model-level labels that reflect algorithm performance [61].
- This distinction is crucial for effective data analysis and problem-solving, ensuring that the focus remains on real-world scenarios rather than solely on algorithmic outputs [61].

Group 6: Use of Generative and Simulation Data
- Generative data is utilized to address long-tail scenarios that are difficult to encounter in reality, but it is not a substitute for real-world evaluation [67].
- The company emphasizes that while recall rates may improve with generative data, the potential for increased false positives must be carefully monitored [70].
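The separation between world-level labels and model-level labels can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Clip` record, the label strings, and the `failure_rate_by_world_label` helper are hypothetical and do not come from the article. The idea is only that clips are grouped by what the physical scene contained, and the algorithm's behavior is then measured within each group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """Hypothetical record for one logged driving clip."""
    clip_id: str
    world_labels: frozenset  # what the physical scene contained, e.g. {"rain"}
    model_labels: frozenset  # how the algorithm behaved, e.g. {"missed_detection"}


def failure_rate_by_world_label(clips, failure_label):
    """Fraction of clips per world-level label that carry a given model-level label."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for clip in clips:
        for world_label in clip.world_labels:
            totals[world_label] += 1
            if failure_label in clip.model_labels:
                failures[world_label] += 1
    return {label: failures[label] / totals[label] for label in totals}


# Made-up example data, not measurements from any real fleet.
clips = [
    Clip("c1", frozenset({"rain", "night"}), frozenset({"missed_detection"})),
    Clip("c2", frozenset({"rain"}), frozenset()),
    Clip("c3", frozenset({"night"}), frozenset({"missed_detection"})),
    Clip("c4", frozenset({"sunny"}), frozenset()),
]
rates = failure_rate_by_world_label(clips, "missed_detection")
```

Keeping the two label sets in separate fields means a scenario can be queried by its physical description even after the model, and therefore its model-level labels, has changed.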
Why Have World Models Had Such a Big Impact on the Industry?
自动驾驶之心· 2025-12-29 09:17
Core Insights - The article emphasizes the vision of world models in understanding and transforming the physical world, focusing on the continuous technological breakthroughs that lead to generative AI in autonomous driving [2].

Group 1: World Model Exploration
- Various companies are building their cloud and vehicle-based world models using open-source algorithms for long-tail data generation and closed-loop simulation/evaluation [4].
- The exploration of world models in autonomous driving includes video generation, OCC generation, and LiDAR point cloud generation, with notable works from Wayve, OccWorld, and others [3][4].

Group 2: Challenges in Understanding World Models
- The definition of world models remains ambiguous, leading to confusion among newcomers in the field [5].
- Many beginners struggle to grasp the concepts of data generation and closed-loop simulation, often feeling lost despite extensive efforts [6].

Group 3: Course Offering
- The article introduces a course on world models in autonomous driving, developed in collaboration with industry algorithm experts, aimed at helping learners understand the field from theory to practice [6][8].
- The course covers various chapters, including an introduction to world models, background knowledge, discussions on general world models, and practical applications in video and OCC generation [11][12][13][14].

Group 4: Course Structure and Content
- The course is structured into six chapters, each focusing on different aspects of world models, including their historical development, technical stacks, and industry applications [11][12][13][14][15].
- The course aims to equip participants with the necessary skills to understand and implement world models in autonomous driving, preparing them for job interviews and practical applications [16][19].
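From Autonomous Driving to Embodied Intelligence: The More Realistic Commercialization Route Is Not to Keep Waiting for the "Perfect Single Unit"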
自动驾驶之心· 2025-12-29 03:19
Core Viewpoint - The commercialization of embodied intelligence is likely to follow a systematic approach similar to that of autonomous driving, focusing on operational frameworks rather than perfect individual units [2][3][37].

Group 1: Current Developments
- The emergence of unmanned logistics vehicles is seen as being on the "eve of explosion," where driving is transformed into a remote-access service, allowing for increased efficiency and reduced costs [4][5].
- The scalability of unmanned logistics benefits not only from smarter vehicles but also from better cost distribution in operations, indicating a shift towards a systematic approach [5].

Group 2: Understanding Embodied Intelligence
- Embodied intelligence does not equate to humanoid robots; the focus should be on cost structure and governance rather than form [6][7].
- The commercialization of embodied intelligence will likely prioritize systems that can effectively run operations rather than those that closely resemble humans [7].

Group 3: Framework of Embodied Intelligence
- The embodied intelligence system can be broken down into five layers: physical execution units, edge capabilities, cloud capabilities, remote intervention and scheduling, and operational governance with data feedback [8][10][11][12][13].
- Each layer contributes to the overall efficiency and scalability of the system, allowing for a more stable and replicable business model [13].

Group 4: Importance of NVM (One Person Covering Multiple Agents)
- NVM is crucial for cost reduction, enabling one operator to manage multiple units effectively [14][15].
- Achieving NVM requires minimizing the complexity of remote operations and ensuring that on-site units handle basic tasks autonomously [15].

Group 5: Challenges in Home Services
- Home services represent a complex environment for embodied intelligence due to their unstructured nature and the need for trust and privacy [16].
- The approach to home services should shift from requiring fully autonomous robots to a model of remote task-based services combined with physical execution units [16][17].

Group 6: Market Dynamics and Cloud Utilization
- The commercialization of home service robots may not depend on achieving a perfect individual unit but rather on the ability to streamline operations and reduce costs [26].
- Cloud capabilities are particularly suited for home services due to their non-urgent nature, allowing for asynchronous processing and scheduling [25][30].

Group 7: Broader Applications and Future Prospects
- Embodied intelligence applications extend beyond logistics to include cleaning, inspection, and public service scenarios, all sharing common characteristics that facilitate systematic implementation [21][29].
- The evolution of embodied intelligence is expected to mirror the development of mobile technology, with a focus on layered capabilities and market-driven pricing [22][24][31].
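The cost argument behind one operator covering multiple units can be made concrete with simple arithmetic. The sketch below is only an illustration under stated assumptions: the function names, the 48-minute hourly attention budget, the 4 minutes of remote intervention per unit per hour, and the operator cost of 60 per hour are all hypothetical figures, not numbers from the article.

```python
def max_units_per_operator(intervention_min_per_unit_hour: float,
                           operator_budget_min_per_hour: float = 48.0) -> int:
    """Units one operator can cover if each unit needs a fixed amount of
    remote intervention per hour (hypothetical model; ignores queueing)."""
    if intervention_min_per_unit_hour <= 0:
        raise ValueError("intervention time must be positive")
    return int(operator_budget_min_per_hour // intervention_min_per_unit_hour)


def operator_cost_per_unit_hour(operator_hourly_cost: float,
                                units_per_operator: int) -> float:
    """Operator cost attributed to each unit for one hour of operation."""
    if units_per_operator < 1:
        raise ValueError("an operator must cover at least one unit")
    return operator_hourly_cost / units_per_operator


# Assumed figures: 4 minutes of intervention per unit-hour, operator cost 60/hour.
units = max_units_per_operator(4.0)
cost = operator_cost_per_unit_hour(60.0, units)
```

Under this toy model, cutting the intervention time per unit is what raises the number of units per operator, which matches the summary's point that on-site units must handle basic tasks autonomously.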
BYD Organizational Shake-Up! 13th Division Dissolved......
自动驾驶之心· 2025-12-29 03:19
Core Viewpoint - BYD's recent organizational restructuring aims to enhance efficiency and resource integration within its automotive division, focusing on solidifying its leadership in the electric vehicle sector [1][11].

Adjustment Details
- The restructuring involves the dissolution of the former 13th division, which was established in 2005 and focused on automotive parts development and manufacturing, including mold design and production, lighting, and rail transit components [3][4].
- The mold business is now integrated into the Automotive Engineering Research Institute, while the lighting business has been transferred to the 11th division, enhancing the seamless connection between key components and vehicle manufacturing [4].

Strategic Intent
- The adjustment is a proactive response to intensified competition in the electric vehicle market, with the core goal of "divesting non-core functions and strengthening vertical management" to improve R&D efficiency and reduce inter-departmental collaboration costs [7].
- By eliminating the 13th division, BYD aims to streamline operations, allowing for quicker alignment between technical requirements and manufacturing capabilities, thus enhancing overall productivity [7].

Market and Brand Effects
- The organizational optimization is crucial for expanding into overseas markets and enhancing high-end brand development, as it allows for faster response to market demands and shorter product development cycles [10].
- The restructuring enhances the automotive division's ability to coordinate R&D across passenger and commercial vehicles, leveraging advanced technologies to create differentiated competitive advantages and elevate brand value [10].

Overall Implications
- BYD's restructuring is not merely a departmental merger but a strategic choice focused on "efficiency first and core focus," paving the way for accelerated technological development, optimized cost control, and deeper overseas expansion [11].
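The First Semester of the Second Year of Grad School Is About to End, and the Fast Movers Are Already Preparing for Internships~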
自动驾驶之心· 2025-12-29 03:19
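Over the past year I have met many students with research needs, and they mainly face the following difficulties: The fastest way to improve is to work alongside an experienced researcher. 自动驾驶之心 previously launched a 1v1 research mentoring service, and inquiries are welcome.

With finals approaching, a student recently contacted Zhu Ge (柱哥) about next summer's internships. The student was quite worried: a year and a half into the program, there was still little to show for it. The graduation short paper and the thesis are still ahead, and time now feels tight.

This is an awkward point in time. Everything that has to be done in the second semester of the second year piles up, especially for students in two-year master's programs. Recently, Zhu Ge successfully referred several students whose short papers had been submitted or published to autonomous driving companies. What these companies require is actually not demanding: "complete research ability," meaning the ability to carry out and think through the corresponding work. Without it, Zhu Ge would not lightly recommend someone to a company.

Complete research ability means being able to discover problems, define problems, propose methods for solving them, and form a methodology and articulate a viewpoint. This is not simply reading papers, and many students misjudge this.

Main mentoring directions: end-to-end, VLA, world models, reinforcement learning, 3D object detection, multi-sensor fusion, 3DGS, BEV perception, Occupancy Network, multi-task learning, semantic segmentation, 轨 ...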
The Market Is Punishing End-to-End Algorithm Engineers Who Only Know Theory......
自动驾驶之心· 2025-12-29 01:07
Core Insights
- The article discusses the current challenges in the automotive industry regarding the recruitment of algorithm talent for end-to-end production roles, highlighting a gap between the skills of candidates and the high salary expectations for these positions [1].
- A new course titled "End-to-End Practical Class for Mass Production" has been designed to address this gap, focusing on essential algorithms and practical applications in autonomous driving [1].

Course Overview
- The course is structured into eight chapters, covering various aspects of end-to-end algorithms, including the integration of perception tasks and learning-based control algorithms [6].
- It emphasizes the importance of understanding both one-stage and two-stage end-to-end frameworks, with practical examples and real-world applications [7][8].
- Key algorithms discussed include reinforcement learning, trajectory optimization, and spatial-temporal planning, which are crucial for the mass production of autonomous driving systems [10][12].

Target Audience
- The course is aimed at advanced learners with a foundational understanding of autonomous driving technologies, including familiarity with algorithms such as reinforcement learning and diffusion models [14][16].
- It is designed to be accessible even to those with weaker foundations, as the instructor will provide guidance to help participants quickly get up to speed [14].

Course Logistics
- The course will commence on November 30 and is expected to last for three months, featuring offline video lectures and online Q&A sessions [14][17].
- Participants need a GPU (an RTX 4090 or better is recommended), along with a basic understanding of Python and PyTorch [16].
AI Day Livestream | How to Solve the Three Major End-to-End Challenges Raised by Tesla?
自动驾驶之心· 2025-12-29 01:07
Core Insights
- Tesla identified three core challenges in autonomous driving during its presentation at ICCV 2025, which have been widely discussed in both academia and industry [3][6][7].
- The event features discussions on solutions to these challenges, including insights from researchers at the University of Hong Kong [3][11].

Group 1: Core Challenges
- The three main challenges in Tesla's end-to-end architecture for autonomous driving are the curse of dimensionality, interpretability and safety guarantees, and closed-loop evaluation [6][7].
- Solutions proposed include UniLION, DrivePI, and GenieDrive, which aim to address these challenges [6][13].

Group 2: Technical Insights
- The presentation includes a detailed explanation of Tesla's end-to-end technology evolution and FSD v14 [6][13].
- The discussion will also explore the concept of a general artificial intelligence that can understand and interact with the physical world [6][13].

Group 3: Additional Content
- The event will provide deeper insights into the technical details, Q&A, and previously unpublished content related to autonomous driving [14].
- There will be discussions on the divergence between academic research and mass production, as well as ongoing technical debates in the industry [14].
What Is the Essence of World Models and Digital Twins? How Do They Empower Autonomous Driving?
自动驾驶之心· 2025-12-29 01:07
Core Viewpoint - The article discusses the essence of world models and digital twins in the context of autonomous driving, emphasizing their role in training perception models in virtual environments and applying them to real-world scenarios [5][6].

Group 1: World Models
- World models are defined as the ultimate goal of modeling the physical world, focusing on "spatiotemporal cognition" and requiring vast amounts of video data for training [7].
- The development of world models is shifting from simple visual dynamics simulation to creating immersive interactive environments that reflect real-world complexities [8].
- The core consensus among researchers is that the primary purpose of world models is to understand dynamic environments and predict future scenarios [7][9].

Group 2: Applications in Autonomous Driving
- In autonomous driving, world models must provide real-time perception of road conditions and accurately predict their evolution, focusing on immediate environmental awareness and complex trend forecasting [11].
- Key features of effective world models include physical consistency, multiscale spatiotemporal modeling, causal reasoning capabilities, and the ability to generate interactive environments [11].
- Various companies are implementing world models, such as NIO's NWM world model for simulation training, Xiaomi's ORION framework for integrating simulation tools, and Wayve's GAIA-1 for generative world modeling [17].

Group 3: Digital Twins
- Digital twins are defined as virtual representations of physical systems that allow for low-cost, high-efficiency research on key technologies and solutions in autonomous driving [19].
- The role of digital twins extends beyond mere observation; they participate in iterative processes to enhance real-world applications [19].
- Digital twins facilitate the modeling of physical world elements in virtual spaces, enabling further work on perception models and system iterations [20][21].

Group 4: Related Technologies
- Technologies such as 3D occupancy grids and point clouds are utilized to predict spatial occupancy and enhance scene understanding in autonomous driving [22].
- The integration of multimodal inputs, including visual and LiDAR data, is crucial for improving depth estimation and overall perception accuracy [92].
- The article highlights the importance of self-supervised learning techniques in enhancing the efficiency of 3D scene reconstruction and semantic labeling in autonomous driving applications [90][91].
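To make the relationship between point clouds and 3D occupancy grids concrete, the following minimal sketch converts a handful of points into a sparse set of occupied voxel indices. It shows only the representation, not the learned occupancy prediction the summary refers to. The `voxelize` function, the 0.5 m voxel size, and the sample points are assumptions chosen for illustration.

```python
import math


def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Return the set of integer (i, j, k) voxel indices containing at least one point.

    points: iterable of (x, y, z) coordinates in meters.
    voxel_size: edge length of a cubic voxel in meters.
    origin: coordinate that maps to the corner of voxel (0, 0, 0).
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    occupied = set()
    for x, y, z in points:
        index = (
            math.floor((x - origin[0]) / voxel_size),
            math.floor((y - origin[1]) / voxel_size),
            math.floor((z - origin[2]) / voxel_size),
        )
        occupied.add(index)
    return occupied


# Made-up points; the first two fall into the same voxel.
points = [(0.1, 0.2, 0.0), (0.4, 0.3, 0.1), (1.2, 0.1, 0.0), (-0.2, 0.0, 0.0)]
occupied = voxelize(points, voxel_size=0.5)
```

A sparse set is used instead of a dense array because most voxels around a vehicle are empty; a dense grid would be the natural choice when the result feeds a convolutional network.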
Another Core Member of Li Auto Is About to Leave
自动驾驶之心· 2025-12-28 09:23
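According to 晚点Auto, Zhang Xiao (张骁), president of Li Auto's second product line, will leave the company in the near future. Li Auto was founded in July 2015, and Zhang Xiao joined in May 2016 as a vehicle product manager, making him one of its earliest core employees. During his time at Li Auto, Zhang Xiao was deeply involved in the product definition of models such as the Li ONE and L9 and worked closely with CEO Li Xiang (李想); he was also responsible for model lineup planning within the team of senior vice president Fan Haoyu (范皓宇).

According to 一见Auto, two weeks earlier Li Auto had reorganized and merged its supply-chain-related departments internally, folding the "Parts Cluster," formerly a first-level department under the Smart Vehicle Group, into "Manufacturing." The combined organization is managed by Li Auto vice president Li Bin (李斌), who reports to president Ma Donghui (马东辉). Luo Ping (罗屏), the former head of the parts department, has already left.

Zhang Xiao is reportedly leaving to start his own business. The i8, the first model Li Auto launched after entering the pure-electric SUV segment, came to market a year late; from the middle of last year to the first half of this year, Zhang Xiao led the team that completed the adjustment of the i8's styling and the optimization of its product design. Still, attributing a product's successes and failures to one person is never appropriate.

Zhu Ge (柱哥) took a test ride in the Li i6 in October, and the product is still very competitive. It is foreseeable that Li Auto's difficulties will continue for some time. Zhu Ge has also talked with some people at Li Auto ...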
DiffusionDriveV2 Core Code Walkthrough
自动驾驶之心· 2025-12-28 09:23
Core Viewpoint - The article discusses the DiffusionDrive model, which utilizes a truncated diffusion approach for end-to-end autonomous driving, emphasizing its architecture and the integration of reinforcement learning to enhance trajectory planning and safety [1].

Group 1: Model Architecture
- DiffusionDriveV2 employs a reinforcement learning-constrained truncated diffusion model, focusing on the overall architecture for autonomous driving [3].
- The model incorporates environment encoding, including bird's-eye view (BEV) features and vehicle status, to enhance the understanding of the driving context [5].
- The trajectory planning module utilizes multi-scale BEV features to improve the accuracy of trajectory predictions [8].

Group 2: Trajectory Generation
- The model generates trajectories by first clustering the true future trajectories of the vehicle using K-Means to create anchors, which are then perturbed with Gaussian noise [12].
- The trajectory prediction process involves cross-attention mechanisms between the trajectory features and BEV features, allowing for more accurate trajectory generation [15][17].
- The model also integrates time encoding to enhance the temporal aspect of trajectory predictions [14].

Group 3: Reinforcement Learning Integration
- The Intra-Anchor GRPO method is proposed to optimize strategies within specific behavior intentions, enhancing safety and goal-oriented trajectory generation [27].
- The reinforcement learning loss function is designed to mitigate instability during early denoising steps, using a discount factor to adjust the influence of rewards over time [28].
- The model incorporates a clear learning signal by truncating negative advantages and applying strong penalties for collisions, ensuring safer trajectory outputs [30].

Group 4: Noise Management
- The model introduces multiplicative noise rather than additive noise to maintain the structural integrity of trajectories, ensuring smoother exploration paths [33].
- This approach addresses the inherent scale inconsistencies in trajectory segments, allowing for more coherent and realistic trajectory generation [35].

Group 5: Evaluation Metrics
- The model evaluates generated trajectories based on safety, comfort, rule compliance, progress, and feasibility, aggregating these into a comprehensive score [27].
- Specific metrics are employed to assess safety (collision detection), comfort (acceleration and curvature), and adherence to traffic rules, ensuring a holistic evaluation of trajectory performance [27].
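Two of the ideas summarized above, multiplicative noise on anchor trajectories and truncated advantages with a collision penalty, can be sketched in a few lines of plain Python. This is not the DiffusionDriveV2 code: the function names, the per-axis scale factor, the noise level, and the penalty value of -1.0 are all assumptions made for illustration, and the advantage normalization is the generic group-relative form (reward minus group mean, divided by group standard deviation).

```python
import random
from statistics import mean, pstdev


def multiplicative_perturb(anchor, sigma, rng):
    """Scale every waypoint by one shared factor (1 + eps) per axis.

    Assumed form of multiplicative noise: near and far waypoints are
    perturbed in proportion to their magnitude, so the trajectory keeps
    its shape instead of becoming jagged as with independent additive noise.
    """
    scale_x = 1.0 + rng.gauss(0.0, sigma)
    scale_y = 1.0 + rng.gauss(0.0, sigma)
    return [(x * scale_x, y * scale_y) for x, y in anchor]


def intra_anchor_advantages(rewards, collided, collision_penalty=-1.0, eps=1e-6):
    """Group-relative advantages for trajectories sampled from ONE anchor.

    Negative advantages are truncated to zero, and any trajectory flagged
    as a collision receives a fixed strong penalty (hypothetical value).
    """
    mu = mean(rewards)
    sd = pstdev(rewards)
    advantages = []
    for reward, hit in zip(rewards, collided):
        advantage = max((reward - mu) / (sd + eps), 0.0)  # truncate negatives
        if hit:
            advantage = collision_penalty  # overrides the truncated value
        advantages.append(advantage)
    return advantages


rng = random.Random(0)
anchor = [(1.0, 0.0), (2.0, 0.5), (3.0, 1.5)]  # made-up (x, y) waypoints
perturbed = multiplicative_perturb(anchor, sigma=0.1, rng=rng)

rewards = [0.2, 0.8, 0.5, 0.9]          # made-up scores for four samples
collided = [False, False, False, True]  # the last sample is flagged as a collision
advantages = intra_anchor_advantages(rewards, collided)
```

Computing the statistics inside a single anchor's group means that samples are only compared with others sharing the same behavior intention, which is the motivation the summary gives for the intra-anchor variant.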