自动驾驶之心
Cognition-Driven Intelligent Driving at Xiaomi: From End-to-End and World Models to VLA......
自动驾驶之心· 2025-11-24 00:03
Core Viewpoint
- Xiaomi is investing heavily in intelligent driving technology around three goals: safety, comfort, and efficiency, with safety as the top priority of its development strategy [4][7].

Development Progress
- Xiaomi's intelligent driving has progressed through several releases: highway NOA based on high-precision maps (version 24.3), urban NOA (version 24.5), and light-map and map-free versions (version 24.10) [7].
- The company describes three stages of intelligent driving: 1.0 (rule-driven), 2.0 (data-driven), and 3.0 (cognition-driven), with VLA (Vision-Language-Action) as the focus of the next production phase [7][10].

World Model Features
- The world model introduced by Xiaomi has three essential characteristics: diversity in generated scenarios, multimodal input and output, and interactive capabilities that influence vehicle behavior [8][9].
- The world model is used in the cloud to improve model performance through data generation, closed-loop simulation, and reinforcement learning, rather than to output driving actions directly on the vehicle [10].

VLA and Learning Models
- VLA is described as an enhancement of end-to-end learning that integrates high-level human knowledge (traffic rules, values) into the driving model [13].
- Xiaomi's development roadmap includes several model training stages, from LLM pre-training to embodied pre-training, with recent advances in the MiMo and MiMo-VL models [13].

Community and Knowledge Sharing
- The "自动驾驶之心 Knowledge Planet" (知识星球) community aims to provide a comprehensive platform for learning and sharing knowledge about autonomous driving. It has over 4,000 members and plans to expand [15][26].
- The community offers resources such as technical roadmaps, video tutorials, and Q&A sessions for both beginners and advanced learners in the autonomous driving sector [27][30].
End-to-End Mass Production, a "Small Matter": Only Those Who Have Done It Know How Painful It Is
自动驾驶之心· 2025-11-24 00:03
Core Insights
- The article emphasizes the growing demand for talent in end-to-end mass production in the automotive industry, and highlights a paradox: job seekers are abundant, yet companies struggle to find qualified candidates [1][3].

Course Overview
- A newly designed three-month course on end-to-end mass production aims to close this skills gap, focusing on practical applications and real-world scenarios [3][5].
- The course covers essential algorithms such as one-stage and two-stage end-to-end frameworks, reinforcement learning applications, and trajectory optimization techniques [5][10].

Course Content
- **Chapter 1: Overview of End-to-End Tasks** - Discusses the integration of perception tasks and the learning-based control algorithms that are becoming mainstream in autonomous driving [10].
- **Chapter 2: Two-Stage End-to-End Algorithms** - Introduces the two-stage framework, its modeling methods, and the flow of information between perception and planning [11].
- **Chapter 3: One-Stage End-to-End Algorithms** - Focuses on one-stage frameworks, which pass information to planning without loss and therefore perform better than two-stage methods [12].
- **Chapter 4: Application of Navigation Information** - Explains the critical role of navigation data in autonomous driving and how to integrate it effectively into end-to-end models [13].
- **Chapter 5: Introduction to Reinforcement Learning Algorithms** - Highlights why reinforcement learning is needed to complement imitation learning so that models generalize better [14].
- **Chapter 6: Trajectory Output Optimization** - Covers practical projects that apply imitation learning and reinforcement learning algorithms to trajectory planning [15].
- **Chapter 7: Contingency Planning - Spatiotemporal Joint Planning** - Discusses post-processing logic, including smoothing algorithms, that keeps trajectory outputs reliable [16].
- **Chapter 8: Experience Sharing in End-to-End Production** - Provides insights on practical strategies and tools for enhancing system capabilities in real-world applications [17].

Target Audience
- The course is designed for advanced learners with a foundational understanding of autonomous driving algorithms and reinforcement learning, plus programming skills [18][19].

Course Schedule
- The course is set to begin on November 30, with a structured timeline for unlocking chapters and support through offline videos and online Q&A sessions [20].
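The trajectory smoothing mentioned under Chapter 7 can be illustrated with a very small sketch. The course's actual smoothing algorithm is not described in the summary, so the centered moving average below, the function name `smooth_trajectory`, and the `window` parameter are all assumptions chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth_trajectory(points: List[Point], window: int = 3) -> List[Point]:
    """Centered moving-average smoothing (hypothetical helper).

    The first and last waypoints are kept fixed so the smoothed
    trajectory still starts and ends where the planner intended.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(points)
    if n <= 2 or window == 1:
        return list(points)
    half = window // 2
    smoothed = [points[0]]
    for i in range(1, n - 1):
        # Shrink the window near the ends so it stays symmetric and in range.
        k = min(half, i, n - 1 - i)
        neighbours = points[i - k : i + k + 1]
        x = sum(p[0] for p in neighbours) / len(neighbours)
        y = sum(p[1] for p in neighbours) / len(neighbours)
        smoothed.append((x, y))
    smoothed.append(points[-1])
    return smoothed
```

A production post-processor would more likely use a constrained optimizer or spline fit that also respects curvature and collision limits; the moving average only shows where such a step sits in the pipeline.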
Three Years of Working on Autonomous Driving at Horizon Robotics
自动驾驶之心· 2025-11-24 00:03
Core Insights
- The article discusses the transition from autonomous driving to embodied intelligence, highlighting how the challenges and solutions differ between the two fields [2].
- It emphasizes the value of documenting past work in autonomous driving, even work that did not receive widespread attention, because it may offer practical insights to others in the field [2].

Research Areas Summary
- **Sparse4D Series**: A multi-sensor fusion perception framework that challenges the conventional BEV (Bird's Eye View) approach, arguing that BEV does not significantly enhance information while incurring high computational costs. The Sparse4D series aims to achieve efficient perception through sparse queries and projections [6][7].
- **SparseDrive**: An attempt to extend the Sparse4D model to end-to-end planning, integrating online mapping and motion planning tasks. It successfully executed five tasks, including detection and tracking, but faced challenges in closed-loop performance evaluation [13][15].
- **EDA & UniMM**: EDA introduces a dynamic anchor strategy for trajectory prediction, improving model convergence and accuracy. UniMM unifies existing traffic flow simulation models, addressing key performance factors in agent simulation [16][20].
- **DriveCamSim**: A sensor simulation system designed to evaluate autonomous driving models efficiently. It focuses on generating sensor signals with high fidelity and controllability, addressing the limitations of traditional physics-engine-based simulation [22][24].
- **LATR**: A foundational model for intelligent driving that leverages large datasets for unsupervised training, aiming to understand the semantics of driving scenarios. It integrates multiple tasks into a unified framework, demonstrating effective performance across various driving tasks [26][27].

Conclusion and Future Outlook
- The seven modules discussed form the core link of the autonomous driving system, indicating a correct technological path. The industry is moving towards maturity in end-to-end models, with significant performance improvements for companies adopting these approaches. Future developments should focus on efficient evaluation systems and the potential of reinforcement learning to enhance model performance [30][31].
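To make "sparse queries and projections" more concrete, the sketch below shows the geometric core of projecting a handful of 3D keypoints into a camera image instead of building a dense BEV grid. It is not the Sparse4D implementation: the pinhole model, the function names `project_point` and `visible_samples`, and the intrinsics used in the example are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_point(point_cam: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Pixel]:
    """Project a point given in camera coordinates with a pinhole model.

    Returns None for points at or behind the image plane (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def visible_samples(keypoints: List[Point3D], fx: float, fy: float,
                    cx: float, cy: float,
                    width: int, height: int) -> List[Pixel]:
    """Keep only the projected keypoints that land inside the image.

    In a sparse-query detector, image features would be sampled at these
    pixel locations; here we only return the locations themselves.
    """
    pixels = []
    for p in keypoints:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is not None and 0 <= uv[0] < width and 0 <= uv[1] < height:
            pixels.append(uv)
    return pixels
```

The cost of this step scales with the number of queries and keypoints rather than with the size of a dense grid, which is the efficiency argument summarized above.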
Direct Resume Referral | UISEE (驭势科技) Is Hiring Planning Algorithm Engineers!
自动驾驶之心· 2025-11-24 00:03
Core Insights
- The article discusses advances in autonomous driving technology, focusing on the development and deployment of VLA (Vision-Language-Action) systems and the shift from perception-based approaches to VLA-based methods [14].

Group 1: VLA Development
- The article reflects on the evolution of VLA technology over the past year, noting a shift from academic research to practical industry applications, culminating in the announcement of Xiaopeng's VLA 2.0 [14].
- It emphasizes VLA as a way to improve the driving experience by mimicking human-like decision-making, akin to a "sixth sense" in driving [14].

Group 2: Research and Collaboration
- The article mentions collaborative research, such as a paper from The Chinese University of Hong Kong (Shenzhen) and Didi that proposes a method for efficient reconstruction of dynamic driving scenes [14].
- It highlights the value of ongoing discussion and knowledge sharing within the autonomous driving community, as seen in roundtable discussions featuring industry experts [14].
HKUST(GZ)'s LiSTAR: A 4D LiDAR World Model for Autonomous Driving!
自动驾驶之心· 2025-11-23 02:04
Group 1
- The article discusses the challenges of generating high-fidelity 4D LiDAR data for autonomous driving simulation, highlighting issues with sensor characteristics, data sparsity, and controllability [2][4][8].
- It introduces a novel hybrid cylindrical-spherical (HCS) coordinate voxelization method that addresses the inherent defects of Cartesian coordinates, allowing efficient 4D data encoding while preserving geometric detail [9].
- The article presents Ray-Centric World Models for 4D LiDAR sequences, emphasizing the role of spatiotemporal attention mechanisms in modeling LiDAR sequences and ensuring temporal coherence [10][12].

Group 2
- The MaskSTART framework is proposed for precise scene synthesis. It uses a voxel layout aligned with the 4D point cloud as conditional input to improve control over scene structure [12][20].
- Experimental results show significant improvements in reconstruction, prediction, and generation tasks, with a 32% increase in IoU and a 60% reduction in MMD compared to baseline methods [21][22][28].
- Ablation studies validate the effectiveness of the HCS coordinate system and the complementary value of the spatial ray attention (SRA) and causal spatiotemporal attention (CSTA) modules [30][31].
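The idea behind voxelizing in sensor-centric rather than Cartesian coordinates can be sketched as follows. The summary does not give the exact HCS definition, so this is only one plausible reading: range and azimuth are taken from the cylindrical view and the elevation angle from the spherical view. The function name `hcs_voxel_index`, the bin counts, and the field-of-view limits are assumptions, not values from the paper.

```python
import math
from typing import Optional, Tuple


def hcs_voxel_index(x: float, y: float, z: float, *,
                    max_range: float = 80.0, n_range: int = 8,
                    n_azimuth: int = 8,
                    elev_min_deg: float = -30.0, elev_max_deg: float = 10.0,
                    n_elev: int = 4) -> Optional[Tuple[int, int, int]]:
    """Map a LiDAR point to (range_bin, azimuth_bin, elevation_bin).

    Hypothetical hybrid binning: horizontal range and azimuth as in
    cylindrical coordinates, elevation angle as in spherical coordinates.
    Returns None for points outside the assumed range or field of view.
    """
    rho = math.hypot(x, y)                      # horizontal distance to sensor
    if rho == 0.0 or rho >= max_range:
        return None
    azimuth = math.atan2(y, x) % (2.0 * math.pi)  # wrapped to [0, 2*pi)
    elevation = math.atan2(z, rho)              # angle above the horizontal
    elev_min = math.radians(elev_min_deg)
    elev_max = math.radians(elev_max_deg)
    if not (elev_min <= elevation < elev_max):
        return None
    i_r = int(rho / max_range * n_range)
    # Clamp guards against floating-point rounding at the wrap-around.
    i_a = min(int(azimuth / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    i_e = int((elevation - elev_min) / (elev_max - elev_min) * n_elev)
    return (i_r, i_a, i_e)
```

Because LiDAR beams fan out at fixed angles, angular bins of this kind hold a more even number of points than equally sized Cartesian cubes, which become nearly empty at long range.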
自动驾驶之心 Enterprise Services and Consulting Officially Launched!
自动驾驶之心· 2025-11-23 02:04
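自动驾驶之心 Enterprise Services and Consulting is officially launched!

For the first two years after founding, the team focused on the consumer market and developed nearly 50 courses on autonomous driving and embodied intelligence. They are far from perfect, but they have provided many resources for learning, job hunting, and work. Since the start of this year, we have received a steady stream of business requests from companies, especially for brand promotion, technical consulting, training, and team upskilling.

自动驾驶之心 has now accumulated nearly 3 years of industry consulting and training experience, built a large pool of expert talent, and has nearly 400,000 followers across all platforms. We are now officially launching enterprise services, including but not limited to:
- Brand promotion
- Industry consulting
- Technical training
- Team upskilling

We will help companies upgrade their technical roadmaps and their teams, and provide more references for decision-making.

Partner clients
The platform has already established enterprise partnerships with multiple domestic universities, vocational colleges, Tier 1 suppliers, OEMs, and embodied-robotics companies. We hope to reach more companies that need to upgrade and to help move the field forward.

Contact us
You are welcome to add the person in charge on WeChat (oooops-life) for further consultation. ...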
Mapping Out the Learning Path for Large-Model Technology: Agent, RAG, General-Purpose Large Models, and More......
自动驾驶之心· 2025-11-23 02:04
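We have been running the large-model community for several months now, and Zhu Ge (柱哥) has recently compared notes with many students.

Many students who have just started a master's or a direct-entry PhD are very anxious: what they learned as undergraduates is of no use at all. From day one they are overwhelmed by transformer, LoRA, multimodal large models, and Agents, and they often do not know where to start with the deep learning frameworks they encounter.

This is when it is easiest to feel lost and anxious, and having no one in the lab to talk to makes it worse. I recently held a small exchange session with students in the community. Some of them picked up the key points from what we shared and are progressing quickly by following the community's learning routes. The community offers quick digests of cutting-edge papers, accompanying introductions to tools, and industry news. Students with a solid foundation can already fine-tune their own large models without trouble.

Quite a few students are still stuck, however, for example on compute, on building their own datasets, and on model optimization and hands-on projects. On compute, we have previously shared many lightweight methods that still achieve good performance, even SOTA, which suits students who are short on compute.

The above comes from our large-model community, the 大模型之心Tech Knowledge Planet (知识星球), and we welcome more students who need to get started or advance to join. After nearly a year of building, the community has completed sections covering technical roadmaps, livestreams, Q&A, job hunting, competitions, and more. 实现了产业 ...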
A Leading Domestic Tier 1 Plans to Invest in a High-Level Intelligent Driving Company......
自动驾驶之心· 2025-11-23 02:04
Core Viewpoint
- The article discusses a planned strategic investment by a leading domestic Tier 1 automotive supplier in a high-level autonomous driving company. It reads this as a sign of deeper integration in the autonomous driving industry, moving from traditional procurement relationships to strategic, capital, and technological partnerships [5][10].

Group 1: Company Overview
- The Tier 1 company originates from Central Europe and has become a leading automotive electronic system supplier in China, covering smart cockpits, intelligent driving, and connected services [8].
- The company has seen significant revenue growth from the wave of automotive intelligence, with projections indicating revenue will exceed 30 billion yuan by 2025 [8].
- Despite revenue growth, gross margins for smart cockpits and intelligent driving declined from 2021 to 2024, highlighting the pressure of the competitive landscape [8].

Group 2: Competitive Landscape
- The company faces increasing competition as automakers like Xiaopeng begin to develop their own domain controllers and foundational software, leading to a trend of "soft and hard integration" [8].
- The Tier 1 company has historically partnered closely with a leading autonomous driving company, but it has struggled with algorithm capabilities and has often played a supporting role in collaborations [8][9].

Group 3: Strategic Moves
- Recognizing the need for more control over algorithms and software, the Tier 1 company invested heavily to bring in a top algorithm team from SAIC, although progress has been limited [9].
- The company is also pursuing financial investments in promising autonomous driving algorithm firms, notably selecting a rising competitor, Company D, which has been aggressive in its technological approach [9].

Group 4: Industry Trends
- The investment signals a deeper integration phase in the autonomous driving supply chain, with a shift towards a "strategic + capital + technology" model among automakers, Tier 1 suppliers, and autonomous driving companies [10].
- This three-party model is becoming standard in the industry: Tier 1 suppliers handle hardware and system integration, while autonomous driving companies provide core algorithms and software [10].
- As these collaborations progress, concentration in the autonomous driving supply chain is expected to increase, with leading Tier 1 suppliers and algorithm firms better placed to secure orders and expand market share [10].
Hands-On Testing of Qwen3-VL in Autonomous Driving Scenarios
自动驾驶之心· 2025-11-22 02:01
Core Insights
- The article discusses the potential of multimodal large models in the autonomous driving sector, focusing on Alibaba's Qwen3-VL model, which demonstrates strong capabilities in scene understanding, spatial reasoning, behavior judgment, and risk prediction [2].

Scene Understanding and Spatial Reasoning
- The Qwen3-VL model was tested on various scenarios, showing that it can describe images, assess weather conditions, identify road types, and detect pedestrians or vehicles [5][7][10][11].
- The model can analyze complex traffic situations, such as determining the closest vehicle and its movement status, as well as the intentions of vehicles in adjacent lanes [21][22][23][25][26].

Behavior Decision-Making and Causal Reasoning
- The model can evaluate whether the vehicle should accelerate, decelerate, or maintain speed under current conditions, and identify potential dangers in the environment [28][29][30].
- It can also interpret traffic signs and suggest appropriate actions, emphasizing the importance of recognizing warning signs and responding accordingly [31][32][34].

Deep Thinking and Risk Assessment
- The article emphasizes the need for deep analysis of traffic participants based on their dynamic states, distances, and potential risks, leading to a ranking of danger levels among vehicles [40][42].
- The Qwen3-VL model can assess the risk posed by nearby vehicles, particularly in low-visibility conditions, and provide safety recommendations for maneuvers such as overtaking [44][46][48][50].

Traffic Flow Dynamics
- The article outlines how traffic flow evolves from smooth to congested states, highlighting the role of disturbances such as sudden braking or road obstructions in triggering congestion [60][62].
- It discusses the mechanisms of congestion propagation and the importance of maintaining safe distances and speeds to prevent accidents in high-density traffic [66][68].
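The ranking of danger levels from distances and dynamic states can be made concrete with a classical time-to-collision (TTC) heuristic. This is not how Qwen3-VL reasons internally and is not taken from the article; it is a minimal sketch of the kind of quantity such a ranking reflects. The `TrackedVehicle` fields and the function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class TrackedVehicle:
    name: str
    gap_m: float              # longitudinal gap to the ego vehicle, in metres
    closing_speed_mps: float  # positive when the gap is shrinking


def time_to_collision(vehicle: TrackedVehicle) -> float:
    """Seconds until the gap closes at the current closing speed.

    A vehicle that is holding its distance or pulling away never closes
    the gap under this model, so its TTC is infinite.
    """
    if vehicle.closing_speed_mps <= 0:
        return math.inf
    return vehicle.gap_m / vehicle.closing_speed_mps


def rank_by_risk(vehicles: List[TrackedVehicle]) -> List[str]:
    """Order vehicle names from most to least urgent (smallest TTC first)."""
    return [v.name for v in sorted(vehicles, key=time_to_collision)]
```

Under this heuristic the closest vehicle is not automatically the most dangerous: a vehicle 30 m away closing at 10 m/s (TTC 3 s) ranks above one 5 m away that is pulling away.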
"World Models Can Fundamentally Solve VLA Systems' Dependence on Data" Is a False Proposition...
自动驾驶之心· 2025-11-22 02:01
Core Viewpoint
- The article discusses the ongoing debate between two routes in autonomous driving: the VLA (Vision-Language-Action) route favored by companies such as Xiaopeng, Li Auto, and Yuanrong Qixing, and the world model (WA) route promoted by Huawei and NIO. It argues that the claim that world models fundamentally remove the dependence on data is a false proposition, because the WA route itself relies heavily on data, which remains a critical asset in the industry [2][3].

VLA vs. WA
- The VLA approach leverages vast amounts of real-world data to enhance reasoning capabilities, while the WA model seeks to reduce reliance on real data by using simulated data to expand its capabilities. The article posits that the difference between them lies in how data is used, not in whether data is needed [2][3].

Data Dependency
- Both VLA and WA are built on the premise that "data determines the ceiling" of capabilities. VLA relies on multimodal data from real scenarios, while WA requires a combination of real and simulated data to improve generalization. The industry often confuses the "form of data" with its "essence," which leads to misconceptions about the role of data in autonomous driving [3].

Industry Insights
- The article emphasizes that the real challenge is not whether to depend on data but how to use it efficiently. Until true artificial intelligence is realized, data will remain the core competitive advantage in the autonomous driving industry [3].

Community and Learning Resources
- The article promotes a community platform for knowledge sharing among industry professionals and academics, offering learning routes, technical discussions, and job opportunities in the autonomous driving field [8][9][18].

Technical Learning and Development
- The community provides a comprehensive set of learning materials covering over 40 technical directions in autonomous driving, including VLA, multimodal models, and various simulation tools, aimed at both beginners and advanced practitioners [19][39].

Networking Opportunities
- The platform facilitates networking with industry leaders and experts, allowing members to discuss trends, technologies, and career development in the autonomous driving sector [22][92].