自动驾驶之心

Business Partner Recruitment Is Open! Model Deployment / VLA / End-to-End Directions
自动驾驶之心· 2025-09-02 03:14
Group 1
- The article announces the recruitment of 10 partners for the autonomous driving sector, focusing on course development, research guidance, and hardware development [2][5]
- The recruitment targets individuals with expertise in advanced models and technologies related to autonomous driving, such as large models, multimodal models, and 3D target detection [3]
- Preferred candidates hold a master's degree or higher from QS top-200 universities, especially those with significant conference contributions [4]

Group 2
- The company offers benefits including resource sharing for job seeking, PhD recommendations, and study-abroad opportunities, along with substantial cash incentives [5]
- There are opportunities for collaboration on entrepreneurial projects [5]
- Interested parties are encouraged to contact the company via WeChat for further inquiries [6]
A 4,000-Member Autonomous Driving Community Opens Enrollment for the New Semester!
自动驾驶之心· 2025-09-02 03:14
Core Viewpoint
- The article emphasizes the establishment of a comprehensive community focused on autonomous driving technology, aiming to provide valuable resources and networking opportunities for both beginners and advanced learners in the field [1][3][12]

Group 1: Community Structure and Offerings
- The community covers nearly 40 cutting-edge technology directions in autonomous driving, including multimodal large models, VLM, VLA, closed-loop simulation, world models, and sensor fusion [1][3]
- Members come from leading autonomous driving companies, top academic laboratories, and traditional robotics firms, creating a complementary dynamic between industry and academia [1][12]
- The community has over 4,000 members and aims to grow to nearly 10,000 within two years, serving as a hub for technical sharing and communication [3][12]

Group 2: Learning and Development Resources
- The community provides a variety of resources, including video content, articles, learning paths, and Q&A sessions, to assist members in their learning journey [3][12]
- Nearly 40 technical routes have been organized for members, covering aspects of autonomous driving from entry-level to advanced topics [3][12]
- Members can access practical answers to common questions, such as how to start with end-to-end autonomous driving and the learning paths for multimodal large models [3][12]

Group 3: Networking and Career Opportunities
- The community facilitates job referrals and connections with various autonomous driving companies, enhancing members' employment opportunities [8][12]
- Regular discussions with industry leaders and experts explore trends, technological directions, and challenges in mass production [4][12]
- Members are encouraged to discuss academic and engineering questions with one another, fostering a collaborative environment [12][54]

Group 4: Technical Focus Areas
- The community has compiled extensive resources on technical areas including 3DGS, NeRF, world models, and VLA, providing insights into the latest research and applications [12][27][31]
- Specific learning paths are available for different aspects of autonomous driving, such as perception, simulation, and planning control [12][13]
- A detailed overview of open-source projects and datasets relevant to autonomous driving aids members in practical applications [24][25]
ICLR 2025 | SmODE: An Ordinary-Differential-Equation Neural Network for Generating Smooth Control Actions
自动驾驶之心· 2025-09-01 23:32
Core Viewpoint
- The research team led by Professor Li Shengbo at Tsinghua University has developed a novel smoothing neural network, SmODE, which uses ordinary differential equations (ODEs) to improve the smoothness of control actions in reinforcement learning tasks, thereby improving the usability and safety of intelligent systems [4][23]

Background
- Deep reinforcement learning (DRL) has proven effective for optimal control in applications such as drone control and autonomous driving; however, the smoothness of control actions remains a significant challenge due to high-frequency noise and unregulated Lipschitz constants in neural networks [5][19]

Key Technologies of SmODE
- Smoothing ODE design: the team designed an ODE-based smoothing neuron that adaptively filters high-frequency noise while controlling the Lipschitz constant, enhancing control-system performance [8][9]
- Smoothing network structure: SmODE can be integrated into various reinforcement learning frameworks and consists of an input module, a smoothing ODE module, and an output module, adjustable to task complexity [14][16]
- Reinforcement learning algorithm based on SmODE: SmODE combines readily with existing DRL algorithms, adding loss terms that regulate the time constant and Lipschitz constant during training [16][17]

Experimental Results
- With Gaussian noise variance set at 0.05, SmODE demonstrated significantly lower action volatility than traditional MLP networks, improving vehicle comfort and safety in tasks such as sine-curve tracking and lane changing [19][21]
- On the MuJoCo benchmarks, SmODE outperformed LTC, LipsNet, and MLP networks in average action smoothness across tasks, indicating its effectiveness in real-world applications [21][22]

Conclusion
- The SmODE network effectively addresses oscillation in action outputs within deep reinforcement learning, offering a new approach to improving the performance and stability of intelligent systems in real-world applications [23]
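The core idea of an ODE-based smoothing neuron can be illustrated with a minimal sketch. This is not SmODE's exact formulation (the paper's neuron, time-constant parameterization, and Lipschitz regularization are more involved); it is a toy first-order ODE `dx/dt = (u - x) / tau`, integrated with Euler steps, showing how such a state acts as a low-pass filter that suppresses high-frequency action noise. The noise variance of 0.05 matches the experimental setting mentioned above; `tau` and `dt` are illustrative choices.

```python
import numpy as np

def smoothing_ode_step(x, u, tau, dt=0.02):
    """One Euler step of a first-order smoothing ODE neuron:
    dx/dt = (u - x) / tau.  Larger tau means stronger smoothing
    (and a smaller effective input-to-state gain per step)."""
    return x + dt * (u - x) / tau

# Noisy control command: constant target plus Gaussian noise (var = 0.05)
rng = np.random.default_rng(0)
u = 1.0 + rng.normal(0.0, np.sqrt(0.05), size=500)

x = 0.0
smoothed = []
for u_t in u:
    x = smoothing_ode_step(x, u_t, tau=0.2)
    smoothed.append(x)
smoothed = np.array(smoothed)

# Step-to-step fluctuation of the filtered actions is far smaller
# than that of the raw noisy commands
print(np.var(np.diff(smoothed)) < np.var(np.diff(u)))
```

In a trained network, `tau` would be a learned (state-dependent) quantity regulated by the auxiliary loss terms described above, rather than a fixed constant.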
An Ultra-Cost-Effective 3D Scanner! Point-Cloud/Visual Full-Scene Reconstruction with Centimeter-Level Accuracy
自动驾驶之心· 2025-09-01 23:32
The GeoScan S1 ships with a handheld Ubuntu system and multiple sensors. The handle integrates the power supply, which outputs through a D-TAP to XT30 female connector to the GeoScan S1 body, powering the LiDAR, cameras, and main control board.

A look at the base model's reconstruction results! The most cost-effective 3D laser scanner: an ultra-cost-effective 3D scanner for industrial and research/teaching scenarios has arrived. GeoScan S1 is currently the most cost-effective real-scene 3D laser scanner in China: lightweight design, one-button start, and an efficient, practical 3D solution. Built around multi-modal sensor-fusion algorithms, it achieves real-time 3D scene reconstruction with centimeter-level accuracy and can be applied across many fields.

It maps at 200,000 points per second, with a 70 m measurement range and 360° full coverage, and supports large scenes of over 200,000 m²; an optional 3D Gaussian data-collection module enables high-fidelity scene restoration. It supports cross-platform integration and provides a high-bandwidth Ethernet port and dual USB 3.0 interfaces, giving research experiments flexible room to expand, lowering the development threshold, and helping developers quickly build R&D capability.

- Low barrier to use: simple, intuitive operation; one-button start to begin scanning
- Export-and-use results: no complex deployment or tedious processing; exported scan results are ready to use
- Efficient, high-accuracy mapping: high model accuracy; scan large scenes easily while walking
- Best price in the industry: high cost-performance with highly integrated multi-sensor design

Scroll down~ Big news! 3DG ...
The Post-End-to-End Era: Must We Find a New Road?
自动驾驶之心· 2025-09-01 23:32
Core Viewpoint
- The article discusses the evolution of autonomous driving technology, particularly the transition from end-to-end systems to Vision-Language-Action (VLA) models, highlighting the differing approaches and perspectives within the industry [6][32][34]

Group 1: VLA and Its Implications
- VLA, the Vision-Language-Action model, aims to integrate visual perception and natural language processing to enhance decision-making in autonomous driving systems [9][10]
- The VLA model attempts to map human driving instincts into interpretable language commands, which are then converted into machine actions, potentially offering both strong integration and improved explainability [10][19]
- Companies like Wayve are leading the exploration of VLA; their LINGO series demonstrates combining natural language with driving actions, allowing real-time interaction and explanations of driving decisions [12][18]

Group 2: Industry Perspectives and Divergence
- The current landscape is characterized by a divergence in approaches: some teams embrace VLA while others remain skeptical, preferring traditional Vision-Action (VA) models [5][6][19]
- Major players like Huawei and Horizon have expressed reservations about VLA, opting instead to refine existing VA models, which they believe can achieve effective results without the complexities introduced by language processing [5][21][25]
- Skepticism about VLA stems from concerns over the ambiguity and imprecision of natural language in driving contexts, which can hinder real-time decision-making [19][21][23]

Group 3: Technical Challenges and Considerations
- VLA models face significant technical challenges, including high computational demands and potential latency, which are critical in scenarios requiring immediate responses [21][22]
- Integrating language processing into driving systems may introduce noise and ambiguity, complicating the training and operational phases of VLA models [19][23]
- Companies are exploring mitigation strategies, such as increasing computational power or refining data collection so that language inputs align effectively with driving actions [22][34]

Group 4: Future Directions and Industry Outlook
- The future of autonomous driving may not rely solely on new technologies like VLA but also on improving existing systems and methodologies to ensure stability and reliability [34]
- As the industry evolves, companies must decide whether to pursue innovative paths with VLA or to solidify their existing frameworks, each offering unique opportunities and challenges [34]
A 10,000-Word Summary of End-to-End Autonomous Driving: Breaking Down Three Technical Routes (UniAD/GenAD/Hydra-MDP)
自动驾驶之心· 2025-09-01 23:32
Core Viewpoint
- The article discusses the current state of end-to-end autonomous driving algorithms, comparing them with traditional algorithms and highlighting their advantages and limitations [3][5][6]

Group 1: Traditional vs. End-to-End Algorithms
- Traditional autonomous driving algorithms follow a pipeline of perception, prediction, and planning, where each module has distinct inputs and outputs [5][6]
- The perception module takes sensor data as input and outputs bounding boxes to the prediction module, which in turn outputs trajectories to the planning module [6]
- End-to-end algorithms, by contrast, take raw sensor data as input and directly output path points, simplifying the process and reducing error accumulation [6][10]

Group 2: Limitations of End-to-End Algorithms
- End-to-end algorithms face challenges such as lack of interpretability, absence of safety guarantees, and causal confusion [12][57]
- Reliance on imitation learning limits their ability to handle corner cases effectively, as rare scenarios may be misinterpreted as noise [11][57]
- Inherent noise in ground-truth data can lead to suboptimal learning outcomes, since human driving data may not represent the best possible actions [11][57]

Group 3: Current End-to-End Algorithm Implementations
- The ST-P3 algorithm is highlighted as an early example of end-to-end autonomous driving, focusing on spatiotemporal learning with three core modules: perception, prediction, and planning [14][15]
- Innovations in ST-P3 include an ego-centric cumulative alignment technique in perception, a dual-path prediction mechanism, and a planning module that incorporates prior information for trajectory optimization [15][19][20]

Group 4: Advanced Techniques in End-to-End Algorithms
- The UniAD framework introduces a multi-task approach, incorporating five auxiliary tasks to enhance performance and address the limitations of traditional modular stacking [24][25]
- It employs a full Transformer architecture for planning, integrating various interaction modules to improve trajectory prediction and planning accuracy [26][29]
- The VAD (Vectorized Autonomous Driving) method uses vectorized representations to better express the structural information of map elements, improving computational speed and efficiency [32][33]

Group 5: Future Directions and Challenges
- Further research is needed to overcome the limitations of current end-to-end algorithms, particularly in optimizing learning processes and handling exceptional cases [57]
- Multi-modal planning and multi-model learning approaches aim to improve trajectory-prediction stability and performance [56][57]
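The error-accumulation argument above can be made concrete with a toy numerical sketch (purely illustrative; the stages and noise levels are hypothetical, not taken from any of the cited systems). Each hand-designed interface in a modular pipeline is modeled as adding an independent error term, so three chained stages accumulate roughly sqrt(3) times the error of a single direct mapping.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.zeros(1000)  # idealized ground-truth signal

# Toy model: each module interface contributes independent error
def stage(x, sigma=0.1):
    return x + rng.normal(0.0, sigma, size=x.shape)

# Modular pipeline: perception -> prediction -> planning,
# errors compounding across the three explicit interfaces
modular_out = stage(stage(stage(truth)))

# End-to-end: one learned mapping, a single error term
e2e_out = stage(truth)

# The chained pipeline's output deviates more from ground truth
print(np.std(modular_out) > np.std(e2e_out))
```

This is of course only one side of the trade-off: the same intermediate interfaces that accumulate error are also what give the modular stack its interpretability and per-module safety checks, which is exactly the limitation of end-to-end systems noted in Group 2.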
Mastering Multimodality! A 1-on-6 Small-Class Course on Multi-Sensor Fusion Perception for Autonomous Driving Is Here
自动驾驶之心· 2025-09-01 09:28
With the rapid development of autonomous driving, robot navigation, and intelligent surveillance, the perception capability of any single sensor (camera, LiDAR, or millimeter-wave radar) can no longer meet the demands of complex scenes.

To overcome this bottleneck, researchers fuse data from LiDAR, millimeter-wave radar, and cameras to build a more complete, more robust environment-perception system. The core idea of fusion is complementary strengths. Cameras provide rich semantic information and texture detail, which is essential for recognizing lane lines and traffic signs; LiDAR generates high-precision 3D point clouds with accurate distance and depth information, performing especially well at night or in low light; and millimeter-wave radar penetrates adverse weather (rain, fog, snow), stably measures object speed and distance, and is relatively inexpensive. By fusing these sensors, a system can perceive reliably in all weather and all scenarios, markedly improving the robustness and safety of autonomous driving.

Current multimodal fusion perception is evolving from traditional fusion schemes toward deeper end-to-end fusion and Transformer-based architectures. Traditional fusion comes in three main forms: early fusion concatenates raw data directly at the input, at enormous computational cost; mid-level fusion first runs preliminary feature extraction on each sensor's data and then fuses the per-modality feature vectors, which ...
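The mid-level (feature-level) fusion scheme described above can be sketched minimally. The per-modality encoders here are crude hand-written stand-ins for real backbones (e.g., a CNN for the camera and a point-cloud network for the LiDAR), and the feature dimensions are arbitrary; the point is only the structure: extract features per modality first, then concatenate them into one joint vector for a shared downstream head.

```python
import numpy as np

# Hypothetical per-modality encoders (stand-ins for real backbones):
def encode_camera(img):     # semantic/texture features
    return img.mean(axis=(0, 1))                  # shape (3,): mean RGB

def encode_lidar(points):   # geometric/depth features
    return np.array([points[:, 2].min(),          # lowest z
                     points[:, 2].max(),          # highest z
                     len(points) / 1e4])          # density proxy

def encode_radar(dets):     # range/velocity features
    return dets.mean(axis=0)                      # shape (2,)

def mid_level_fusion(img, points, dets):
    """Mid-level fusion: encode each modality separately, then
    concatenate the feature vectors -- far cheaper than fusing
    raw inputs (early fusion), richer than fusing final outputs."""
    return np.concatenate([encode_camera(img),
                           encode_lidar(points),
                           encode_radar(dets)])

img = np.zeros((4, 4, 3))                              # dummy camera frame
points = np.random.default_rng(0).normal(size=(100, 3))  # dummy point cloud
dets = np.array([[30.0, 5.0], [55.0, -2.0]])           # [range, radial speed]

fused = mid_level_fusion(img, points, dets)
print(fused.shape)  # → (8,)
```

In a real system the concatenated vector would feed a learned detection or prediction head; Transformer-based approaches instead replace the plain concatenation with cross-attention between modality tokens.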
Grad School Starts, and My Advisor's Questions Left Me Stumped...
自动驾驶之心· 2025-09-01 03:17
Core Insights
- The article emphasizes the establishment of a comprehensive community focused on autonomous driving and robotics, aiming to connect learners and professionals in the field [1][14]
- The community, "Autonomous Driving Heart Knowledge Planet," has over 4,000 members and aims to grow to nearly 10,000 in two years, providing resources for both beginners and advanced learners [1][14]
- Various technical learning paths and resources are available, including over 40 technical routes and numerous Q&A sessions with industry experts [3][5]

Summary by Sections

Community and Resources
- The community blends video, text, learning paths, and Q&A, making it a comprehensive platform for knowledge sharing [1][14]
- Members can access a wealth of information on topics such as end-to-end autonomous driving, multimodal large models, and data-annotation practices [3][14]
- A job-referral mechanism established with multiple autonomous driving companies connects job seekers and employers [10][14]

Learning Paths and Technical Focus
- Nearly 40 technical directions in autonomous driving are organized, covering areas such as perception, simulation, and planning control [5][14]
- Specific learning routes are provided for beginners, including full-stack courses suitable for those with no prior experience [8][10]
- Advanced topics include world models, reinforcement learning, and the integration of various sensor technologies [4][34][46]

Industry Engagement and Expert Interaction
- The community regularly invites industry leaders for discussions on the latest trends and challenges in autonomous driving [4][63]
- Members can discuss career choices, research directions, and technical challenges, fostering a collaborative environment [60][64]
- The platform aims to bridge academic research and industrial application, keeping members updated on both fronts [14][65]
Musk's Hot Take: LiDAR and Millimeter-Wave Radar Do Nothing for Autonomous Driving but Get in the Way...
自动驾驶之心· 2025-08-31 23:33
Core Viewpoint
- The article discusses the ongoing debate between LiDAR-based and pure-vision systems in autonomous driving, highlighting the differing perspectives of industry leaders such as Uber's CEO Dara Khosrowshahi and Tesla's Elon Musk on the safety and effectiveness of these technologies [1][2][6]

Group 1: Industry Perspectives
- Uber's CEO supports LiDAR for its lower cost and higher safety, while Musk criticizes it, claiming that sensor competition reduces safety [1][2]
- Baidu, a significant player in the autonomous driving sector, advocates for LiDAR, asserting that it ensures driving safety and has cost advantages [2][14]
- The industry is divided: Waymo and Baidu favor multi-sensor fusion (including LiDAR), while Tesla sticks to a pure-vision approach [6][11]

Group 2: Technical Analysis
- Tesla's transition to a pure-vision system was driven by cost considerations and the belief that AI can surpass human driving capabilities using camera data alone [8][9]
- Waymo employs a multi-modal approach, integrating LiDAR, radar, and cameras, achieving L4-level autonomous driving and expanding its services in complex urban environments [11][12]
- Baidu's autonomous driving service, 萝卜快跑, uses a multi-sensor fusion strategy combining LiDAR, cameras, and radar to achieve L4 capabilities, with a strong safety record [14][16]

Group 3: Performance Comparison
- LiDAR provides high-precision 3D environmental perception unaffected by lighting conditions, while pure-vision systems struggle in adverse weather and lighting [48][49]
- LiDAR's advantages include distance-measurement accuracy, environmental adaptability, and reliable identification of static objects, in contrast to the limitations of pure-vision systems [50][51]
- LiDAR's ability to maintain performance in extreme conditions, such as heavy rain or fog, is highlighted as a critical safety feature for autonomous vehicles [34][36]

Group 4: Market Trends and Regulations
- The decreasing cost of LiDAR is making it accessible for widespread adoption in high-end vehicles, with major market players integrating it into their models [25][42]
- Regulatory frameworks increasingly favor LiDAR in autonomous vehicles, with new standards requiring advanced sensing capabilities that LiDAR can provide [55][56]
- The collaboration between Baidu's 萝卜快跑 and Uber to deploy autonomous vehicles globally indicates growing acceptance of multi-sensor fusion solutions in the market [18]
End-to-End Without a Data Closed Loop Is Only Half-Finished!
自动驾驶之心· 2025-08-31 23:33
Core Viewpoint
- The article emphasizes the increasing investment in automated labeling by autonomous driving companies, highlighting the challenges and requirements of end-to-end automated labeling for intelligent driving [1][2]

Group 1: Challenges in Automated Labeling
- The main challenges of 4D automated labeling include high spatial-temporal consistency requirements, complex multi-modal data fusion, difficulty generalizing to dynamic scenes, the tension between labeling efficiency and cost, and high scene-generalization requirements for mass production [2][3]

Group 2: Course Overview
- The course offers a comprehensive tutorial on the entire 4D automated-labeling pipeline, covering dynamic obstacle detection, SLAM reconstruction, static element labeling, and end-to-end ground-truth generation [3][4][6]
- It includes practical exercises to strengthen algorithm skills and addresses real-world engineering challenges [2][3]

Group 3: Detailed Course Structure
- Chapter 1 introduces the basics of 4D automated labeling, its applications, and the necessary data and algorithms [4]
- Chapter 2 covers the dynamic obstacle labeling process, including offline 3D target-detection algorithms and solutions to common engineering issues [6]
- Chapter 3 discusses laser and visual SLAM reconstruction, explaining its importance and common algorithms [7]
- Chapter 4 addresses static element labeling based on reconstruction outputs [9]
- Chapter 5 covers general-obstacle OCC labeling, detailing input-output requirements and optimization techniques [10]
- Chapter 6 is dedicated to end-to-end ground-truth generation, integrating the various elements into a cohesive process [12]
- Chapter 7 provides insights into data scaling laws, industry pain points, and interview preparation for relevant positions [14]
Group 4: Target Audience and Prerequisites
- The course suits researchers, students, and professionals looking to transition into the data closed-loop field, and requires a foundational understanding of deep learning and autonomous driving perception algorithms [19][23]
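The spatial-temporal consistency requirement mentioned above is the crux of 4D (3D + time) labeling, and a minimal sketch can show why. This is a hypothetical, heavily simplified illustration (2D positions, known ego poses, no detection noise): per-frame detections expressed in the moving ego frame appear to drift, and only after transforming them into a common world frame with the ego pose does a static obstacle yield the single consistent label a 4D track requires.

```python
import numpy as np

def ego_to_world(center_ego, ego_xy, ego_yaw):
    """Transform a 2D detection center from the ego frame to the
    world frame using the ego pose (rotation by yaw + translation)."""
    c, s = np.cos(ego_yaw), np.sin(ego_yaw)
    R = np.array([[c, -s], [s, c]])
    return R @ center_ego + ego_xy

# Ego vehicle drives 1 m forward per frame (heading unchanged)
ego_poses = [(np.array([0.0, 0.0]), 0.0),
             (np.array([1.0, 0.0]), 0.0),
             (np.array([2.0, 0.0]), 0.0)]

# The same static obstacle, detected per frame in ego coordinates:
# it *appears* to move backwards as the ego vehicle advances
det_ego = [np.array([5.0, 2.0]),
           np.array([4.0, 2.0]),
           np.array([3.0, 2.0])]

world = [ego_to_world(d, xy, yaw)
         for d, (xy, yaw) in zip(det_ego, ego_poses)]

# In the world frame all three detections coincide -- the temporally
# consistent position a 4D auto-label would assign to this obstacle
print(all(np.allclose(w, world[0]) for w in world))
```

Real pipelines must additionally estimate the ego poses themselves (the SLAM reconstruction step in Chapter 3), handle full 3D boxes with orientation, and associate detections of moving objects across frames, which is where the consistency challenge becomes genuinely hard.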