自动驾驶之心
We are planning a 10,000-member autonomous driving & embodied intelligence technology community~
自动驾驶之心· 2025-06-25 09:54
Core Viewpoint
- The article announces the establishment of a comprehensive community for autonomous driving and embodied intelligence, aiming to gather industry professionals and enable rapid problem-solving and knowledge sharing within the sector [2][4].

Group 1: Community Development
- The goal is to build a 10,000-member community focused on intelligent driving and embodied intelligence within three years, welcoming contributions from talented individuals [2].
- The community will serve as a bridge connecting academia, products, and recruitment, forming a closed loop across teaching and research [2][4].
- It will provide the latest industry technology updates, technical discussions, and job-sharing opportunities [2][3].

Group 2: Knowledge Sharing and Resources
- The "Autonomous Driving Heart Knowledge Planet" is designed as a technical exchange platform for academic and engineering questions, attracting students and professionals from top universities and companies [4][11].
- The community has established recruitment connections with numerous companies, including Xiaomi, Horizon, and NIO, enabling direct resume submissions [4][11].
- Members have access to learning modules from basic to advanced, covering algorithm explanations and code implementations [4][11].

Group 3: Technical Focus Areas
- By 2025, the focus will be on advanced areas such as vision-language models (VLM), end-to-end trajectory prediction, and 3D generative simulation [6][10].
- The community has developed over 30 learning pathways covering subfields of autonomous driving, including perception, mapping, and AI model deployment [11][16].
- Regular live sessions will feature top researchers and industry experts discussing practical applications and research advances in autonomous driving [18][19].

Group 4: Engagement and Interaction
- The community encourages active participation, with weekly engagement metrics ranking among the top 20 in the country, fostering a collaborative learning environment [12].
- Members can freely ask questions and join discussions, enhancing their learning experience and networking opportunities [11][12].
- The platform offers member-exclusive benefits, including access to academic advances, expert Q&A, and discounts on paid courses [14].
How is a SOTA end-to-end algorithm designed? Technical sharing from the top 3 of the CVPR'25 WOD vision-only end-to-end challenge~
自动驾驶之心· 2025-06-25 09:54
Core Insights
- The article reviews the results of the 2025 Waymo Open Dataset End-to-End Driving Challenge, highlighting advances in end-to-end autonomous driving systems and the shift toward training models on large-scale public datasets [2][18].

Group 1: Competition Results
- The champion was the EPFL team, which combined the DiffusionDrive model, nuPlan data, and an ensembling strategy [1].
- The runner-up was a collaboration between the Nvidia and Tübingen teams, which also built on DiffusionDrive and SmartRefine and employed multiple datasets, demonstrating the importance of training-data quality [1][22].
- Third place went to Hanyang University of South Korea, which used a simplified architecture relying only on front-view input and vehicle state [1][3].

Group 2: Methodology
- The UniPlan framework leverages large-scale public driving datasets to improve generalization in rare long-tail scenarios, achieving competitive results without relying on expensive multimodal large language models [3][18].
- The model architecture is based on DiffusionDrive, which employs a truncated diffusion strategy for efficient and diverse trajectory generation [4][6].
- The diffusion decoder uses a cross-attention mechanism to refine trajectory predictions against scene context [5][6].

Group 3: Data Processing
- The nuPlan dataset was processed with a sliding-window approach to create a diverse training set of 90,000 samples [7].
- A similar filtering strategy applied to the WOD-E2E dataset yielded 35,000 training samples and 10,000 validation samples [8].
- The model was trained on four H100 GPUs, achieving high training efficiency [10].

Group 4: Experimental Results
- Performance was evaluated with the Rater Feedback Score (RFS) and Average Displacement Error (ADE) across various configurations [12][17].
- Combined training on the WOD-E2E and nuPlan datasets yielded slight improvements in average RFS, particularly in long-tail categories [23].
- The analysis showed that while additional datasets generally help, the quality of the data sources matters more than quantity [39].

Group 5: Conclusion
- The article underscores the potential of data-centric approaches for improving the robustness of autonomous driving systems, as demonstrated by the competitive results of the UniPlan framework [18][39].
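The ADE metric used above is simply the mean Euclidean distance between predicted and ground-truth waypoints. A minimal sketch, with made-up toy trajectories rather than competition data:

```python
import numpy as np

def average_displacement_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """ADE: mean L2 distance between predicted and ground-truth
    waypoints over all timesteps. Both arrays have shape (T, 2)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy example: straight ground-truth path, prediction offset
# by 0.3 m laterally at every timestep.
gt = np.stack([np.arange(5, dtype=float), np.zeros(5)], axis=-1)
pred = gt + np.array([0.0, 0.3])
print(average_displacement_error(pred, gt))  # ~0.3
```

RFS, by contrast, is a human-rater-derived score specific to WOD-E2E and has no such closed-form definition.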
The RoboSense 2025 Robot Perception Challenge officially launches! Autonomous driving & embodied intelligence tracks~
自动驾驶之心· 2025-06-25 09:54
Core Viewpoint
- The RoboSense Challenge 2025 aims to systematically evaluate the perception and understanding capabilities of robots in real-world scenarios, addressing key challenges in the stability, robustness, and generalization of perception systems [2][43].

Group 1: Challenge Overview
- The challenge comprises five major tracks focused on real-world tasks: language-driven autonomous driving, social navigation, sensor placement optimization, cross-modal drone navigation, and cross-platform 3D object detection [8][9][29][35].
- The event is co-hosted by several prestigious institutions and will be officially recognized at the IROS 2025 conference in Hangzhou, China [5][43].

Group 2: Task Details
- **Language-Driven Autonomous Driving**: evaluates the ability of robots to understand and act on natural-language commands, aiming for a deep coupling of language, perception, and planning [10][11].
- **Social Navigation**: focuses on robots navigating spaces shared with humans, emphasizing social compliance and safety [17][18].
- **Sensor Placement Optimization**: assesses the robustness of perception models under varying sensor configurations, crucial for reliable deployment in autonomous systems [23][24].
- **Cross-Modal Drone Navigation**: trains models to retrieve aerial images from natural-language descriptions, improving the efficiency of urban inspection and disaster response [29][30].
- **Cross-Platform 3D Object Detection**: develops models that maintain high performance across different robotic platforms without extensive retraining [35][36].

Group 3: Evaluation and Performance Metrics
- Each task defines specific performance metrics and baseline models, with detailed requirements for training and evaluation [16][21][28][42].
- The challenge encourages innovative solutions and offers a prize pool of up to $10,000, shared across the five tracks [42].

Group 4: Timeline and Participation
- The challenge officially starts on June 15, 2025, with key submission and evaluation deadlines leading up to the award ceremony on October 19, 2025 [4][42].
- Participants are encouraged to join this global initiative to advance robotic perception technologies [43].
Black Warrior! A research- and teaching-grade full-stack autonomous driving vehicle has arrived~
自动驾驶之心· 2025-06-25 09:48
Core Viewpoint
- The article introduces the launch of the "Black Warrior 001," a lightweight autonomous driving platform aimed at education and research, highlighting its functionality across a range of scenarios [3][6].

Group 1: Product Overview
- The "Black Warrior 001" supports perception, localization, fusion, navigation, and planning, built on an Ackermann-steering chassis [3].
- The product is currently available for pre-sale at a discounted price, with a deposit option to secure orders [2].

Group 2: Functional Demonstrations
- The vehicle has been tested in indoor, outdoor, and parking environments, demonstrating its perception, localization, fusion, navigation, and planning capabilities [4].
- Target uses include undergraduate coursework, graduate research, and teaching in academic laboratories and vocational training institutions [6].

Group 3: Hardware Specifications
- 3D LiDAR: Mid 360
- 2D LiDAR: from Lidar Technology
- Depth camera: from Orbbec, with built-in IMU
- Main control chip: Nvidia Orin NX 16G
- Display: 1080p [10]
- The vehicle weighs 30 kg, has 50W battery power, and a maximum speed of 2 m/s [12].

Group 4: Software and Functionality
- The software stack includes ROS, C++, and Python, supports one-click startup, and ships with a development environment [14].
- Supported functionality includes 2D and 3D SLAM, vehicle navigation, and obstacle avoidance [15].

Group 5: After-Sales and Maintenance
- The product includes one year of after-sales support (excluding man-made damage); during the warranty period, damage caused by operational errors or code modifications is repaired free of charge [37].
Why does failing at 4D auto-annotation mean failing at mass-produced intelligent driving?
自动驾驶之心· 2025-06-25 09:48
Core Viewpoint
- The article emphasizes the importance of efficient automatic 4D data annotation in developing intelligent driving algorithms, highlighting the challenges and solutions for producing high-quality annotations of dynamic and static elements in autonomous driving systems [2][6].

Summary by Sections

4D Data Annotation Process
- The article outlines the complexity of automatically annotating dynamic obstacles, which involves multiple modules and requires high-quality data processing to improve 3D detection performance [2][4].
- Offline single-frame 3D detection results must be linked through tracking, while handling issues such as sensor occlusion and post-processing optimization [4].

Challenges in Automatic Annotation
- High spatiotemporal consistency is crucial: dynamic targets must be tracked precisely across frames to avoid annotation breaks caused by occlusions or interactions [6].
- Multi-modal data fusion is complex, requiring synchronization of data from sensors such as LiDAR and cameras, along with coordinate alignment and semantic unification [6].
- Dynamic scenes are difficult to generalize: unpredictable behaviors of traffic participants and environmental factors pose significant challenges to annotation models [6].
- Annotation efficiency conflicts with cost: high-precision 4D automatic annotation still relies on manual verification, leading to long cycles and high costs [6].

Educational Course on 4D Annotation
- The article promotes a course designed to lower the barrier to entering 4D automatic annotation, covering the entire pipeline and its core algorithms [7][8].
- The course provides practical training in dynamic obstacle detection, SLAM reconstruction, static element annotation, and end-to-end ground-truth generation [10][11][13][15].
- It emphasizes real-world applications and hands-on practice to strengthen algorithm skills [7][22].

Course Structure and Target Audience
- The course is organized into chapters covering different aspects of 4D automatic annotation, including foundational knowledge, dynamic obstacle annotation, and data closed-loop topics [8][10][12][16].
- It targets individuals with a background in deep learning and autonomous driving perception algorithms, including students, researchers, and professionals transitioning into the field [21][23].
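The tracking step described above — linking offline per-frame 3D detections into consistent tracks — can be sketched with a greedy nearest-neighbor association. The detections, distance threshold, and data layout below are invented for illustration, not the course's actual pipeline:

```python
import numpy as np

def link_detections(frames, max_dist=2.0):
    """Greedy nearest-neighbor linking of per-frame 3D detection
    centers into track IDs. `frames` is a list of (N_i, 3) arrays of
    box centers; returns a parallel list of int track-ID arrays."""
    next_id = 0
    prev_centers, prev_ids = np.empty((0, 3)), np.empty(0, dtype=int)
    all_ids = []
    for centers in frames:
        ids = np.full(len(centers), -1, dtype=int)
        used = set()
        for i, c in enumerate(centers):
            if len(prev_centers):
                d = np.linalg.norm(prev_centers - c, axis=1)
                j = int(np.argmin(d))
                if d[j] < max_dist and j not in used:
                    ids[i] = prev_ids[j]   # continue an existing track
                    used.add(j)
            if ids[i] == -1:               # start a new track
                ids[i] = next_id
                next_id += 1
        all_ids.append(ids)
        prev_centers, prev_ids = centers, ids
    return all_ids

# Two toy frames: the object near (10, 0, 0) keeps its track ID;
# the far-away detection in frame 1 opens a new track.
f0 = np.array([[10.0, 0.0, 0.0], [20.0, 5.0, 0.0]])
f1 = np.array([[10.5, 0.2, 0.0], [40.0, 0.0, 0.0]])
ids = link_detections([f0, f1])
```

Production annotation pipelines would add motion models, occlusion handling, and bidirectional (offline) smoothing on top of this association skeleton, which is exactly where the manual-verification cost discussed above comes from.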
A roundup of high-frequency BEV interview questions! (vision-only & multimodal fusion algorithms)
自动驾驶之心· 2025-06-25 02:30
Core Viewpoint
- The article discusses the rapid advances in BEV (Bird's Eye View) perception technology, highlighting its significance to the autonomous driving industry and the companies investing in its development [2].

Group 1: BEV Perception Technology
- BEV perception has become a competitive area of visual perception, with models such as BEVDet, PETR, and InternBEV gaining traction since the introduction of BEVFormer [2].
- The technology is being put into production by companies such as Horizon, WeRide, XPeng, BYD, and Haomo, indicating a shift toward practical applications in autonomous driving [2].

Group 2: Technical Insights
- In BEVFormer, the temporal and spatial self-attention modules use BEV queries, with keys and values derived from historical BEV information and image features [3].
- The grid_sample warp in BEVDet4D transforms coordinates based on camera parameters and a predefined BEV grid, mapping pixels from 2D images into BEV space [3].

Group 3: Algorithm and Performance
- Lightweight BEV algorithms such as fast-bev and the TRT versions of BEVDet and BEVDepth are noted for in-vehicle deployment [5].
- The physical space covered by a BEV matrix is typically around 50 meters, with vision-only solutions achieving stable performance up to this distance [6].

Group 4: Community and Collaboration
- The article also mentions a knowledge-sharing platform for the autonomous driving industry, aimed at fostering technical exchange among students and professionals from prestigious universities and companies [8].
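The grid_sample warp mentioned above can be illustrated with PyTorch. The feature sizes and the sampling grid below are toy values; in BEVDet4D the grid would be precomputed from camera intrinsics/extrinsics rather than drawn at random, with locations normalized to [-1, 1] as `grid_sample` expects:

```python
import torch
import torch.nn.functional as F

# Toy image feature map: batch 1, 8 channels, 16x16 spatial.
img_feat = torch.randn(1, 8, 16, 16)

# Stand-in for a precomputed BEV sampling grid: for each of the
# 32x32 BEV cells, an (x, y) location in the image feature map,
# normalized to [-1, 1].
bev_grid = torch.rand(1, 32, 32, 2) * 2 - 1

# Warp image features into BEV space via bilinear sampling.
bev_feat = F.grid_sample(img_feat, bev_grid, mode="bilinear",
                         align_corners=False)
print(bev_feat.shape)  # torch.Size([1, 8, 32, 32])
```

Because the grid is fixed once the camera setup is known, this warp is a cheap, differentiable lookup — one reason it appears so often in interview questions about BEV view transformation.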
Why does a single paper consume an entire graduate career?
自动驾驶之心· 2025-06-25 02:30
Which conferences and journals can we coach for?

We have received many requests from students for help with publishing. Some cannot earn a master's degree without a zone-3 journal paper; a PhD cannot graduate without three CCF-A papers; and advisors unfamiliar with a new direction cannot support the work. Students rack their brains over topic selection, hit bottlenecks in experiment design, struggle with muddled writing logic, and are rejected submission after submission! In cutting-edge, complex fields like autonomous driving, embodied intelligence, and robotics, it can feel truly overwhelming.

A paper often takes 1-2 years from preparation to publication; for a master's student, that spans essentially the entire academic career. Wrong methods, detours, and lack of guidance consume the most time! Publishing is hard, but not hopeless: with an experienced mentor leading the way, several papers a year is common. After long preparation, our paper-coaching service is officially launched, covering autonomous driving, embodied intelligence, and robotics.

Who are we? The largest AI technology self-media platform in China, whose IPs include 自动驾驶之心, 具身智能之心, and 3D视觉之心, with top domestic academic resources. We have worked in autonomous driving, embodied intelligence, and robotics for many years, deeply understand the challenges and opportunities of these interdisciplinary fields, and know how important a high-quality paper is to a student's (especially a graduate student's) studies and future. We have 300+ mentors dedicated to autonomous driving/embodied intelligence. ...
Latest from Mu Yao's team! RoboTwin 2.0: a scalable data benchmark for robust bimanual manipulation
自动驾驶之心· 2025-06-24 12:41
Core Insights
- The article discusses RoboTwin 2.0, a scalable data generation framework that enhances bimanual robotic manipulation through robust domain randomization and automated expert data generation [2][6][18].

Group 1: Motivation and Challenges
- Existing synthetic datasets for bimanual robotic manipulation are insufficient, lacking efficient data-generation methods for new tasks and relying on overly simplified simulation environments [2][5].
- RoboTwin 2.0 addresses these challenges with a scalable simulation framework that supports automatic, large-scale generation of diverse and realistic data [2][6].

Group 2: Key Components of RoboTwin 2.0
- RoboTwin 2.0 integrates three key components: an automated expert data generation pipeline, comprehensive domain randomization, and entity-aware adaptation for diverse robotic platforms [6][18].
- The automated expert data generation pipeline uses multimodal large language models (MLLMs) and simulation feedback to iteratively refine task-execution code [10][12].

Group 3: Domain Randomization
- Domain randomization is applied across five dimensions: clutter, background texture, lighting conditions, desktop height, and diverse language instructions, improving the robustness of policies to environmental variability [12][13].
- The framework provides a large object library (RoboTwin-OD) with 731 instances across 147 categories, each annotated with semantic and operational labels [3][18].

Group 4: Data Collection and Benchmarking
- Over 100,000 dual-arm manipulation trajectories were collected across 50 tasks, supporting extensive benchmarking and evaluation of robotic policies [24][22].
- The framework allows flexible entity configurations, ensuring compatibility with diverse hardware setups and scalability to future robotic platforms [20][22].

Group 5: Experimental Analysis
- Evaluations show that RoboTwin 2.0 significantly improves task success rates, particularly for low-degree-of-freedom platforms, with average gains of 8.3% [29][31].
- Its data also enhances the generalization capabilities of models, with substantial performance improvements in unseen scenarios [32][34].
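The five randomization dimensions listed above can be pictured as a simple per-scene sampler. All value ranges, texture names, and instructions below are invented for illustration and are not RoboTwin 2.0's actual parameters:

```python
import random

# Hypothetical randomization space over the five dimensions named in
# the article: clutter, background texture, lighting, desktop height,
# and language instruction.
RANDOMIZATION_SPACE = {
    "clutter_objects":    lambda: random.randint(0, 8),
    "background_texture": lambda: random.choice(["wood", "marble", "cloth"]),
    "light_intensity":    lambda: random.uniform(0.3, 1.5),
    "desktop_height_m":   lambda: random.uniform(0.70, 0.85),
    "instruction":        lambda: random.choice([
        "pick up the mug", "grab the cup and lift it"]),
}

def sample_scene(seed=None):
    """Draw one randomized scene configuration across all dimensions."""
    if seed is not None:
        random.seed(seed)
    return {name: draw() for name, draw in RANDOMIZATION_SPACE.items()}

scene = sample_scene(seed=0)
```

Drawing every episode's scene independently from such a space is what forces a policy to stop relying on any one background, lighting level, or phrasing — the mechanism behind the robustness gains reported above.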
Salary negotiation, avoiding pitfalls, switching fields? For autonomous driving/embodied intelligence job hunting, the AutoRobo Planet is your one-stop solution!
自动驾驶之心· 2025-06-24 12:41
Core Viewpoint
- The article emphasizes the rapid advances in AI technologies, particularly autonomous driving and embodied intelligence, which have significantly reshaped the job market and industry dynamics [2].

Group 1: Industry Developments
- Recent breakthroughs in AI, especially in autonomous driving (L2 to L4 functionality) and robotics, have led to a surge in technical routes and funding [2].
- The AutoRobo knowledge community was established to connect job seekers in autonomous driving, embodied intelligence, and robotics, facilitating better job matching and career development [2][3].

Group 2: Community Offerings
- The AutoRobo community provides extensive resources, including interview questions, industry reports, salary-negotiation tips, and resume-optimization services [3][4].
- A collection of 100 interview questions on autonomous driving and embodied intelligence, covering a range of technical topics, is available to members [6][7].

Group 3: Recruitment Information
- The community regularly shares openings for algorithm, development, and product roles, spanning campus recruitment, experienced hires, and internships [4].
- Members can access interview experiences and insights from successful candidates across companies in the industry [16].

Group 4: Industry Reports and Insights
- The community compiles industry reports covering the current state, development trends, and market opportunities in autonomous driving and embodied intelligence [12][15].
- Members can learn about the embodied intelligence industry landscape, including technology routes and investment opportunities [15].
LSD-based 4D point cloud base map generation - point cloud mapping for 4D annotation~
自动驾驶之心· 2025-06-24 12:41
Author | LiangWang  Editor | 自动驾驶之心

In recent years, with the development of deep learning, data-driven algorithms have gradually become mainstream in autonomous driving and robotics, and their demand for data keeps growing. Unlike traditional single-frame annotation, 4D annotation based on high-precision point cloud maps can effectively reduce annotation cost and improve ground-truth quality.

The "4D" in 4D annotation refers to 3D space plus the time dimension: 4D data can be projected to any timestamp to obtain single-frame ground truth for model training. Unlike large-scale HD map production, 4D annotation focuses only on the static and dynamic elements of a small area. Generating the base map required for annotation is a key step; depending on annotation requirements, key techniques such as "single-trip mapping," "multi-trip mapping," and "relocalization" usually need to be implemented, and both GNSS-available driving scenarios and GNSS-denied parking scenarios must be supported.

LSD (LiDAR SLAM & Detection) is an open-source environment perception framework for autonomous driving and robotics that supports data capture and replay, multi-sensor calibration, SLAM mapping and localization, and obstacle detection. This article will describe in detail ...
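At its core, single-trip base-map generation amounts to transforming each frame's points into a common world frame using its estimated pose and accumulating the result. A minimal sketch with made-up poses (in the real pipeline the poses come from LSD's SLAM front end and back end, and the cloud would be downsampled and filtered):

```python
import numpy as np

def accumulate_map(frames, poses):
    """Build a base-map point cloud by transforming each frame's
    (N, 3) points with its 4x4 world-from-sensor pose and stacking."""
    world_points = []
    for pts, T in zip(frames, poses):
        homo = np.hstack([pts, np.ones((len(pts), 1))])  # (N, 4)
        world_points.append((homo @ T.T)[:, :3])         # apply T to each point
    return np.vstack(world_points)

# Two toy frames observing the same relative point; the second
# frame's pose is translated 1 m forward along x.
frame0 = np.array([[1.0, 0.0, 0.0]])
frame1 = np.array([[1.0, 0.0, 0.0]])
T0 = np.eye(4)
T1 = np.eye(4)
T1[0, 3] = 1.0
cloud = accumulate_map([frame0, frame1], [T0, T1])
# cloud rows: [1, 0, 0] from frame 0 and [2, 0, 0] from frame 1
```

Multi-trip mapping and relocalization then reduce to estimating one additional transform per trip that registers each trip's accumulated cloud into the shared base-map frame.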