Led by Industry Veterans! Fully Understand World Models for Autonomous Driving...
自动驾驶之心· 2025-12-11 03:35
Core Viewpoint
- The article introduces a new course titled "World Models and Autonomous Driving Small Class," focusing on advanced algorithms in autonomous driving, including general world models, video generation, and OCC generation [1][3].

Course Overview
- The course is developed in collaboration with industry leaders and follows the success of a previous course on end-to-end and VLA autonomous driving [1].
- The course aims to enhance understanding and practical skills in world models, targeting individuals interested in the autonomous driving industry [11].

Course Structure
- **Chapter 1: Introduction to World Models** - Discusses the relationship between world models and end-to-end autonomous driving, including historical development and current applications; covers various types of world models, such as pure simulation, simulation + planning, and generation of sensor inputs and perception results [6].
- **Chapter 2: Background Knowledge of World Models** - Focuses on foundational knowledge, including scene representation, Transformers, and BEV perception [6][12]; highlights key technical terms frequently encountered in job interviews related to world models [7].
- **Chapter 3: General World Model Exploration** - Examines popular models such as Marble from Fei-Fei Li's team, DeepMind's Genie 3, and Meta's JEPA, along with recent discussions of VLA + world-model algorithms [7].
- **Chapter 4: Video-Generation-Based World Models** - Concentrates on video generation algorithms, starting with Wayve's GAIA-1 & GAIA-2 and extending to recent works such as UniScene and OpenDWM [8].
- **Chapter 5: OCC-Based World Models** - Focuses on OCC generation methods, discussing three major papers and a practical project that extends to vehicle trajectory planning [9].
- **Chapter 6: World Model Job Specialization** - Provides insights into industrial applications of world models, addressing pain points and interview preparation for relevant positions [10].

Learning Outcomes
- The course aims to equip participants with skills equivalent to one year of experience as a world-model autonomous-driving algorithm engineer [14].
- Participants will gain a comprehensive understanding of world-model technologies, including video generation and OCC generation methods, and will be able to apply their knowledge in practical projects [14].
The Remaining Paper Window for Autonomous-Driving World Models Is Closing Fast......
自动驾驶之心· 2025-12-11 00:05
Core Insights
- The article highlights the recent surge in research papers on world models in autonomous driving, indicating a trend toward localized breakthroughs and verifiable improvements in the field [1].
- It emphasizes the importance of refining submissions to top conferences, suggesting that the final 10% of polishing can significantly affect a paper's overall quality and acceptance [2].
- The platform "Autonomous Driving Heart" is presented as a leading AI technology media outlet in China, with a strong focus on autonomous driving and related interdisciplinary fields [3].

Summary by Sections

Research Trends
- Numerous recent works in autonomous driving, such as MindDrive and SparseWorld-TC, reflect a focus on world models, which are expected to dominate upcoming conferences [1].
- The article suggests that the main themes for the end of this year and the first half of next year will likely revolve around world models, indicating a strategic direction for researchers [1].

Guidance and Support
- The platform offers personalized guidance for students, helping them navigate the complexities of research and the paper-submission process [7][13].
- It claims a high success rate, with a 96% acceptance rate for students who have received guidance over the past three years [5].

Faculty and Resources
- The platform boasts over 300 dedicated instructors from top global universities, ensuring high-quality mentorship for students [5].
- The instructors have extensive experience publishing at top-tier conferences and journals, providing students with valuable insights and support [5].

Services Offered
- Services include personalized paper guidance, real-time interaction with mentors, and comprehensive support throughout the research process [13].
- Students may also receive recommendations from prestigious institutions and direct job placements at leading tech companies [19].
Former NIO Intelligent Driving Executive Joins a New Company
自动驾驶之心· 2025-12-11 00:05
The following article is from 蚀刻AiTech, by the 蚀刻 team.

蚀刻AiTech: a ten-year veteran of intelligent driving who has worked at four companies, on chips and on mass production, writing about fresh industry news and hoping to record the key moments of AI's development.

Original link: 蚀刻独家 | 前蔚来智驾高管加盟新公司

This leading unmanned-delivery company is currently on a fast track. In October this year it announced the completion of its Series D round, setting the record for the largest single private financing in China's autonomous driving sector. The company has cumulatively delivered more than 10,000 vehicles, covers nearly 300 cities nationwide, and is pushing hard into instant logistics and the broader urban-distribution market while accelerating map-free autonomous driving and end-to-end large models. These are precisely the core areas the executive was responsible for at NIO, so his arrival can be seen as a key talent reserve for the company's next stage of challenges.

A former NIO intelligent-driving executive has officially joined a leading unmanned-delivery autonomous driving company in a senior role. Before joining NIO, the executive served as a senior ... at Momenta ...
New from SJTU! End-to-End & VLA Survey: A Unified Perspective Under a Generalized Paradigm
自动驾驶之心· 2025-12-11 00:05
Core Viewpoint
- The article discusses the evolution of autonomous driving technology, emphasizing the need for a unified perspective across paradigms, including end-to-end (E2E), VLM-centric, and hybrid approaches, to improve understanding and performance in complex driving scenarios [2][4][14].

Group 1: Introduction and Background
- Traditional modular approaches in autonomous driving have led to information loss and error accumulation due to task fragmentation, prompting a shift toward data-driven end-to-end architectures [5][10].
- The article introduces a comprehensive review titled "Survey of General End-to-End Autonomous Driving: A Unified Perspective," which aims to bridge the gap in understanding between paradigms [3][4].

Group 2: Paradigms of Autonomous Driving
- General End-to-End (GE2E) is defined as any model that maps raw sensor inputs to planning trajectories or control actions, regardless of whether it includes vision-language models (VLMs) [4][14].
- The three main paradigms unified under GE2E are:
  - Conventional E2E, which relies on structured scene representations for precise trajectory planning [9][17].
  - VLM-centric E2E, which uses pre-trained vision-language models to improve generalization and reasoning in complex scenarios [11][33].
  - Hybrid E2E, which combines the strengths of both to balance high-level semantic understanding with low-level control precision [12][39].

Group 3: Performance Comparison
- In open-loop tests, the hybrid paradigm outperformed the others, demonstrating the value of world knowledge in handling long-tail scenarios [54].
- Conventional E2E methods still dominate in numerical trajectory-prediction accuracy, indicating their robustness in structured environments [54].
- In closed-loop tests, conventional methods maintain a stronghold, particularly in complex driving tasks, while VLA methods show potential but need further refinement in fine-grained trajectory control [55][56].

Group 4: Data and Learning Strategies
- The evolution of datasets from geometric annotations to semantically rich ones is crucial for training models capable of logical reasoning about complex traffic contexts [46][48].
- Chain-of-Thought (CoT) annotations in datasets support advanced reasoning tasks, moving beyond simple input-output mappings [47].

Group 5: Model Architecture and Details
- The article provides a detailed comparison of mainstream model architectures, including their inputs, backbone networks, intermediate tasks, and output forms, to clarify the distinctions among paradigms [57].
Some Recent Lessons from Working on VLA
自动驾驶之心· 2025-12-11 00:05
Core Insights
- The article discusses challenges and advances in vision-language models (VLMs) for autonomous driving, highlighting issues such as hallucination, 3D spatial understanding, and inference speed [3].

Group 1: Challenges in VLMs
- Hallucination manifests as generating non-existent information and failing to perceive relevant data; it can be mitigated with dynamic perception techniques [3].
- Weak 3D spatial understanding stems from predominantly 2D pre-training tasks, suggesting the addition of spatial localization tasks during training [3].
- Inference speed is a concern; potential solutions include KV caching, visual token compression, and mixed-data training [3].

Group 2: Learning Paradigms and Model Improvements
- The learning paradigm should shift from imitation learning (SFT) to preference learning (DPO, GRPO), and simultaneous multi-task training yields better results than sequential single-task training [3].
- To prevent catastrophic forgetting in the foundation model, mixing in pre-training data is a simple and effective remedy [3].
- Richer supervisory signals lead to better model representations, achieved by adding auxiliary task heads to the VLM [3].

Group 3: Interaction and Evaluation
- Current VLMs exhibit insufficient interaction between vision and language, limiting their effectiveness as base models; improving this interaction is crucial [3].
- Trajectory output formats are flexible, with several approaches yielding satisfactory results, though diffusion heads are preferred in industry for speed [3].
- Evaluation remains challenging because training and testing conditions are inconsistent, requiring better alignment of objectives and data distributions [3].
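The shift from imitation learning to preference learning mentioned above can be made concrete with the DPO objective. The sketch below is a minimal, framework-free illustration on scalar log-probabilities; the function name and example values are hypothetical and not from the article.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    logp_*     : log-prob of the chosen/rejected response under the policy
    ref_logp_* : the same quantities under the frozen reference model
    beta       : strength of the implicit KL constraint
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)); small when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# The loss shrinks as the policy's preference margin over the reference grows.
flat   = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # no preference learned yet
better = dpo_loss(-8.0, -12.0, -10.0, -10.0)    # policy now favors the chosen answer
assert better < flat
```

With a zero margin the loss is exactly log 2; it decays toward zero as the policy separates chosen from rejected responses faster than the reference model does.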
Waymo's Latest Foundation-Model Talk: Fast-Slow Dual-System End-to-End & World-Model Simulation
自动驾驶之心· 2025-12-10 01:28
Core Insights
- Waymo is advancing its autonomous driving technology by prioritizing "verifiably safe AI" as a core principle, reducing the accident rate by more than tenfold compared with human drivers [2][5][19].
- The company has logged over 100 million miles of fully autonomous driving, continuously improving road safety in its operating areas [2][5].

Group 1: Waymo's AI Strategy
- Waymo's AI ecosystem integrates a driver, a simulator, and an evaluator, all powered by the Waymo Foundation Model, making safety a foundational element rather than an afterthought [5][12].
- The Waymo Foundation Model serves as a multifunctional "world model," providing a robust interface among components and supporting end-to-end signal backpropagation during training [8][10].

Group 2: Components of the AI Ecosystem
- The driver model generates safe, compliant action sequences, with its capabilities distilled into more efficient student models for real-time, in-vehicle deployment [14].
- The simulator creates high-fidelity virtual environments for testing the driver model under diverse, challenging scenarios, while the evaluator analyzes driving behavior and provides feedback for continuous improvement [14][15].

Group 3: Learning and Optimization Mechanisms
- Waymo employs a dual learning loop: an inner loop driven by the simulator and evaluator for reinforcement learning, and an outer loop that uses real-world driving data to improve the driver model [17][19].
- The company has amassed a vast amount of fully autonomous driving data, crucial for training and optimizing its systems, reducing reliance on human driving data [19].
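The distillation of the driver model into efficient student models mentioned above typically relies on matching the teacher's softened output distribution. The sketch below shows a generic temperature-scaled distillation loss; the logits and function names are hypothetical illustrations, not Waymo's actual interface.

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    A generic knowledge-distillation objective: the student is trained to
    match the teacher's softened scores over candidate actions. Softening
    with temperature > 1 exposes the teacher's relative preferences among
    non-top actions, which hard labels would discard.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s))

teacher = [2.0, 1.0, 0.1]                      # teacher scores over candidate actions
aligned = distillation_kl(teacher, [2.0, 1.0, 0.1])
off     = distillation_kl(teacher, [0.1, 1.0, 2.0])
assert aligned < 1e-12 < off                   # loss vanishes when student matches teacher
```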
Feed-Forward GS Work Has Exploded Recently
自动驾驶之心· 2025-12-10 00:04
Core Viewpoint
- The article emphasizes the rapid advances in 3D Gaussian Splatting (3DGS) in the autonomous driving sector, highlighting the need for structured learning pathways for newcomers to the field [2][4].

Group 1: Technology Highlights
- Tesla's presentation of 3D Gaussian Splatting at ICCV has garnered significant attention, signaling a shift toward feed-forward GS algorithms for scene reconstruction [2].
- 3DGS has iterated through static 3D reconstruction, dynamic 4D reconstruction, and surface reconstruction, showcasing its evolving nature [4].

Group 2: Course Offering
- A comprehensive course titled "3DGS Theory and Algorithm Practical Tutorial" provides a structured learning roadmap for 3DGS, covering both theoretical foundations and practical applications [4].
- The course is taught by an expert with extensive experience in 3D reconstruction and algorithm development [5].

Group 3: Course Structure
- The course consists of six chapters, starting with foundational knowledge in computer graphics and progressing through principles, algorithms, and specific applications in autonomous driving [8][9][10][11][12].
- Each chapter builds on the previous one, culminating in discussions of current industry needs and research directions in 3DGS [11][12][13].

Group 4: Target Audience and Prerequisites
- The course targets individuals with backgrounds in computer graphics, visual reconstruction, and programming, particularly those pursuing careers in the autonomous driving industry [17].
- Participants are expected to have a foundational grasp of the relevant mathematical concepts and programming languages [17].
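At the core of 3DGS rendering, depth-sorted Gaussians are projected to the image plane and blended front to back with alpha compositing. The sketch below illustrates only that blending rule for a single pixel and a single color channel, assuming the 2D footprint weights are already computed; it is a simplified illustration, not a full renderer.

```python
def composite(splats):
    """Front-to-back alpha compositing for one pixel.

    splats: list of (color, alpha) pairs for Gaussians already projected
    and depth-sorted near to far; alpha is the Gaussian's opacity times
    its 2D footprint weight at this pixel.
    """
    color, transmittance = 0.0, 1.0
    for c, a in splats:
        color += transmittance * a * c      # this splat's visible contribution
        transmittance *= (1.0 - a)          # light remaining for splats behind
    return color

# A nearly opaque near splat hides most of the splat behind it.
px = composite([(1.0, 0.9), (0.0, 1.0)])
assert abs(px - 0.9) < 1e-12
```

Feed-forward GS methods change how the Gaussians are predicted (a single network pass instead of per-scene optimization), but the compositing step above stays the same.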
Horizon Robotics' Su Qing: There Was a Time I Saw Little Hope for Autonomous Driving...
自动驾驶之心· 2025-12-10 00:04
The following article is from RoboX, by RoboX.

RoboX: from AI cars to robots, we follow the most promising super-intelligent agents!

Author | RoboX
Source | RoboX
Original link: 地平线苏箐演讲全文提炼:自动驾驶的曙光、痛苦与轮回

This article is shared for academic purposes only; if there is any infringement, contact us for removal.

Speaker: Su Qing | Vice President & Chief Architect, Horizon Robotics
Date: 2025-12-09
Venue: 2025 Horizon Technology Ecosystem Conference

Key excerpts:

This year we can indeed see that the technical path for autonomous driving is fairly clear, but we can also see harder problems ahead. You know these problems can be solved, but today we do not yet know how to solve them.

Most people outside the industry probably do not understand the difficulty and pressure autonomous driving teams face. The double squeeze on intellect and stamina is extremely painful: SOP deadlines bear down, methodologies keep changing, and there are all kinds of corner cases to resolve. When a system runs continuously in a dense world, every case has to be solved; that is what makes this industry so painful.

Dawn: the emergence of a major watershed

When I was about to join Horizon, I spoke several times with Dr. Yu Kai, ...
A 304-Page Code Agent Survey from Beihang! Nearly 30 Institutions Involved
自动驾驶之心· 2025-12-10 00:04
Core Insights
- The article discusses the transformative shift of code intelligence from "assistive tool" to "autonomous developer," driven by advances in large language models (LLMs) [2][8].
- A comprehensive review paper by 28 institutions outlines the evolution of code models and establishes a complete technical framework for intelligent software engineering [2][8].

Evolution of Code Intelligence
- The evolution spans six distinct phases, from manual coding in the 1960s to the anticipated AI-autonomous era after 2025, highlighting key technological advances at each stage [8][9].
- The core driving force is the transition from rule-based systems to transformer-based models, enabling significant improvements in code understanding and generation [9][11].

Code Foundation Models
- Current mainstream models fall into general LLMs and code-specialized LLMs, each with unique advantages and technological synergies [11][12].
- Code-specialized models have emerged through focused data, architectural innovations, and task-specific fine-tuning, surpassing general models on coding tasks [15][18].

Training and Evaluation
- The paper outlines a comprehensive evaluation system for code tasks, categorized into statement/function/class-level tasks, repository-level tasks, and agent-system tasks [18][19].
- Evaluation metrics have evolved to include execution-based indicators, emphasizing that generated code must not only exist but also run correctly [19][22].

Alignment Techniques
- Two primary alignment techniques are discussed: supervised fine-tuning (SFT) and reinforcement learning (RL), both crucial for ensuring models meet human requirements [22][28].
- Data-synthesis methods for alignment include single- and multi-round SFT, as well as RL methods that leverage human and AI feedback [25][27].

Software Engineering Agents (SWE Agents)
- SWE Agents are advanced systems capable of autonomously completing complex engineering tasks across the software development lifecycle [31][32].
- The paper identifies four key application stages for SWE Agents: requirements engineering, software development, software testing, and software maintenance [31].

Future Trends
- The article identifies three core trends for the next 3-5 years: the shift from general to specialized models, increased autonomy of SWE Agents, and the integration of multimodal inputs for enhanced code intelligence [33][34][35].
- The ultimate goal of code intelligence is to automate repetitive coding tasks, allowing human developers to focus on higher-level creative work [37][38].
The University of Macau's First World-Model-Driven Visual Grounding Framework!
自动驾驶之心· 2025-12-10 00:04
Paper authors | Haicheng Liao et al.
Editor | 自动驾驶之心

In interactive autonomous driving scenarios, the most awkward moment is this: a passenger points at a complex intersection ahead and says, "Follow that SUV." The autonomous driving system looks at three nearly identical vehicles and thinks: "Which one? The one on the left? Or the one that is changing lanes?"

Most existing visual grounding models for autonomous driving behave like novices who can only "describe what they see": they stare at the current frame and try to find the answer in the pixels. Once the instruction is ambiguous or the target is occluded, they easily misidentify targets and can even trigger faulty reasoning.

Why do human drivers not make this mistake? Because we anticipate. When we hear the instruction, our brain instantly plays out future scenes: the car on the left is about to turn, which does not fit "follow"; only the car in the middle, accelerating straight ahead, matches the most likely intent.

"Think about the future before acting." Inspired by this, a research team from the University of Macau proposes a new framework, ThinkDeeper, the first work to introduce world models into visual grounding for autonomous driving. This work not only ...