自动驾驶之心
Black Warrior! A Research & Teaching-Grade Full-Stack Autonomous Driving Vehicle Is Here
自动驾驶之心· 2025-06-25 09:48
Core Viewpoint
- The article introduces the launch of the "Black Warrior 001," a lightweight autonomous driving platform aimed at education and research, and highlights its functionality across a range of scenarios [3][6].

Group 1: Product Overview
- The "Black Warrior 001" supports perception, localization, fusion, navigation, and planning, and is built on an Ackermann chassis [3].
- The product is currently available for pre-sale at a discounted price, with a deposit option to secure orders [2].

Group 2: Functional Demonstrations
- The product has been tested indoors, outdoors, and in parking scenarios, demonstrating its perception, localization, fusion, navigation, and planning capabilities [4].
- Target uses include undergraduate learning, graduate research, and teaching in academic laboratories and vocational training institutions [6].

Group 3: Hardware Specifications
- Key hardware components:
  - 3D LiDAR: Mid-360
  - 2D LiDAR: from Lidar Technology
  - Depth camera: from Orbbec, with a built-in IMU
  - Main compute: NVIDIA Orin NX (16 GB)
  - Display: 1080p [10]
- The vehicle weighs 30 kg, carries a 50 W battery, and has a maximum speed of 2 m/s [12].

Group 4: Software and Functionality
- The software stack is built on ROS with C++ and Python, supports one-click startup, and ships with a development environment [14].
- Supported functionality includes 2D and 3D SLAM, vehicle navigation, and obstacle avoidance (a toy obstacle-stop node is sketched below) [15].

Group 5: After-Sales and Maintenance
- The product comes with one year of after-sales support (excluding man-made damage); during the warranty period, damage caused by operational errors or code modifications is repaired free of charge [37].
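The article does not include the platform's software, but to make the ROS-based stack above concrete, here is a minimal, hypothetical rospy node of the kind such a teaching vehicle might run for obstacle avoidance: it reads the 2D LiDAR and stops the vehicle when a return is too close. The topic names (/scan, /cmd_vel), the 0.5 m threshold, and the cruise speed are all assumptions, not the Black Warrior's actual interface.

```python
# Minimal obstacle-stop node: a sketch of the kind of safety behavior
# a ROS-based teaching platform might run. Topic names and the stop
# distance are illustrative assumptions, not the vendor's interface.
import rospy
from sensor_msgs.msg import LaserScan
from geometry_msgs.msg import Twist

STOP_DIST = 0.5  # metres; assumed safety threshold

class ObstacleStop:
    def __init__(self):
        self.pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
        rospy.Subscriber("/scan", LaserScan, self.on_scan)

    def on_scan(self, scan):
        cmd = Twist()
        # Drive forward slowly unless any valid LiDAR return is too close.
        valid = [r for r in scan.ranges if scan.range_min < r < scan.range_max]
        if valid and min(valid) > STOP_DIST:
            cmd.linear.x = 0.5  # well under the platform's 2 m/s limit
        self.pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("obstacle_stop")
    ObstacleStop()
    rospy.spin()
```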
Why You Can't Mass-Produce Intelligent Driving Without Doing 4D Auto-Annotation Well
自动驾驶之心· 2025-06-25 09:48
Core Viewpoint
- The article emphasizes the importance of efficient 4D data auto-annotation in the development of intelligent driving algorithms, highlighting the challenges and solutions involved in producing high-quality annotations for dynamic and static elements in autonomous driving systems [2][6].

Summary by Sections

4D Data Annotation Process
- The article outlines the complexity of automatic annotation for dynamic obstacles, which involves multiple modules and requires high-quality data processing to improve 3D detection performance [2][4].
- It discusses the need to link offline single-frame 3D detection results through tracking (a toy association sketch follows below), while addressing issues such as sensor occlusion and post-processing optimization [4].

Challenges in Automatic Annotation
- High spatiotemporal consistency is crucial, requiring precise tracking of dynamic targets across frames to avoid annotation breaks caused by occlusion or interaction [6].
- Multi-modal data fusion is complex, requiring synchronization of data from sensors such as LiDAR and cameras, along with coordinate alignment and semantic unification [6].
- Dynamic scenes are hard to generalize: the unpredictable behavior of traffic participants and environmental factors poses significant challenges to annotation models [6].
- There is a tension between annotation efficiency and cost: high-precision 4D auto-annotation still relies on manual verification, leading to long cycles and high costs [6].

Educational Course on 4D Annotation
- The article promotes a course designed to lower the barrier to entering the field of 4D auto-annotation, covering the entire pipeline and its core algorithms [7][8].
- The course provides practical training on dynamic obstacle detection, SLAM reconstruction, static element annotation, and end-to-end ground-truth generation [10][11][13][15].
- It emphasizes real-world applications and hands-on practice to strengthen algorithm skills [7][22].

Course Structure and Target Audience
- The course is structured into chapters, each focusing on a different aspect of 4D auto-annotation, including foundational knowledge, dynamic obstacle annotation, and data closed-loop topics [8][10][12][16].
- It targets people with a background in deep learning and autonomous driving perception algorithms, including students, researchers, and professionals looking to move into the field [21][23].
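The tracking step that links offline single-frame 3D detections into trajectories is described but not shown in the article. As a rough illustration of the idea (not the course's or any production pipeline's code), here is a greedy nearest-centre association sketch; the 2 m gate and the centre-only box representation are simplifying assumptions, and real pipelines add motion models, re-identification, and offline forward-backward smoothing.

```python
# Greedy centre-distance association of per-frame 3D detections into
# tracks: a toy stand-in for the linking step described above. The 2 m
# gate and centre-only boxes are simplifying assumptions.
import numpy as np

def associate(tracks, dets, gate=2.0):
    """tracks: {track_id: last centre (3,)}; dets: (N, 3) detection centres."""
    links, assigned = {}, set()
    for tid, centre in tracks.items():
        if len(dets) == 0:
            break
        d = np.linalg.norm(dets[:, :2] - centre[:2], axis=1)  # BEV distance
        if assigned:
            d[list(assigned)] = np.inf  # each detection links to one track
        j = int(np.argmin(d))
        if d[j] < gate:
            links[tid] = j
            assigned.add(j)
    return links

# Two frames of a single slowly moving object.
frames = [np.array([[0.0, 0.0, 0.0]]), np.array([[0.4, 0.1, 0.0]])]
tracks, next_id = {}, 0
for dets in frames:
    links = associate(tracks, dets)
    for tid, j in links.items():
        tracks[tid] = dets[j]          # extend an existing track
    for j in range(len(dets)):
        if j not in links.values():
            tracks[next_id] = dets[j]  # start a new track
            next_id += 1
print(tracks)  # one track id followed across both frames
```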
A Roundup of High-Frequency BEV Interview Questions! (Vision-Only & Multi-Modal Fusion Algorithms)
自动驾驶之心· 2025-06-25 02:30
Core Viewpoint
- The article discusses the rapid advances in BEV (Bird's Eye View) perception technology, highlighting its significance to the autonomous driving industry and the companies investing in its development [2].

Group 1: BEV Perception Technology
- BEV perception has become a competitive area of visual perception, with models such as BEVDet, PETR, and InternBEV gaining traction since the introduction of BEVFormer [2].
- The technology is being brought into production by companies such as Horizon, WeRide, XPeng, BYD, and Haomo, indicating a shift toward practical deployment in autonomous driving [2].

Group 2: Technical Insights
- In BEVFormer, the temporal and spatial self-attention modules use BEV queries, with keys and values derived from historical BEV information and image features [3].
- The grid_sample warp in BEVDet4D is a method for transforming coordinates based on camera parameters and predefined BEV grids, mapping pixels from 2D images into BEV space (see the sketch after this summary) [3].

Group 3: Algorithm and Performance
- Lightweight BEV algorithms such as Fast-BEV and the TensorRT versions of BEVDet and BEVDepth are noted for in-vehicle deployment [5].
- A BEV feature grid typically corresponds to a physical range of around 50 meters, with pure-vision solutions achieving stable performance up to roughly this distance [6].

Group 4: Community and Collaboration
- The article mentions a knowledge-sharing platform for the autonomous driving industry, aimed at fostering technical exchange among students and professionals from universities and companies [8].
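The grid_sample warp in the BEVDet4D answer is easy to make concrete: torch.nn.functional.grid_sample pulls features from a source map at precomputed, normalized coordinates. A minimal sketch follows, assuming an arbitrary in-plane rotation as a stand-in for the real ego-motion (or camera-derived) transform, and illustrative grid sizes rather than BEVDet4D's exact configuration.

```python
# Sketch of the grid_sample warp mechanism: a precomputed grid of
# target coordinates pulls features from a source map by bilinear
# sampling. BEVDet4D uses this primitive to align the previous frame's
# BEV feature with the current ego pose; the rotation below is an
# arbitrary stand-in for that ego-motion transform.
import math
import torch
import torch.nn.functional as F

B, C, H, W = 1, 64, 128, 128        # e.g. 128x128 cells * ~0.4 m/cell ~ 50 m range
bev_prev = torch.randn(B, C, H, W)  # previous frame's BEV feature

# Build a normalized sampling grid in [-1, 1] and apply a 2D transform.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                        torch.linspace(-1, 1, W), indexing="ij")
grid = torch.stack([xs, ys], dim=-1)            # (H, W, 2), (x, y) order
theta = 0.05                                    # stand-in ego yaw change (rad)
rot = torch.tensor([[math.cos(theta), -math.sin(theta)],
                    [math.sin(theta),  math.cos(theta)]])
grid = (grid @ rot.T).unsqueeze(0)              # (1, H, W, 2) rotated samples

bev_aligned = F.grid_sample(bev_prev, grid, mode="bilinear",
                            padding_mode="zeros", align_corners=True)
print(bev_aligned.shape)  # torch.Size([1, 64, 128, 128])
```

Note that align_corners must match whatever convention was used to build the grid; mixing conventions silently shifts the warp by up to half a cell.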
Why Does a Single Paper Consume an Entire Graduate Career?
自动驾驶之心· 2025-06-25 02:30
Which conferences and journals can we coach for?

We have received many requests for help with publishing. Degree requirements are unavoidable: a master's student cannot graduate without a Zone-3 journal paper, a PhD student cannot graduate without three CCF-A papers, and advisors unfamiliar with a new research direction cannot support the work. Students rack their brains over topic selection, hit bottleneck after bottleneck in experiment design, struggle with muddled writing logic, and get rejected submission after submission, especially in cutting-edge, complex fields such as autonomous driving, embodied intelligence, and robotics, where it is easy to feel out of one's depth.

A paper usually takes one to two years from preparation to publication, which for a master's student spans essentially the entire academic career. Wrong methods, detours, and the absence of guidance waste the most time. Publishing is hard, but not hopeless: with an experienced mentor leading the way, several papers a year is entirely normal. After long preparation, our paper-coaching service is now officially launched, covering autonomous driving, embodied intelligence, and robotics.

Who are we?

We are the largest AI-focused technical media platform in China, with IPs including 自动驾驶之心, 具身智能之心, and 3D视觉之心, backed by the country's top academic resources. Having worked in autonomous driving, embodied intelligence, and robotics for many years, we deeply understand the challenges and opportunities of these interdisciplinary fields, and we know how much a high-quality paper matters to a student's studies (especially for master's and PhD students) and future career.

We have 300+ mentors dedicated full-time to autonomous driving and embodied intelligence, from ...
Latest from Mu Yao's Team! RoboTwin 2.0: A Scalable Data Benchmark for Robust Bimanual Manipulation
自动驾驶之心· 2025-06-24 12:41
Core Insights
- The article presents RoboTwin 2.0, a scalable data generation framework designed to improve bimanual robotic manipulation through robust domain randomization and automated expert data generation [2][6][18].

Group 1: Motivation and Challenges
- Existing synthetic datasets for bimanual robotic manipulation are insufficient, lacking efficient data generation methods for new tasks and relying on overly simplified simulation environments [2][5].
- RoboTwin 2.0 addresses these challenges with a scalable simulation framework that supports automatic, large-scale generation of diverse and realistic data [2][6].

Group 2: Key Components of RoboTwin 2.0
- RoboTwin 2.0 integrates three key components: an automated expert data generation pipeline, comprehensive domain randomization, and embodiment-aware adaptation for diverse robotic platforms [6][18].
- The automated expert data generation pipeline uses multimodal large language models (MLLMs) and simulation feedback to iteratively refine task execution code [10][12].

Group 3: Domain Randomization
- Domain randomization is applied across five dimensions (clutter, background texture, lighting conditions, tabletop height, and diverse language instructions) to make policies robust to environmental variability; a configuration-sampling sketch follows below [12][13].
- The framework provides a large object library (RoboTwin-OD) with 731 instances across 147 categories, each annotated with semantic and operational labels [3][18].

Group 4: Data Collection and Benchmarking
- Over 100,000 dual-arm manipulation trajectories were collected across 50 tasks, supporting extensive benchmarking and evaluation of robotic policies [24][22].
- The framework allows flexible embodiment configurations, ensuring compatibility with diverse hardware setups and promoting scalability to future robotic platforms [20][22].

Group 5: Experimental Analysis
- Evaluations show that RoboTwin 2.0 significantly improves task success rates, particularly for low-degree-of-freedom platforms, with an average gain of 8.3% [29][31].
- The framework's data improves model generalization, with substantial performance gains in unseen scenarios [32][34].
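RoboTwin 2.0's randomization code is described rather than shown in the article; the sketch below only illustrates the general pattern of sampling one scene configuration across the five listed dimensions. The field names, value ranges, and instruction templates are invented for illustration and are not RoboTwin 2.0's actual API.

```python
# Illustrative sampler for the five randomization dimensions the paper
# lists (clutter, background texture, lighting, table height, language).
# Field names and ranges are assumptions, not RoboTwin 2.0's actual API.
import random

TEXTURES = ["wood", "marble", "fabric", "metal"]
INSTRUCTION_TEMPLATES = [
    "pick up the {obj} and hand it to the other arm",
    "move the {obj} to the left side of the table",
]

def sample_scene(obj="mug"):
    return {
        "num_clutter_objects": random.randint(0, 8),
        "background_texture": random.choice(TEXTURES),
        "light_intensity": random.uniform(0.3, 1.5),   # relative scale
        "table_height_m": random.uniform(0.70, 0.85),
        "instruction": random.choice(INSTRUCTION_TEMPLATES).format(obj=obj),
    }

random.seed(0)
for _ in range(3):
    print(sample_scene())  # three independently randomized scene configs
```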
Salary Negotiation, Avoiding Pitfalls, Switching Tracks? For Autonomous Driving / Embodied AI Job Hunting, the AutoRobo Planet Is a One-Stop Shop!
自动驾驶之心· 2025-06-24 12:41
Core Viewpoint
- The article emphasizes the rapid advancements in AI technologies, particularly in autonomous driving and embodied intelligence, which have significantly influenced the job market and industry dynamics [2].

Group 1: Industry Developments
- Recent breakthroughs in AI technologies, especially in autonomous driving (L2 to L4 functionalities) and robotics, have led to a surge in technical routes and funding [2].
- The establishment of the AutoRobo knowledge community aims to connect job seekers in the fields of autonomous driving, embodied intelligence, and robotics, facilitating better job matching and career development [2][3].

Group 2: Community Offerings
- The AutoRobo community provides a wealth of resources, including interview questions, industry reports, salary negotiation tips, and resume optimization services [3][4].
- A comprehensive collection of 100 interview questions related to autonomous driving and embodied intelligence is available for members, covering various technical aspects [6][7].

Group 3: Recruitment Information
- The community regularly shares job openings in algorithms, development, and product roles, including campus recruitment, social recruitment, and internships [4].
- Members have access to a variety of interview experiences and insights from successful candidates across different companies in the industry [16].

Group 4: Industry Reports and Insights
- The community compiles industry reports that provide insights into the current state, development trends, and market opportunities within the autonomous driving and embodied intelligence sectors [12][15].
- Members can learn about the specific landscape of the embodied intelligence industry, including technological routes and investment opportunities [15].
LSD-Based 4D Point Cloud Base Map Generation: Point Cloud Mapping for 4D Annotation
自动驾驶之心· 2025-06-24 12:41
Author | LiangWang    Editor | 自动驾驶之心

In recent years, with the development of deep learning, data-driven algorithms have become mainstream in autonomous driving and robotics, and the demands these algorithms place on data keep growing. Unlike traditional single-frame annotation, 4D annotation based on high-precision point cloud maps can effectively reduce annotation cost while improving the quality of ground-truth data.

The "4D" in 4D annotation means three spatial dimensions plus time: 4D data can be projected to any moment in time to obtain single-frame ground truth for model training (a toy projection sketch follows below). Unlike large-scale HD-map production, 4D annotation only concerns the static and dynamic elements of a small area. Generating the base map required for annotation is a key step: depending on the annotation requirements, it typically calls for capabilities such as single-trip mapping, multi-trip mapping, and relocalization, and must support both driving scenarios with GNSS and parking scenarios without GNSS.

LSD (LiDAR SLAM & Detection) is an open-source environment-perception framework for autonomous driving and robotics that covers data capture and replay, multi-sensor calibration, SLAM mapping and localization, obstacle detection, and other perception tasks. This article details ...
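As a toy illustration of how a static 4D base map yields single-frame ground truth at an arbitrary timestamp, the sketch below re-expresses world-frame map points in the ego frame given the ego pose at the queried time. The pose values and the yaw-only model are illustrative assumptions; LSD's actual pipeline interpolates full SE(3) poses from its SLAM output.

```python
# Toy version of "project the 4D base map into one frame": given the
# ego pose at a queried timestamp, world-frame static map points are
# re-expressed in that frame as per-frame ground truth. Pose values
# here are illustrative; real pipelines interpolate poses from SLAM.
import numpy as np

def pose_to_matrix(yaw, t):
    """Yaw-only SE(3) pose: rotation about z plus translation t (3,)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    T[:3, 3] = t
    return T

def map_to_frame(points_w, T_world_ego):
    """points_w: (N, 3) world-frame map points -> ego-frame points."""
    pts = np.hstack([points_w, np.ones((len(points_w), 1))])
    return (np.linalg.inv(T_world_ego) @ pts.T).T[:, :3]

lane_pts_world = np.array([[10.0, 2.0, 0.0], [12.0, 2.0, 0.0]])
T = pose_to_matrix(yaw=0.1, t=np.array([5.0, 0.0, 0.0]))  # assumed pose at time t
print(map_to_frame(lane_pts_world, T))  # the same lane seen from the ego frame
```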
Face to Face with the Experts! Stanford's 2025 CS336 Course Fully Public: Building Large Models from Scratch
自动驾驶之心· 2025-06-24 11:47
Core Viewpoint
- The article discusses the launch of Stanford University's CS336 course "Language Models from Scratch," which aims to provide a comprehensive understanding of language models through practical development and implementation [5][7].

Course Overview
- The course focuses on the foundational aspects of language models, which are essential for modern natural language processing (NLP) applications. It emphasizes the importance of understanding language models for scientists and engineers in the fields of AI and ML [5][7].
- The course is structured into five major modules: Foundations, Systems, Extensions, Data, and Alignment & Reinforcement Learning [7].

Course Requirements
- Students are expected to have proficiency in Python, as most assignments will require extensive coding. The course will provide minimal scaffolding, resulting in a higher volume of code written by students compared to other AI courses [7].
- A background in deep learning and system optimization is necessary, particularly familiarity with PyTorch and basic system concepts like the memory hierarchy [7].
- Foundational knowledge in calculus, linear algebra, probability, and statistics is required, along with a basic understanding of machine learning principles [7].

Assignments
- The course includes several assignments covering the major stages of language model development, such as implementing a BPE tokenizer (a minimal sketch follows below), training models on specific datasets, and optimizing performance on GPUs [8].
- Assignments are designed to simulate real-world challenges, including data processing and model alignment, with a focus on practical application and hands-on experience [8].

Course Schedule
- The course follows a detailed schedule that outlines topics, materials, and deadlines for assignments, ensuring a systematic approach to learning [9].
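The BPE tokenizer assignment is the most self-contained of these; as a rough sketch of its core training loop (the textbook algorithm, not the course's reference solution), the snippet below repeatedly merges the most frequent adjacent symbol pair over a toy word-frequency table.

```python
# Minimal BPE training loop in the spirit of the tokenizer assignment:
# repeatedly merge the most frequent adjacent symbol pair. This is the
# textbook (Sennrich-style) algorithm, not CS336's reference solution.
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs over space-separated words."""
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Fuse every standalone occurrence of `pair` into one symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), w): f for w, f in vocab.items()}

# Words as space-separated characters, with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6}
merges = []
for _ in range(5):
    pairs = get_pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    merges.append(best)
print(merges)  # learned merge rules, most frequent pair first
```

Encoding new text then amounts to splitting it into characters and replaying the learned merges in order; the assignment additionally covers byte-level handling and efficiency, which this sketch omits.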
Huawei Car BU Is Hiring (End-to-End / Perception Models / Model Optimization and More)! Plenty of Openings
自动驾驶之心· 2025-06-24 07:21
Core Viewpoint
- The article emphasizes the rapid evolution and commercialization of autonomous driving technologies, highlighting the importance of community engagement and knowledge sharing in this field [9][14][19].

Group 1: Job Opportunities and Community Engagement
- Huawei is actively recruiting for various positions in its autonomous driving division, including roles focused on end-to-end model algorithms, perception models, and efficiency optimization [1][2].
- The "Autonomous Driving Heart Knowledge Planet" serves as a platform for technical exchange, targeting students and professionals in the autonomous driving and AI sectors, and has established connections with numerous industry companies for job referrals [7][14][15].

Group 2: Technological Trends and Future Directions
- The article notes that by 2025 the focus will be on advanced technologies such as vision-language models (VLM), end-to-end trajectory prediction, and 3D generative simulation, indicating a shift toward more integrated and intelligent autonomous driving systems [9][22].
- The community has developed over 30 learning pathways covering subfields of autonomous driving, including perception, mapping, and AI model deployment, which are crucial for industry professionals [19][21].

Group 3: Educational Resources and Content
- The knowledge platform offers exclusive member benefits, including access to academic advances, professional Q&A sessions, and course discounts, fostering a comprehensive learning environment [17][19].
- Regular webinars featuring experts from top conferences and companies discuss practical applications and research in autonomous driving, enhancing the learning experience for participants [21][22].
SwitchVLA: A Lightweight VLA Model for Real-Time Dynamic Task Switching Without Additional Data Collection
自动驾驶之心· 2025-06-24 02:54
Core Viewpoint
- The article introduces SwitchVLA, a lightweight and data-efficient method for dynamic task perception and decision-making that addresses the challenges of task switching in multi-task VLA models, achieving superior performance compared to existing methods [3][22].

Group 1: Introduction
- Current mainstream multi-task VLA models struggle with "task switching": their ability to adapt to a new task mid-execution is limited [3][5].
- SwitchVLA employs an execution-aware mechanism and a lightweight network architecture to enable task switching without additional data collection [3][10].

Group 2: Background
- Multi-task VLA training typically involves independent data collection for each task, making seamless transitions between tasks difficult [5].
- Existing SOTA VLA methods cannot effectively handle task switching, underscoring the need for better solutions [5][10].

Group 3: Methodology
- SwitchVLA addresses two core problems: representing task switching without extra data collection, and training an end-to-end imitation learning model that makes judgments autonomously from current conditions [10][12].
- The model improves task-switching representation by concatenating the previous task, the current task, and the previous task's stage, enhancing its ability to perceive task transitions (a schematic sketch follows below) [12][13].
- A simplified training process categorizes each task into three stages (before contact, during contact, and after contact), allowing effective task switching without additional data [15][16].

Group 4: Experimental Results
- Experiments show that SwitchVLA outperforms existing methods in task-switching scenarios while maintaining comparable performance in single-task settings [20][22].
- An analysis of task-switching failures shows that the proposed method effectively mitigates the common failure causes [20].

Group 5: Conclusion and Future Directions
- SwitchVLA is positioned as a significant advance in dynamic task management, with plans for further iteration and deployment on humanoid robots for flexible industrial production and personalized commercial services [22][23].
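The switching representation described above (previous task, current task, and the previous task's contact stage, concatenated) can be sketched schematically as follows. The embedding sizes, task list, and three-stage encoding are assumptions inferred from the summary, not SwitchVLA's published configuration.

```python
# Schematic of the conditioning described above: embed (previous task,
# current task, previous task's contact stage) and concatenate into one
# condition vector for the policy. Dimensions and the task list are
# illustrative assumptions, not SwitchVLA's published configuration.
import torch
import torch.nn as nn

TASKS = ["pick_cup", "open_drawer", "wipe_table"]  # assumed task set
STAGES = ["before_contact", "during_contact", "after_contact"]

class SwitchCondition(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.task_emb = nn.Embedding(len(TASKS), dim)
        self.stage_emb = nn.Embedding(len(STAGES), dim)

    def forward(self, prev_task, curr_task, prev_stage):
        parts = [self.task_emb(prev_task),
                 self.task_emb(curr_task),
                 self.stage_emb(prev_stage)]
        # Fed to the policy alongside observations.
        return torch.cat(parts, dim=-1)

cond = SwitchCondition()
vec = cond(torch.tensor([0]), torch.tensor([1]), torch.tensor([1]))
print(vec.shape)  # torch.Size([1, 96]): pick_cup -> open_drawer, mid-contact
```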