自动驾驶之心
A hardcore-tech "Huangpu Military Academy" for autonomous driving - now 4,000 members!
自动驾驶之心· 2025-08-15 14:23
Core Viewpoint
- The article emphasizes the establishment of a comprehensive community focused on autonomous driving, aiming to bridge the gap between academia and industry while providing valuable resources for learning and career opportunities in the field [2][16]

Group 1: Community and Resources
- The community has created a closed-loop system covering industry, academia, job seeking, and Q&A exchanges, enhancing the learning experience for participants [2][3]
- The platform offers cutting-edge academic content, industry roundtables, open-source code solutions, and timely job information, significantly reducing the time needed for research [3][16]
- Members can access nearly 40 technical routes, including industry applications, VLA benchmarks, and entry-level learning paths, catering to both beginners and advanced researchers [3][16]

Group 2: Learning and Development
- The community provides a well-structured learning path for beginners, covering foundational knowledge in mathematics, computer vision, deep learning, and programming [10][12]
- For those already engaged in research, industry frameworks and project proposals are available to deepen their understanding and application of autonomous driving technologies [12][14]
- Continuous job sharing and career opportunities are promoted within the community, fostering a complete ecosystem for autonomous driving [14][16]

Group 3: Technical Focus Areas
- The community has compiled extensive resources on technical aspects of autonomous driving, including perception, simulation, planning, and control [16][17]
- Specific learning routes cover topics such as end-to-end learning, 3DGS principles, and multi-modal large models, ensuring comprehensive coverage of the field [16][17]
- The platform also features a collection of open-source projects and datasets relevant to autonomous driving, facilitating hands-on experience and practical application [32][34]
Want to learn more about large models? How do you get started systematically?
自动驾驶之心· 2025-08-14 23:33
Group 1
- The article highlights growing interest in large-model technologies, particularly RAG (Retrieval-Augmented Generation), AI Agents, multimodal large models (pre-training, fine-tuning, reinforcement learning), and optimization for deployment and inference [1]
- A community named "Large Model Heart Tech" is being established to focus on these technologies, with the aim of becoming the largest domestic community for large-model technology [1]
- The community is also building a knowledge platform to provide industry and academic information and to cultivate talent in the large-model field [1]

Group 2
- The article describes the community as a serious, content-driven platform aimed at nurturing future leaders [2]
Which technical directions does autonomous driving focus on today, and how should you get started?
自动驾驶之心· 2025-08-14 23:33
Core Viewpoint
- The article emphasizes the establishment of a comprehensive community for autonomous driving, aiming to bridge communication between enterprises and academic institutions while providing resources and support for individuals interested in the field [1][12]

Group 1: Community and Resources
- The community has organized over 40 technical routes, offering resources for both beginners and advanced researchers in autonomous driving [1][13]
- Members include individuals from renowned universities and leading companies in the autonomous driving sector, fostering a collaborative environment for knowledge sharing [13][21]
- The community provides a complete entry-level technical stack and roadmap for newcomers, plus industry frameworks and project proposals for those already engaged in research [7][9]

Group 2: Learning and Development
- The community offers a variety of learning routes, including perception, simulation, and planning control, to enable quick onboarding for newcomers and further development for experienced members [13][31]
- Numerous open-source projects and datasets are available, covering areas such as 3D object detection, BEV perception, and world models, which are essential for practical applications in autonomous driving [27][29][35]

Group 3: Job Opportunities and Networking
- The community actively shares job postings and career opportunities, helping members connect with potential employers in the autonomous driving industry [11][18]
- Members can discuss career choices and research directions, receiving guidance from experienced professionals in the field [77][80]

Group 4: Technical Discussions and Innovations
- The community hosts discussions on cutting-edge topics such as end-to-end driving, multi-modal models, and the integration of various technologies in autonomous systems [20][39][42]
- Regular live sessions with industry leaders give members insight into the latest advancements and practical applications in autonomous driving [76][80]
A 10,000-character deep dive into the DeepSeek MoE architecture!
自动驾驶之心· 2025-08-14 23:33
Core Viewpoint
- The article provides a comprehensive overview of the Mixture of Experts (MoE) architecture, focusing on the evolution and implementation of DeepSeek's MoE models (V1, V2, V3) and their optimizations for token distribution and load balancing [2][21][36]

Group 1: MoE Architecture Overview
- MoE (Mixture of Experts) is a model architecture that uses multiple expert networks to enhance performance, particularly in sparse settings suited to cloud computing [2][3]
- Interest in the MoE architecture surged with the release of Mistral.AI's Mixtral model, which highlighted the potential of sparse architectures in AI [2][3]
- The Switch Transformer introduced a routing mechanism that lets each token select the top-K experts, optimizing the processing of diverse knowledge [6][10]

Group 2: DeepSeek V1 Innovations
- DeepSeek V1 addresses two main issues in existing MoE practice: knowledge mixing and knowledge redundancy, both of which hinder expert specialization [22][24]
- The model introduces fine-grained expert division and shared experts to enhance specialization and reduce redundancy, allowing more efficient knowledge capture [25][26]
- A load-balancing mechanism ensures an even distribution of tokens across experts, mitigating training inefficiencies [32]

Group 3: DeepSeek V2 Enhancements
- DeepSeek V2 builds on V1's design with three key optimizations focused on load balancing [36]
- The model limits the number of devices used for routed experts to reduce communication overhead, improving efficiency during training and inference [37]
- A new communication load-balancing loss function ensures equitable token distribution across devices, further optimizing performance [38]

Group 4: DeepSeek V3 Developments
- DeepSeek V3 changes the MoE layer's computation, replacing the softmax gating function with a sigmoid to improve computational efficiency [44]
- The model eliminates auxiliary load-balancing losses, instead using a learnable bias term to steer routing, which improves load balance during training [46]
- A sequence-level auxiliary loss is added to prevent extreme imbalance within individual sequences, ensuring a more stable training process [49]
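The V3-style routing summarized above (sigmoid gating, top-K selection, and a learnable bias that influences only which experts are chosen) can be sketched in a few lines of pure Python. This is an illustrative toy under stated assumptions, not DeepSeek's implementation: the function names and the simple renormalization of selected gates are assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def route_token(scores, bias, top_k):
    """Pick top_k experts for one token and return {expert_index: gate_weight}."""
    # Gate values use raw sigmoid affinities (V3 replaces softmax with sigmoid).
    gates = [sigmoid(s) for s in scores]
    # The learnable bias affects only *which* experts are selected, not the
    # mixing weights; nudging it between steps steers load toward
    # under-utilized experts without an auxiliary loss.
    ranked = sorted(range(len(gates)), key=lambda i: gates[i] + bias[i], reverse=True)
    chosen = ranked[:top_k]
    # Normalize the selected gates so the expert mixture sums to 1.
    total = sum(gates[i] for i in chosen)
    return {i: gates[i] / total for i in chosen}
```

With zero bias the two highest-affinity experts win; raising one expert's bias pulls tokens toward it even when its raw affinity is lower, which is the load-balancing lever the summary describes.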
Is GRPO not the optimal solution? EvaDrive: new RL algorithm APO takes human-like end-to-end driving a step further (NUS)
自动驾驶之心· 2025-08-14 23:33
Today 自动驾驶之心 shares EvaDrive, the latest work from teams at the National University of Singapore, Tsinghua, Xiaomi, and others: a new reinforcement learning algorithm, APO, setting a new SOTA in both open-loop and closed-loop evaluation. If you have related work to share, please contact us at the end of the article!

Paper authors | Siwen Jiao et al.
Editor | 自动驾驶之心

There has been a lot of end-to-end work lately! This work argues that:

To address these problems, EvaDrive was created - a new multi-objective reinforcement learning framework that establishes genuine closed-loop co-evolution between trajectory generation and evaluation through adversarial optimization. EvaDrive formulates trajectory planning as a multi-round adversarial game. In this game, a hierarchical generator continually proposes candidate paths, combining autoregressive intent modeling to capture temporal causality with diffusion-based refinement for spatial flexibility. A trainable multi-objective critic then rigorously evaluates these proposals, explicitly preserving diverse preference structures without compressing them ...
PKU's latest ReconDreamer-RL: a reinforcement learning framework built on diffusion-based scene reconstruction, cutting collision rates 5x!
自动驾驶之心· 2025-08-14 11:12
Core Insights
- The article discusses the challenges and advances in end-to-end autonomous driving models, focusing on closed-loop simulation reinforcement learning, which improves robustness and adaptability through interaction with diverse environments [1]

Group 1: Research Background and Core Challenges
- Closed-loop reinforcement learning is gaining attention because it lets models interact with environments, improving robustness and adaptability compared with imitation learning [1]
- Two main challenges are identified: insufficient realism in simulation environments and uneven training-data distribution, which limit model generalization [5][6]

Group 2: Core Framework: ReconDreamer-RL
- The ReconDreamer-RL framework integrates video diffusion priors and scene reconstruction, consisting of three core components that optimize autonomous driving policies in two phases: imitation learning and reinforcement learning [3]

Group 3: Components of ReconDreamer-RL
- **ReconSimulator**: a high-fidelity simulation environment combining appearance modeling and physics modeling to reduce the sim2real gap; it uses 3D Gaussian splatting for scene reconstruction and DriveRestorer for video-artifact correction [4][7]
- **Dynamic Adversary Agent (DAA)**: generates extreme scenarios by controlling surrounding-vehicle trajectories to create complex interactions such as sudden lane changes and hard braking [8]
- **Cousin Trajectory Generator (CTG)**: improves trajectory diversity by generating varied trajectories through extension and interpolation, countering the training data's bias toward simple linear motion [10][12]

Group 4: Experimental Validation: Performance and Advantages
- The framework significantly reduces collision rates, achieving 0.077 versus 0.386 for imitation-learning methods and 0.238 for reinforcement-learning methods - roughly a 5x reduction [16]
- In extreme scenarios the framework's collision rate drops to 0.053, a 404.5% improvement over traditional methods [18]
- Ablation studies confirm each component's contribution: removing ReconSimulator raises the collision rate from 0.077 to 0.238, underscoring the necessity of realistic simulation environments [20][22]

Group 5: Rendering Efficiency
- ReconSimulator renders at 125 FPS, far surpassing methods such as EmerNeRF (0.21 FPS), meeting the real-time interaction requirements of reinforcement learning [21]
The second cohort of the autonomous driving VLA paper-guidance program is here - spots are limited...
自动驾驶之心· 2025-08-14 06:49
Core Insights
- The article discusses advances in the Li Auto VLA driver model, highlighting its improved semantic understanding, reasoning, and trajectory planning - capabilities crucial for autonomous driving [1][3][5]

Group 1: VLA Model Capabilities
- The VLA model demonstrates enhanced semantic understanding through multimodal input, improved reasoning via chains of thought, and trajectory planning that more closely approximates human driving intuition [1]
- Four core abilities are showcased: spatial understanding, reasoning, communication and memory, and behavioral ability [1][3]

Group 2: Research and Development Trends
- The VLA model evolved from VLM+E2E, integrating cutting-edge technologies such as end-to-end learning, trajectory prediction, visual language models, and reinforcement learning [5]
- While industry continues to optimize traditional perception and planning tasks, academia is increasingly shifting toward large models and VLA, leaving many subfields open for exploration [5]

Group 3: VLA Research Guidance Program
- A second session of the VLA research-paper guidance program is launching, aimed at helping participants systematically master key theoretical knowledge and develop their own research ideas [6][31]
- The program comprises 12 weeks of online group research, followed by 2 weeks of paper guidance and a 10-week maintenance period for paper development [14][31]

Group 4: Course Structure and Requirements
- The course admits at most 8 participants, targeting master's and doctoral students working on VLA and autonomous driving, as well as AI professionals seeking to deepen their algorithmic knowledge [12][13]
- Participants are expected to have a foundational understanding of deep learning, basic Python programming skills, and familiarity with PyTorch [19][20]

Group 5: Course Outcomes
- Participants will study classic and cutting-edge papers, coding implementations, and methodologies for selecting research topics, conducting experiments, and writing papers [14][31]
- The program aims to produce a research-paper draft, strengthening participants' academic profiles for further study or employment [14][31]
NIO is hiring large-model / end-to-end algorithm engineers!
自动驾驶之心· 2025-08-14 03:36
Core Viewpoint
- The article emphasizes job opportunities and resources in autonomous driving and embodied intelligence, highlighting a community platform for job seekers in these sectors

Group 1: Job Description and Requirements
- The position involves designing and developing end-to-end algorithms for intelligent assisted driving, including BEV perception, Lidar perception, occupancy networks, and multi-modal large models [1]
- Candidates with experience in deep learning, object detection, and reinforcement learning algorithms are preferred, along with a background in computer science or electronics [2]
- Proficiency with the PyTorch deep learning framework and good communication skills are essential [2]

Group 2: Community and Resources
- The AutoRobo knowledge community has nearly 1,000 members, including professionals from companies across the autonomous driving and robotics sectors [4]
- The community provides interview questions, industry reports, salary-negotiation tips, and internal job postings for various positions [5][6]
- A compilation of 100 interview questions on autonomous driving and embodied intelligence is available to members [9]

Group 3: Industry Reports and Insights
- In-depth industry reports help members understand the current state and future prospects of the autonomous driving and embodied intelligence sectors [15]
- Reports cover topics including development trends and market opportunities in the embodied-intelligence industry [15]

Group 4: Interview Experiences and Tips
- The community shares both successful and unsuccessful interview experiences so members can learn from past mistakes and improve their interview skills [17]
- Insights on salary negotiation and common HR questions are also provided to assist job seekers [19][21]
A handheld 3D scanner! Ultra cost-effective, with real-time online point-cloud reconstruction
自动驾驶之心· 2025-08-13 23:33
Core Viewpoint
- The GeoScan S1 is presented as the most cost-effective 3D laser scanner in China, designed for applications such as campus and indoor scene reconstruction, featuring a lightweight design and user-friendly operation [1][7]

Group 1: Product Features
- The GeoScan S1 delivers centimeter-level precision in real-time 3D scene reconstruction using a multi-modal sensor-fusion algorithm [1]
- It generates point clouds at 200,000 points per second, with a maximum measurement distance of 70 meters and 360° coverage, supporting large scenes of over 200,000 square meters [1][27]
- The device ships with a built-in Ubuntu system and multiple sensors, allowing flexible power supply and integration with other equipment [3][10]

Group 2: User Experience
- The scanner starts scanning at the press of a single button and exports results without complex setup [5]
- It maps large areas efficiently and accurately while maintaining model precision [5][25]
- Advanced multi-sensor SLAM algorithms support real-time modeling and high-quality colorized point-cloud generation [25][32]

Group 3: Market Positioning
- The GeoScan S1 is marketed as having the industry's best price-performance ratio, with the basic version starting at 19,800 yuan [7][56]
- Multiple versions are available, including a depth-camera version and 3DGS online and offline versions, catering to diverse customer needs [56]
- The company emphasizes its background and project validation through collaborations with academic institutions, enhancing market credibility [7]

Group 4: Application Scenarios
- The GeoScan S1 suits a range of environments, including office buildings, parking lots, industrial parks, tunnels, forests, and mines, demonstrating its versatility in 3D mapping [36][45]
- It supports cross-platform integration and is compatible with drones, unmanned vehicles, and robots for automated operations [42]

Group 5: Technical Specifications
- The device measures 14.2cm x 9.5cm x 45cm and weighs 1.3kg without the battery [20]
- It accepts a 13.8V - 24V power input, and its 88.8Wh battery provides roughly 3 to 4 hours of runtime [20]
- Supported export formats include PCD, LAS, and PLY, ensuring compatibility with common point-cloud software [20]
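As a rough back-of-envelope check on the specs above, the quoted 200,000 points/s rate and 3-hour battery runtime imply a sizable raw capture. The per-point size below is an assumption for illustration (the article does not state one); actual on-disk size depends on the export format and compression.

```python
# Estimate raw capture volume for a full-battery scanning session.
points_per_sec = 200_000   # from the GeoScan S1 spec
hours = 3                  # lower bound of the quoted 3-4 h runtime
bytes_per_point = 16       # assumed: 3 x float32 XYZ + packed RGB + padding

total_points = points_per_sec * 3600 * hours
total_gb = total_points * bytes_per_point / 1e9
print(f"{total_points:,} points, ~{total_gb:.1f} GB raw")
```

Under these assumptions a single session can approach tens of gigabytes of raw points, which is why export formats like PCD and LAS (which support efficient storage) matter in practice.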
A few stories from NVIDIA's entry into the autonomous driving field
自动驾驶之心· 2025-08-13 23:33
Core Viewpoint
- The article traces the evolution of the partnership between Tesla and NVIDIA in the autonomous driving sector, highlighting the challenges and innovations that shaped their collaboration

Group 1: Tesla's Journey in Autonomous Driving
- In September 2013, Tesla officially entered the autonomous driving arena, emphasizing internal development over reliance on external technologies [5]
- Lacking a suitable self-developed autonomous driving chip, Tesla initially partnered with Mobileye, layering unique innovations such as Fleet Learning on top of Mobileye's technology [9][12]
- Tensions arose as Tesla pursued its own algorithms, leading Mobileye to demand that Tesla halt its internal vision efforts [12][13]

Group 2: NVIDIA's Strategic Shift
- In 2012, NVIDIA CEO Jensen Huang recognized the potential of autonomous driving in electric vehicles, steering the company toward deep learning and computer vision [15]
- By November 2013, Huang was highlighting the importance of digital computing in modern vehicles, signaling a shift toward automation in the automotive industry [17]
- In January 2015, NVIDIA launched the DRIVE brand and introduced the DRIVE PX platform, delivering significant computational power for autonomous driving applications [18]

Group 3: The Partnership's Development
- After a major accident in May 2016, Mobileye ended its partnership with Tesla, prompting Tesla to choose NVIDIA as its new technology partner [19][20]
- In October 2016, Tesla announced that all of its production models would ship with hardware capable of full self-driving, built on NVIDIA's DRIVE PX 2 platform [20]
- By early 2017, Tesla had publicly announced plans to develop its own chips, signaling a strategic shift, while NVIDIA continued to expand its automotive partnerships [25][26]

Group 4: Technological Advancements
- In 2018, NVIDIA introduced the DRIVE Xavier platform, which improved computational performance while reducing power consumption [28]
- Tesla's HW3, launched in April 2019, was described by Musk as the most advanced computer designed specifically for autonomous driving, marking the end of NVIDIA's direct involvement in Tesla's autonomous driving hardware [30][32]