自动驾驶之心
Understanding the essence of RL learning!
自动驾驶之心· 2025-12-15 00:04
Core Viewpoint - The article discusses the limits of Reinforcement Learning (RL) for improving the reasoning of Large Language Models (LLMs), arguing that RL does not extend a model's inherent capabilities but only improves search efficiency within its existing boundaries [4][5][7].

Group 1: RL Learning Limitations
- A recent paper from Tsinghua's LEAP lab concluded that RL training does not enable LLMs to surpass the reasoning abilities of their base models: RL models search more efficiently but do not solve problems that the base models cannot [4][5].
- Under the pass@K evaluation, RL models outperform base models at K=1, but the gap narrows as K increases, and base models eventually overtake RL models at larger K values [4][7].
- RL models show a polarized accuracy distribution: they do well on problems where accuracy is already high and poorly on problems where it is low, so they excel in some areas while failing in others [8][9].

Group 2: Comparison with Distillation Learning
- Unlike RL, distillation-based supervised fine-tuning (SFT) can expand a model's capabilities, allowing it to learn to solve problems it previously could not address [12].
- The limitations of RL are attributed to a "double-edged sword" effect of pre-training priors, which restrict exploration and reinforce existing solutions rather than discovering new paths [14][15].
- The article suggests that training methods which balance exploration and exploitation could improve model performance without narrowing the exploration range [15].

Group 3: Parameter Update Characteristics
- A paper from Meta explains that RL training produces localized parameter updates, which leads to a consistent optimization bias that limits exploration [18][21].
- The "three gates" theory describes how RL constrains updates, preventing significant deviations from the model's original distribution and avoiding high-curvature directions in parameter space [21][22][23].
- The sparsity observed in RL updates arises because low-precision parameter representations filter out minor updates, not because updates are actually absent [23].

Group 4: Catastrophic Forgetting and Trade-offs
- The article highlights catastrophic forgetting in SFT training, which RL training can mitigate, leading to a trade-off between learning new skills and avoiding forgetting [30][31].
- A comparison table illustrates that RL cannot learn new capabilities but can avoid catastrophic forgetting, suggesting a potential conflict between these two objectives [34].
- Recent research proposes a hybrid approach called On-policy Distillation, which combines elements of RL and SFT and may allow both new skill acquisition and the prevention of forgetting [36].
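The pass@K crossover described in Group 1 can be illustrated with a small sketch. It uses the widely used unbiased combinatorial estimator of pass@K; the summary does not say which estimator the paper used, so that choice is an assumption, and the per-problem counts below are hypothetical numbers picked only to mimic a "polarized" RL model versus a base model with low but non-zero accuracy everywhere.

```python
from math import comb
from statistics import mean


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k wrong samples exist, so every k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts, n: int, k: int) -> float:
    """Average pass@k over a set of problems."""
    return mean(pass_at_k(n, c, k) for c in correct_counts)


# Hypothetical counts of correct answers out of N_SAMPLES per problem (illustrative only).
N_SAMPLES = 100
BASE_COUNTS = [20, 20, 5, 2]  # base model: low but non-zero accuracy on every problem
RL_COUNTS = [90, 80, 0, 0]    # RL model: polarized, very high or zero

for k in (1, 8, 64):
    base = mean_pass_at_k(BASE_COUNTS, N_SAMPLES, k)
    rl = mean_pass_at_k(RL_COUNTS, N_SAMPLES, k)
    print(f"K={k:>2}  base={base:.3f}  rl={rl:.3f}")
```

With these made-up counts the RL model leads at K=1, while the base model leads at K=64 because it still occasionally solves the two problems on which the RL model never succeeds.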
Mid-tier intelligent-driving vendors are rapidly snapping up end-to-end talent......
自动驾驶之心· 2025-12-15 00:04
Core Viewpoint - The article discusses technological anxiety in intelligent driving, particularly among mid-tier manufacturers, and highlights the expected growth in demand for end-to-end (E2E) and VLA (Vision-Language-Action) technologies in the coming year [2].

Group 1: Industry Trends
- Mass production of cutting-edge technologies such as end-to-end systems is expected to begin next year, with L2 technologies becoming more standardized and moving into lower-tier markets [2].
- Total sales of passenger vehicles priced above 200,000 yuan are around 7 million, but the leading new forces account for less than one-third of this, indicating slow adoption of mass-produced end-to-end models [2].
- The maturing of end-to-end technology is seen as a precursor to larger-scale production, and the advance of L3 regulations makes technological upgrades urgent for mid-tier manufacturers [2].

Group 2: Recruitment and Training
- Demand is growing for positions related to end-to-end and VLA technologies, and many professionals are looking to learn these skills quickly [3].
- The article announces specialized courses on the practical application of end-to-end and VLA technologies, designed for people already working in the field [3][6].
- The courses cover modules including the use of navigation information, reinforcement learning optimization, and production experience with diffusion and autoregressive models [3][6].

Group 3: Course Details
- The end-to-end production course focuses on practical implementation, detailing key modules and offering seven hands-on exercises for those looking to advance their careers [3][6].
- The VLA course covers foundational algorithms and theory, including BEV perception and large language models, with practical work based on diffusion models and VLA algorithms [6][11].
- The instructors are experienced professionals from top-tier companies and academic institutions [5][8][13].
New from Fudan & SJTU! SpatialRetrievalAD, a 40-page paper on a spatial retrieval paradigm for autonomous driving
自动驾驶之心· 2025-12-15 00:04
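Existing autonomous driving systems rely heavily on onboard sensors for accurate, real-time perception of the environment. This paradigm is limited by the perception range available while driving, and it often fails under restricted visibility, occlusion, or extreme conditions such as darkness and rain. Human drivers, by contrast, can still recall the road structure even when visibility is poor. To give models this "recall" ability, Fudan's Trustworthy Embodied Intelligence team, Shanghai Jiao Tong University, and collaborators introduce offline-retrieved geographic images as an additional system input. These images can be obtained easily from offline caches (such as Google Maps or stored autonomous driving datasets), require no additional sensors, and act as a plug-and-play extension for existing autonomous driving tasks.

In the experiments, the authors first retrieve geographic images through the Google Maps API to extend the nuScenes dataset, and align the new data with the ego-vehicle trajectory. They then establish benchmarks on five core autonomous driving tasks: object detection, online mapping, occupancy prediction, end-to-end planning, and generative world models. Online mapping mAP improves by 13.4%, static-class mIoU for occupancy prediction rises by 2.57%, and the nighttime planning collision rate drops from 0.55% to 0.48%, offering a low-cost, highly robust perception enhancement for autonomous driving in complex scenes. Extensive experiments show that this extended modality ...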
A look at this year's mass-production status and order values across embodied-AI companies......
自动驾驶之心· 2025-12-14 02:03
Core Insights - The article reviews the current state and future prospects of humanoid robot mass production, highlighting significant orders and developments from companies across the industry [3][8].

Group 1: Company Developments
- Hyundai Motor has committed to deploying thousands of Atlas robots in its manufacturing and logistics operations, aiming to enhance production capabilities in collaboration with Boston Dynamics [4][6].
- Unitree (Yushu Technology) expects annual revenue to exceed 1.2 billion yuan, although specific order volumes for the year have not been disclosed [9].
- AgiBot (ZhiYuan Robotics) announced a cumulative production of 5,000 robots, with applications across the entertainment, manufacturing, and logistics sectors [10].
- UBTECH Robotics secured a significant order worth 264 million yuan for its Walker S2 robot, which is designed for inspection and maintenance tasks [12].
- By November, UBTECH's Walker series had accumulated orders totaling 1.3 billion yuan, with a monthly production capacity of 300 units and projected deliveries of over 500 units in 2025 [14].

Group 2: Strategic Partnerships and Collaborations
- Shenzhen Huizhi and ZhiPing Technology announced a strategic partnership to deploy over 1,000 humanoid robots in logistics and manufacturing processes over the next three years [17].
- Stardust Intelligence formed a strategic cooperation with Shanghai Xiangong Intelligent Technology centered on a thousand-unit order for humanoid robots, drawing on the two companies' complementary strengths in core components and applications [20][22].
- Original Force Unlimited signed a strategic cooperation agreement worth 260 million yuan with a cultural tourism group [25].

Group 3: Market Trends and Projections
- Tesla positions its Optimus robot as a core future asset, with a target of producing 5,000 units by the end of December 2025 and scaling up to 100,000 units by the end of 2026 [16].
- The article points to a growing humanoid robot market, with multiple companies reporting significant order volumes and expanding production capacity [8][28].
自动驾驶之心 is recruiting business partners!
自动驾驶之心· 2025-12-14 02:03
Core Viewpoint - The article emphasizes the need for collaboration and innovation in the autonomous driving industry, and the importance of bringing in more talented people to address the sector's challenges and pain points [2].

Group 1: Industry Direction
- The main focus areas in autonomous driving include, but are not limited to: product management, 4D annotation/data loop, world models, VLA, large models for autonomous driving, reinforcement learning, and end-to-end solutions [4].

Group 2: Job Description
- The positions mainly involve training collaborations in autonomous driving, covering course development and original content creation for both B-end (enterprises, universities, research institutes) and C-end (students, job seekers) audiences [5].

Group 3: Contact Information
- To discuss compensation and collaboration arrangements, interested parties are encouraged to add the WeChat contact provided [6].
Autonomous driving companies still alive in 2025......
自动驾驶之心· 2025-12-14 02:03
Group 1: Industry Overview
- The penetration rate of L2 autonomous driving is rising rapidly, L3 is on the verge of deployment, and L4 is breaking through at scale [2].
- The autonomous driving industry is going through a new round of reshuffling and resource integration: some companies are exiting the market, others are merging or being acquired, and new players are emerging [2].

Group 2: New Forces in Autonomous Driving
- Key new players include NIO, Xpeng, Li Auto, Xiaomi, Leap Motor, Didi, WM Motor, Niu Chuang, Zeekr, Avita, Lantu, Qianli Technology, and Jiyue [4].

Group 3: Tier 1 Suppliers
- Major Tier 1 suppliers include Huawei, Baidu, DJI, ZTE, Tencent (smart cockpit/high-precision maps/simulation toolchain), SAIC Lingxu, Jianzhi Robotics, Momenta, Bosch China, Magna, and Youjia Innovation Minieye [6].

Group 4: Robotaxi Companies
- Companies in the Robotaxi segment include Baidu, Pony.ai, Shanghai Zhaofu Intelligent Technology (Hello Robotaxi), WeRide, Didi, Momenta, Qizhou Zhihang, and Yushi Technology [8].

Group 5: Robotruck Companies
- Key players in the Robotruck sector are Carl Power, Zhijia Technology, Winche Technology, Pony.ai, Mainline Technology, Sien Intelligent Driving, Xijing Technology, Feibu Technology, MuYue Technology (WeRide), Zitu Technology, Changxing Intelligent, Huanyu Zhixing, Xidi Intelligent Driving, Qianhua, Xingxing, Youdao Zhitu, Karui Zhixing, Qianchen, Weidu, Geely Remote, Hengrun, Hongjing, Xidi, and Qingtian Zhika [10].

Group 6: Other Autonomous Driving Applications
- Companies working on other applications of autonomous driving include Meituan, Jiushi Intelligent, JD.com, Suning, Alibaba Cainiao, China Post, Baidu Apollo, VIA Technologies, Baixiniu, Zhixingzhe, Yushi Technology, Xingshen Intelligent, Jiazhi Technology, and Xiaoshi Technology [12].
- Traditional automakers in the industry include SAIC, Changan, GAC (Aion), BAIC (ARCFOX), FAW, Great Wall, BYD, Geely (Furuitai), Dongfeng, Chery, and Geely (Zeekr) [14].
- Companies focusing on agricultural autonomous driving include Fengjiang Intelligent, Zoomlion, China Yituo, Wuniu Intelligent, Zhongke Yuandong, Leiwo Heavy Industry, Chaoxing Intelligent, Bochuang Liandong, and Haoxing Technology [16].
- Companies in mining autonomous driving include Yikong Zhijia, Taga Zhixing, Huituo Intelligent, Lukai Zhixing, Bolai Technology, Mengshi Technology, and Qingzhi Technology [18].
- Companies in sanitation autonomous driving include Zhixingzhe, Kuwa, Xiantou, Gaoxian Robotics, Shenlan Technology, Haorui Intelligent, Yuwan Zhijia, and Yunchuang Zhixing [20].
- Companies working on parking solutions include Baidu, Zhuishi, Desai Xiwai, Dongsoft Ruichi, Hedu Technology, Niuli Technology, Hengrun Technology, Lingshi Technology, Moshih Intelligent, Oteming, Zhixingzhe, and Yushi Technology [22].

Group 7: High-Precision Mapping
- Major players in high-precision mapping include Baidu, Amap, NavInfo, Tencent, Huawei, Didi, JD.com, Meituan, Kuandeng, Shendong, Zhonghaiting, and Yikaton [24].

Group 8: Vehicle-to-Everything (V2X) Collaboration
- Companies working on V2X collaboration include Mogo Auto (Mushroom Car Union), Juefei Technology, Baidu, Huawei, Datang High-Tech, Huali Zhixing, Alibaba, Hikvision, Xingyun Interconnect, and Yunjing Zhixing [24].
Li Auto's next focus: from the data closed loop to the training closed loop
自动驾驶之心· 2025-12-14 02:03
Core Insights - The article discusses the evolution of autonomous driving technology, highlighting the transition from data closed-loop systems to training closed-loop systems as a new phase in autonomous driving development [18][21].

Group 1: Development of Autonomous Driving Technology
- Li Auto's intelligent driving has evolved from rule-based systems to the AI-driven E2E+VLM dual system and then to VLA, with navigation as a key module [6].
- Li Auto has accumulated 1.5 billion kilometers of driving data and uses over 200 triggers to produce clips of 15 to 45 seconds [11].
- The MPI (mileage per intervention) of the mass-production end-to-end version has risen to over 220, a 19-fold increase over the July 2024 version [13].

Group 2: Data Closed-Loop and Its Limitations
- The data closed-loop process consists of shadow-mode validation, data mining in the cloud, automatic labeling of effective samples, and model training, with data return achievable within one minute [9][10].
- Effective as it is, the data closed loop cannot address every issue, particularly long-tail scenarios such as traffic control and sudden lane changes [16].

Group 3: Transition to Training Closed-Loop
- The core of the L4 training loop consists of VLA, reinforcement learning (RL), and world models (WM), with trajectories optimized through diffusion and reinforcement learning [23].
- Key technologies for closed-loop autonomous driving training include regional simulation, synthetic data, and reinforcement learning [24].

Group 4: Advances in Reconstruction and Generation
- Li Auto has made significant advances in reconstruction and generation, with multiple papers published at top conferences in the past two years [28][34].
- The company has developed a feed-forward 3D generation system that needs no point cloud initialization and produces results directly from visual inputs [29].

Group 5: Challenges and System Capabilities
- The interactive agent is identified as a key challenge in the training closed loop [40].
- System capabilities are strengthened by the world model, which provides simulation environments and diverse scene construction, and by accurate feedback from reward models [41].
Feed-forward GS work has exploded recently, so we put together a learning roadmap......
自动驾驶之心· 2025-12-13 02:04
Core Insights - The article highlights advances in 3D Gaussian Splatting (3DGS), particularly its application in autonomous driving, and emphasizes the need for structured learning pathways in this rapidly evolving field [2][4].

Group 1: 3DGS Technology and Developments
- Tesla's introduction of 3D Gaussian Splatting at ICCV has drawn significant attention, signaling an industry shift towards feed-forward GS algorithms [2].
- 3DGS technology is iterating rapidly across static reconstruction (3DGS), dynamic reconstruction (4DGS), and surface reconstruction (2DGS), which creates a need for effective learning resources [4].

Group 2: Course Offering
- A comprehensive course titled "3DGS Theory and Algorithm Practical Tutorial" has been developed to give newcomers a structured learning roadmap covering essential theory and practical applications [4].
- The course is designed to help participants understand point cloud processing, deep learning, real-time rendering, and coding practices, with a focus on hands-on experience [4].

Group 3: Course Structure
- The course consists of six chapters, starting with foundational knowledge in computer graphics and progressing to advanced topics such as feed-forward 3DGS and its applications in autonomous driving [8][9][10][11][12].
- Each chapter includes practical assignments and discussions to reinforce understanding and application of the concepts [8][9][10][11][12].

Group 4: Target Audience and Prerequisites
- The course is aimed at people with a background in computer graphics, visual reconstruction, and programming, particularly those interested in careers in the 3DGS field [17].
- Participants are expected to have a foundational understanding of probability and linear algebra, and programming experience with Python and PyTorch [17].
Possibly the first new force to sell a million vehicles a year!?
自动驾驶之心· 2025-12-13 02:04
Core Viewpoint - Leap Motor has achieved significant sales growth and profitability, marking a turning point in its business trajectory, and has set ambitious targets for future sales expansion [4][16][30].

Sales Performance
- In Q3, Leap Motor sold 174,000 vehicles, up 101.77% year-on-year and 29.63% quarter-on-quarter, with monthly sales successively surpassing 50,000, 60,000, and 70,000 units [3][8].
- As of November 15, Leap Motor's cumulative sales for the year exceeded 500,000 units, reaching this milestone ahead of schedule [5].

Profitability and Financial Health
- Leap Motor reported a net profit of 150 million yuan in Q3, following a profitable first half of the year, and maintained positive operating and free cash flow with cash reserves of 33.92 billion yuan [4][11].
- Revenue reached 19.45 billion yuan in Q3, up 97.3% year-on-year and above market expectations [11].

Product Strategy and Market Positioning
- Leap Motor has built a comprehensive product matrix of four series (A, B, C, D) covering various vehicle types, and has penetrated the market with competitive pricing and configurations [9][22].
- The B series, which targets younger consumers, has contributed significantly to sales; its first model, the B01, reached monthly sales of over 10,000 units shortly after launch [11].

Cost Control and Margin Improvement
- Even while expanding its product lineup, Leap Motor has improved its financial performance: the average selling price per vehicle rose from 106,000 yuan to 112,000 yuan, and gross margin rose from 13.6% to 14.5% [12].
- Gross profit for Q3 reached 2.82 billion yuan, a year-on-year increase of 45% and a quarter-on-quarter increase of 248% [12].

Future Outlook and Growth Targets
- Leap Motor aims to reach annual sales of 1 million vehicles, building on its current momentum and market position [5][20].
- The company plans to launch new models in the A and D series, further expanding its product offerings and targeting a broader customer base [23][26].

User Engagement and Brand Strategy
- Leap Motor is focusing on understanding user needs and adding emotional value to its products, aiming to resonate more with consumers through design and marketing [29].
- The company has accumulated 1 million users, which allows it to refine its product definitions and better meet customer demands [27].
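The Q3 figures quoted above can be cross-checked against each other with simple arithmetic. The sketch below assumes, purely for illustration, that all Q3 revenue comes from vehicle sales; the summary does not break revenue down, so the per-vehicle figure is only a rough consistency check, not a reported number.

```python
# Figures quoted in the summary above (Q3).
REVENUE_YUAN = 19.45e9       # revenue
GROSS_PROFIT_YUAN = 2.82e9   # gross profit
VEHICLES_SOLD = 174_000      # vehicles sold

# Assumption (not stated in the source): all revenue is attributed to vehicle sales.
implied_revenue_per_vehicle = REVENUE_YUAN / VEHICLES_SOLD
gross_margin = GROSS_PROFIT_YUAN / REVENUE_YUAN

print(f"implied revenue per vehicle: {implied_revenue_per_vehicle:,.0f} yuan")
print(f"gross margin: {gross_margin:.1%}")
```

Under that assumption the implied revenue per vehicle is roughly 112,000 yuan and the gross margin is about 14.5%, which lines up with the average selling price and margin quoted under Cost Control and Margin Improvement.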
NTU & Harvard propose OpenREAD: end-to-end RL unifying cognition and trajectory planning
自动驾驶之心· 2025-12-13 02:04
Core Viewpoint - The article introduces OpenREAD, a framework developed by Nanyang Technological University and Harvard University that uses reinforcement learning (RL) to strengthen the reasoning of vision-language models (VLMs) for autonomous driving [4][28].

Group 1: Methodology
- OpenREAD uses Qwen3-LLM as an "evaluation expert," extending RL from traditional verifiable downstream tasks to open-ended tasks such as "driving suggestions" and "scene analysis," and achieving end-to-end reinforcement fine-tuning from high-level semantic reasoning down to low-level trajectory planning [6][28].
- The framework addresses the difficulty of designing reward functions for open-ended driving knowledge: the same reference answer can be expressed in many ways, which complicates the RL process [7].
- Two preparatory steps were taken: (1) constructing knowledge data with explicit chains of thought (CoT), using GPT-4 to annotate driving knowledge data covering perception and decision-making tasks [8]; (2) converting the OmniDrive dataset into a "thinking + answering" format suitable for RL training [9].

Group 2: Experimental Results
- OpenREAD was evaluated on the LingoQA and nuScenes datasets, outperforming traditional supervised fine-tuning (SFT) methods on trajectory error, collision rate, and knowledge evaluation metrics [19][20].
- The results indicate that introducing driving knowledge makes RL fine-tuning markedly more effective, as shown by improvements in trajectory error and collision rate [19][20].
- Compared with existing methods, OpenREAD showed better collision control, resulting in safer driving outcomes [20].

Group 3: Conclusion
- OpenREAD implements joint reinforcement fine-tuning for driving knowledge and trajectory planning, expanding the range of RL applications in end-to-end autonomous driving [28].
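The reward-design problem described in Group 1 can be sketched as follows. This is a hypothetical illustration, not OpenREAD's actual implementation: the `<think>`/`<answer>` tag names, the function names, the exponential trajectory reward, and the `toy_overlap_judge` (a word-overlap stand-in for an LLM judge such as the Qwen3 "evaluation expert") are all assumptions made for the example.

```python
import math
import re
from typing import Callable, Optional, Sequence, Tuple

# Assumed "thinking + answering" layout; the tag names are hypothetical.
_RESPONSE = re.compile(r"<think>(.*?)</think>\s*<answer>(.*?)</answer>", re.DOTALL)


def parse_response(text: str) -> Optional[Tuple[str, str]]:
    """Split a response into (thinking, answer); None if the format is violated."""
    match = _RESPONSE.fullmatch(text.strip())
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def knowledge_reward(response: str, reference: str,
                     judge: Callable[[str, str], float]) -> float:
    """Reward for open-ended answers: a judge scores agreement with the reference."""
    parsed = parse_response(response)
    if parsed is None:
        return 0.0  # malformed output earns no reward
    _, answer = parsed
    return min(1.0, max(0.0, judge(answer, reference)))  # clamp to [0, 1]


def trajectory_reward(pred: Sequence[Tuple[float, float]],
                      target: Sequence[Tuple[float, float]],
                      scale: float = 1.0) -> float:
    """Map the mean L2 waypoint error to (0, 1]; zero error gives exactly 1.0."""
    if not pred or len(pred) != len(target):
        raise ValueError("trajectories must be non-empty and of equal length")
    mean_error = sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)
    return math.exp(-mean_error / scale)


def toy_overlap_judge(answer: str, reference: str) -> float:
    """Stand-in for an LLM judge: Jaccard overlap of lower-cased words."""
    a, r = set(answer.lower().split()), set(reference.lower().split())
    return len(a & r) / len(a | r) if (a | r) else 0.0


response = "<think>The light ahead is red.</think><answer>Stop at the line</answer>"
print(knowledge_reward(response, "stop at the line", toy_overlap_judge))
print(trajectory_reward([(0.0, 0.0), (1.0, 0.1)], [(0.0, 0.0), (1.0, 0.0)]))
```

Passing the judge in as a callable keeps the reward code independent of whichever model does the scoring; a word-overlap judge would penalize valid paraphrases, which is the weakness that motivates using an LLM evaluator for open-ended answers.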