Autonomous Driving
NVIDIA unveils a reasoning VLA: Alpamayo-R1 teaches autonomous-driving AI to think things through
机器之心· 2025-12-02 00:17
Group 1
- The core challenge in autonomous driving is not just perception but understanding the reasoning behind actions taken by the model [1]
- Traditional end-to-end systems struggle with rare but critical scenarios, leading to potential accidents [1][2]
- NVIDIA's Alpamayo-R1 introduces a reasoning capability that allows vehicles to infer causal relationships before making decisions [1][6]

Group 2
- Alpamayo-R1 features a new dataset called Chain of Causation (CoC), which includes not only actions taken but also the reasons for those actions [2][3]
- The model employs a diffusion-based trajectory decoder to generate feasible driving trajectories under real-time constraints [5]
- A multi-stage training strategy is utilized, starting with basic mapping from vision to action, followed by supervised fine-tuning on CoC data, and concluding with reinforcement learning for optimization [6][15]

Group 3
- The performance of Alpamayo-R1 shows significant improvements, particularly in long-tail scenarios where traditional models often fail [6][20]
- The model's input consists of multi-camera and temporal observations, allowing for integrated multi-modal semantic understanding [8]
- The CoC dataset employs a human-machine collaborative annotation mechanism, resulting in improved planning accuracy and reduced error rates [10][11]

Group 4
- The training process of Alpamayo-R1 is divided into three phases: supervised fine-tuning, CoC supervision, and reinforcement learning-based post-training optimization [15][17]
- The model incorporates a multi-dimensional reward mechanism to enhance reasoning accuracy and action consistency [17]
- The design of AR1 represents a shift from "black box" to "white box" in autonomous driving, enabling the model to explain its decisions [19][20]

Group 5
- The significance of Alpamayo-R1 lies not only in performance enhancement but also in establishing a closed loop between AI reasoning and physical actions [20][21]
- The model aims to ensure safety and build trust in autonomous driving by providing explanations for its decisions [21]
Feed-forward 3DGS is attracting growing attention across the industry...
自动驾驶之心· 2025-12-02 00:03
Core Insights
- The article discusses the rapid advancements in 3D Gaussian Splatting (3DGS) technology, highlighting its significance in the field of autonomous driving and the growing interest in this area among professionals [2][4].

Group 1: Course Overview
- A new course titled "3DGS Theory and Algorithm Practical Tutorial" has been developed to provide a structured learning path for individuals interested in 3DGS technology, covering both theoretical and practical aspects [4].
- The course is designed to help participants understand point cloud processing, deep learning theories, real-time rendering, and coding practices [4].

Group 2: Course Structure
- The course consists of six chapters, starting with foundational knowledge in computer graphics and progressing to advanced topics such as dynamic reconstruction and surface reconstruction [8][9].
- Each chapter includes practical assignments and discussions on relevant algorithms and frameworks, such as the use of NVIDIA's open-source 3DGRUT framework [9][10].

Group 3: Target Audience and Requirements
- The course is aimed at individuals with a background in computer graphics, visual reconstruction, and programming, specifically those familiar with Python and PyTorch [17].
- Participants are expected to have a GPU with compute at or above the RTX 4090 level and a basic understanding of probability and linear algebra [17].

Group 4: Learning Outcomes
- By the end of the course, participants will have a comprehensive understanding of the 3DGS technology stack, including algorithm development frameworks and the ability to train open-source models [17].
- The course also facilitates networking opportunities with peers from academia and industry, enhancing career prospects in internships and job placements [17].
Surpassing ORION! CoT4AD: an explicit chain-of-thought reasoning VLA model (latest from Peking University)
自动驾驶之心· 2025-12-02 00:03
Core Insights
- The article introduces CoT4AD, a new Vision-Language-Action (VLA) framework designed to enhance logical and causal reasoning capabilities in autonomous driving scenarios, addressing limitations in existing VLA models [1][3][10].

Background Review
- Autonomous driving is a key research area in AI and robotics, promising improvements in traffic safety and efficiency, and playing a crucial role in smart city and intelligent transportation system development [2].
- Traditional modular architectures in autonomous driving face challenges such as error accumulation and limited generalization, leading to the emergence of end-to-end paradigms that utilize unified learning frameworks [2][3].

CoT4AD Framework
- CoT4AD integrates chain-of-thought reasoning into end-to-end autonomous driving, allowing for explicit or implicit reasoning through a series of downstream tasks tailored for driving scenarios [3][10].
- The framework combines perception, language reasoning, future prediction, and trajectory planning, enabling the generation of explicit reasoning steps [6][10].

Experimental Results
- CoT4AD was evaluated on the nuScenes and Bench2Drive datasets, achieving state-of-the-art performance in both open-loop and closed-loop assessments, outperforming existing LLM-based and end-to-end methods [10][19].
- In the nuScenes dataset, CoT4AD achieved L2 distance errors of 0.12m, 0.24m, and 0.53m at 1s, 2s, and 3s respectively, with an average collision rate of 0.10% [17][18].

Contributions of CoT4AD
- The model's design allows for robust multi-task processing and future trajectory prediction, leveraging a diffusion model integrated with chain-of-thought reasoning [10][12].
- CoT4AD demonstrates superior performance in complex driving scenarios, enhancing decision-making consistency and reliability across diverse environments [19][23].

Ablation Studies
- The effectiveness of various components, such as perception tokenizers and the chain-of-thought design, was validated through ablation studies, showing significant performance improvements when these elements were included [26][28].
- The model's ability to predict future scenarios was found to be crucial, with optimal performance achieved when predicting four future scenarios [29].

Conclusion
- CoT4AD represents a significant advancement in autonomous driving technology, demonstrating enhanced reasoning capabilities and superior performance compared to existing methods, while also highlighting areas for future research to improve computational efficiency [30][32].
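
For readers unfamiliar with the open-loop metric quoted above, the following is a minimal sketch of how an L2 displacement error is typically computed and how the three reported horizons average out. The helper names `l2_error` and `average` are hypothetical, and the simple mean over horizons is an assumption: some papers instead average the error over all waypoints up to each horizon, and the summary does not say which protocol CoT4AD follows.

```python
import math

def l2_error(pred, gt):
    """Euclidean distance in meters between a predicted and a ground-truth (x, y) waypoint."""
    return math.hypot(pred[0] - gt[0], pred[1] - gt[1])

def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Per-horizon L2 errors as quoted in the summary (meters at 1 s, 2 s, 3 s).
reported_l2 = {1: 0.12, 2: 0.24, 3: 0.53}

# Assumed aggregation: plain mean over the three horizons.
avg_l2 = average(list(reported_l2.values()))
print(f"average L2 over 1-3 s: {avg_l2:.2f} m")  # prints "average L2 over 1-3 s: 0.30 m"

# Illustrative (made-up) waypoints: a 3 m lateral and 4 m longitudinal miss is a 5 m L2 error.
print(l2_error((3.0, 4.0), (0.0, 0.0)))  # prints 5.0
```

Under this assumed averaging, the quoted figures work out to roughly 0.30 m; the growth from 0.12 m at 1 s to 0.53 m at 3 s reflects how prediction error accumulates with horizon length.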
Why isn't Tesla choosing VLA right now?
自动驾驶之心· 2025-12-02 00:03
Core Insights
- The article discusses Tesla's latest Full Self-Driving (FSD) technology, questioning whether its architecture is outdated compared to the emerging VLA (Vision-Language-Action) framework used in robotics [3][4].

Comparison of Robotics and Autonomous Driving
- **Task Objectives**: Robotics can execute any human command, while autonomous driving focuses on navigation from point A to B, relying on map data for precision [4].
- **Operating Environment**: Autonomous driving operates on defined roads with fewer complex tasks, making it less reliant on language processing compared to robotics [4].
- **Hardware Limitations**: Current hardware lacks sufficient processing power (under 1000 TOPS), making it challenging to implement large language models for driving tasks, which could compromise safety [5].

Tesla's Approach
- Tesla employs a hybrid logic of fast and slow thinking, primarily using an end-to-end approach for most scenarios, while only utilizing VLM in specific situations like traffic regulations or unstructured road conditions [5].
Third run at the HKEX: the "godfather of DJI" backs a 9-billion mining-truck leader
Sou Hu Cai Jing· 2025-12-01 14:01
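Produced by 创业最前线 | Wei Shuai

In a mining area swept by the bitter winds of northwest China, a fleet of driverless mining trucks and conventional heavy trucks works side by side amid swirling yellow sand. It is currently the world's largest mixed fleet of driverless and manned vehicles in operation.

Some 2,000 kilometers away at the Hong Kong Stock Exchange (HKEX), 希迪智驾 (CiDi), the maker of these driverless "big rigs," is making its third and most critical bid to go public.

In November 2024, May 2025, and November 2025, the autonomous-driving company founded by Li Zexiang, the "godfather of DJI," filed listing applications with the HKEX three times.

In sharp contrast to the spectacular operations at the northwestern mines is its alarming financial performance: cumulative losses over the three years from 2022 to 2024 exceeded 1.1 billion yuan.

Eight years after its founding, the 希迪智驾 team has indeed become an industry leader: it built China's first fully driverless all-electric mining-truck fleet and achieved the world's largest mixed-fleet operation.

The scene that opens this article is not a special case but an operation 希迪智驾 has already carried out in practice.

But hard tech is not a business model; any technology has to serve real scenarios and commercialization. As of June 30, 2025, the company held 186 million yuan in cash and cash equivalents, while its loss over the most recent half year alone reached 455 million yuan. At that burn rate, its cash reserves would last only a few months.

For a long time, the company has followed Li Zexiang's technical logic and system-building approach, taken the lead in technical breakthroughs, and formed an "autonomous driving + vehicle-road co ...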
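
The "only a few months" runway claim can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the reported half-year loss (455 million yuan) approximates cash burn, which overstates or understates the true figure to the extent that the loss includes non-cash items or excludes financing inflows. The function name `runway_months` is hypothetical.

```python
def runway_months(cash, period_loss, period_months):
    """Months of cash left if the loss over the period is treated as cash burn (an assumption)."""
    monthly_burn = period_loss / period_months
    return cash / monthly_burn

# Figures from the article, in millions of yuan.
cash_on_hand = 186     # cash and cash equivalents as of June 30, 2025
half_year_loss = 455   # loss over the most recent six months

months = runway_months(cash_on_hand, half_year_loss, 6)
print(f"estimated runway: {months:.1f} months")  # prints "estimated runway: 2.5 months"
```

Under this simplification the cash on hand covers roughly two and a half months of losses, which is consistent with the article's reading.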
Shanghai's Minhang District to open 330 km of autonomous-driving test roads
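人民财讯, December 1: According to 上海发布, the Minhang District press office said that, following the principles of "safe and controllable, steady progress, and pilot demonstration," the district is systematically opening up autonomous-driving test roads. It will soon open 124 road sections totaling about 330 kilometers to the public, as it works to build a multi-tier, multi-function hub for intelligent mobility. ...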
Didi Autonomous Driving begins trial operation of a round-the-clock, fully driverless Robotaxi service in Guangzhou
Core Viewpoint
- Didi Autonomous Driving has launched a 24/7 unmanned Robotaxi trial in designated areas of Guangzhou, marking a significant step in the deployment of autonomous vehicle technology in urban settings [1]

Group 1: Company Developments
- The trial is taking place in the Huangpu core living area, which includes high-frequency travel locations such as subway stations, schools, shopping centers, office buildings, and residential communities [1]
- The service operates seven days a week, providing continuous availability for users [1]

Group 2: Industry Implications
- This initiative reflects the growing trend of integrating autonomous vehicles into urban transportation systems, potentially transforming mobility solutions in major cities [1]
- The focus on high-frequency travel areas indicates a strategic approach to maximize user engagement and operational efficiency [1]
A leader in L4 autonomous driving and internationalization: WeRide receives initial coverage and a "Buy" rating from Bank of America
Ge Long Hui· 2025-12-01 10:48
Group 1
- The core viewpoint of the report is that Bank of America initiates coverage on WeRide with a "Buy" rating, setting a target price of $12 for US shares and HK$31 for Hong Kong shares, indicating potential upside of 45.6% and 50.0% respectively based on current prices [1][2]
- WeRide has established a significant first-mover advantage and a solid partnership network in the overseas Robotaxi business, which is expected to enhance its profitability in the domestic market through improved economies of scale [1]
- The company is expanding its fleet under the WeRide One universal technology platform, which includes Robobus, Robovan, and Robosweeper, supporting future business growth trends [1]

Group 2
- The L4 and above autonomous driving industry market is projected to continue growing, with WeRide's revenue expected to reach approximately RMB 14.6 billion by 2030, driven by high growth trends [2]
- WeRide's Robotaxi fleet is anticipated to expand globally, reaching around 61,000 units by 2030, benefiting from the high profitability of its overseas operations [2]
- The company has initiated a partnership with Uber to launch a fully autonomous Robotaxi service in Abu Dhabi by November 2025, receiving the first city-level permit for such operations outside the US [2]
- WeRide has also been granted permission to operate fully autonomous services on public roads in Furttal, Switzerland, marking the first such license issued in Switzerland, with plans to open to the public in the first half of 2026 [2]
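
The summary gives target prices and upside percentages but not the reference prices they are measured against. As a rough sketch, the reference price implied by each pair can be backed out by inverting the upside formula, upside = target / current - 1. The function name `implied_current_price` is hypothetical, and the results are derived values, not prices quoted in the report.

```python
def implied_current_price(target_price, upside_pct):
    """Reference price implied by a target price and a stated percentage upside."""
    return target_price / (1 + upside_pct / 100)

# Target prices and upside figures as quoted in the summary.
us_price = implied_current_price(12.0, 45.6)   # US shares, in USD
hk_price = implied_current_price(31.0, 50.0)   # Hong Kong shares, in HKD

print(f"implied US reference price: ${us_price:.2f}")    # prints "implied US reference price: $8.24"
print(f"implied HK reference price: HK${hk_price:.2f}")  # prints "implied HK reference price: HK$20.67"
```

These implied prices reflect only the arithmetic of the quoted figures; the report's actual reference date and closing prices are not stated in the summary.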
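US stock movers | Pony.ai rises more than 3.5% at one point in pre-market trading; Bank of America raises its target price and revenue forecasts after earnings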
Ge Long Hui· 2025-12-01 09:48
Core Viewpoint
- Pony.ai (PONY.US) reported strong third-quarter earnings, with total revenue of 181 million yuan, representing a 72% year-over-year increase, marking three consecutive quarters of revenue growth [1]

Revenue Performance
- The Robotaxi business generated revenue of 47.7 million yuan, showing an impressive year-over-year growth of 89.5%, with passenger fare income increasing by over 200% [1]
- The seventh-generation Robotaxi achieved single-vehicle profitability in Guangzhou [1]

Analyst Ratings and Forecasts
- Bank of America raised its revenue forecasts for Pony.ai for 2025 to 2027 by 5%, 4%, and 9% respectively, and increased the target stock price from $20 to $21, maintaining a "Buy" rating [1]
- Everbright Securities also upheld a "Buy" rating, optimistic about the new generation models driving the core Robotaxi business growth and the gradual realization of profitability [1]
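No celebration banquet after listing: WeRide's Han Xu offers top pay of 5 million to win talent, saying talent is the core to riding out cycles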
Jin Tou Wang· 2025-12-01 07:27
Core Viewpoint
- WeRide CEO Han Xu emphasizes that talent acquisition matters more than celebrating the company's dual listing in the US and Hong Kong, stating that "listing is not the end, talent is the core to traverse cycles" [1]

Group 1: Talent Acquisition Strategy
- The company has launched a "Talent Plan" with starting salaries of 3 million and a cap of 5 million, aiming to attract top talent [1]
- Han Xu believes that recruiting the best people is essential for the company to become a leading enterprise, advocating for a nurturing environment rather than restrictive management [2]
- The company has established a fair and transparent evaluation system, allowing talent to focus on core challenges without redundant management [2]

Group 2: Business and Market Position
- WeRide operates in over 30 cities across 11 countries and holds autonomous driving licenses in eight countries, giving employees exposure to the full path from technology development to commercialization [2]
- The autonomous driving industry is entering a critical growth phase, with major players like Tesla, Huawei, and XPeng entering the market, positioning WeRide among the top tier of the industry [2]
- The company's competitive edge in the autonomous driving sector offers significant growth opportunities for talent, allowing them to participate in revolutionary changes in human mobility [3]

Group 3: Company Culture and Growth Environment
- The dual listing strengthens the company's recruiting position by giving its commitments on talent development solid financial backing [2]
- The company fosters an open and inclusive culture, which is crucial for attracting top talent who seek to grow and excel in hard technology [3]
- The starting salary of 3 million is presented as a sincere commitment, while the opportunity for practical growth in a thriving industry is the core attraction for top global talent [3]