End-to-End Models
My Three Years Doing Autonomous Driving at Horizon Robotics
自动驾驶之心· 2025-11-24 00:03
Author | candywisdom  Editor | 自动驾驶之心  Original article: https://zhuanlan.zhihu.com/p/1970953355355469364
It has been a year since I moved from autonomous driving to embodied intelligence, and I never got around to properly summarizing my earlier work on autonomous driving and some personal reflections. (P.S.: Broadly speaking, autonomous driving is a subfield of embodied intelligence, but at this stage the two differ considerably in the problems they face and in how they go about solving them, so the move still counts as turning to a new direction.) For the foreseeable future my main effort will not go into autonomous driving, but I feel this stretch of experience deserves a record. Not that the work was earth-shattering; some of it was low-profile but solid exploration that never made headlines, yet I believe it genuinely solved real problems, and I hope it can offer some reference to people working in related directions.
1. Starting from object detection and gradually extending to end-to-end planning, to build a strong on-device policy;
2. Closed-loop evaluation and training of end-to-end models ...
Li Auto's Head of Active Safety Publishes "The Death of Active Safety"
理想TOP2· 2025-11-20 16:15
Group 1
- The core relationship between active safety and assisted driving is that both rely on similar underlying technologies to enhance the user's driving experience, with active safety focused on preventing collisions regardless of who is driving [2][3]
- Active safety aims to prevent accidents by providing alerts and taking control of the vehicle when necessary, while assisted driving systems follow navigation to transport users safely and efficiently [2][3]
- The necessity of LiDAR in active safety is emphasized, as it significantly enhances safety by compensating for human limitations in various driving conditions [5][6]

Group 2
- The active safety field has been expanding to cover high-frequency and high-risk driving scenarios over the past decade, but there are concerns about whether the current enumeration of accident scenarios is sufficient [7][8]
- The complexity of real-world driving scenarios poses challenges for rule-based systems, which may struggle to account for unpredictable events [10][11]
- The transition to model-based approaches in active safety could address these challenges by providing more effective responses to complex situations (see the sketch after this list) [15]

Group 3
- The concept of "the death of active safety" is introduced, suggesting that as driving becomes safer through optimization and the advent of higher-level autonomous driving, the need for active safety may diminish [16]
- Despite these challenges, the industry remains committed to improving active safety technologies, with a belief that advancements will lead to significant changes in the next few years [18]
- The focus is shifting from competition to collaboration in creating a safer future, with ongoing efforts to reduce the probability and severity of accidents [18]
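To make the rule-based vs. model-based contrast concrete, here is a minimal illustrative sketch in Python. All function names, thresholds, and the `risk_model` interface are hypothetical assumptions for illustration only, not Li Auto's actual active-safety logic:

```python
# Illustrative sketch only -- hypothetical thresholds, not any vendor's real AEB code.

def ttc_rule_based_aeb(distance_m: float, closing_speed_mps: float) -> bool:
    """Classic rule: brake when time-to-collision drops below a fixed threshold.

    Handles the enumerated head-on scenario, but every new scenario
    (cut-ins, occlusions, debris) needs another hand-written rule.
    """
    if closing_speed_mps <= 0:           # not closing in on the obstacle
        return False
    ttc = distance_m / closing_speed_mps
    return ttc < 1.5                     # hypothetical 1.5 s trigger threshold


def model_based_aeb(scene_features, risk_model, threshold: float = 0.8) -> bool:
    """Model-based alternative: a learned model scores collision risk directly
    from scene features, so unenumerated long-tail cases can still trigger."""
    return risk_model.predict_risk(scene_features) > threshold
```

The design difference is the point of the article's argument: the rule fires only in scenarios someone thought to enumerate, while the learned scorer can respond to combinations no rule anticipated.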
Li Auto's VLM vs. VLA Differences in Blind-Spot Deceleration
理想TOP2· 2025-10-18 08:44
Core Insights
- The article discusses the differences between VLM (Visual Language Model) and VLA (Visual Language Action) in the context of autonomous driving, particularly in scenarios like blind-spot deceleration [1][2]

Group 1: VLM and VLA Differences
- VLM operates by perceiving scenarios such as uncontrolled intersections and outputs a deceleration request to the E2E (End-to-End) model, which then reduces speed to 8-12 km/h, creating a sense of disconnection in the response [2]
- VLA, on the other hand, utilizes a self-developed base model to understand the scene directly, allowing for a more nuanced approach to blind-spot deceleration and resulting in a smoother, more contextually appropriate response based on varying road conditions [2]

Group 2: Action Mechanism
- The action generated by VLA is described as a more native deceleration action rather than a dual-system command, indicating a more integrated approach to scene understanding and response (see the sketch after this list) [3]
- Concerns are raised in the comments regarding VLM's reliability as an external module, questioning its ability to accurately interpret 3D space and the stability of its triggering mechanism [3]
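A minimal sketch of the two integration styles described above, assuming hypothetical interfaces throughout (the scene dict keys, the speed band midpoint, and the `policy` object are all invented; none of this is Li Auto's code):

```python
# Illustrative contrast: dual-system (VLM -> E2E request) vs. native VLA action.
# All names and numbers are hypothetical assumptions.

def vlm_dual_system_speed(scene: dict) -> float:
    """VLM path: a separate language model classifies the scene; on detecting an
    uncontrolled intersection it sends a coarse deceleration request to the E2E
    planner, which clamps speed into a fixed 8-12 km/h band regardless of
    context -- the 'disconnected' feel described above."""
    if scene.get("uncontrolled_intersection"):
        return min(scene["current_speed_kph"], 10.0)  # hypothetical mid-band target
    return scene["current_speed_kph"]


def vla_native_speed(scene: dict, policy) -> float:
    """VLA path: a single model consumes the scene context directly and emits a
    continuous speed command conditioned on road width, visibility, traffic,
    etc., so the response varies smoothly with conditions."""
    return policy.predict_speed(scene)  # continuous output, no fixed band
```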
Heavy FSD Use "Makes It Dumber": Severe Wrong-Way and Red-Light Hallucinations; After 50-Plus Incidents, Tesla Is Under Investigation
36Kr· 2025-10-10 07:57
Core Viewpoint
- The article discusses a new investigation by the National Highway Traffic Safety Administration (NHTSA) into Tesla's Full Self-Driving (FSD) system, raising concerns about its potential to cause traffic-safety violations and accidents, particularly the risk of "diminished intelligence" with prolonged use of the system [1][6]

Investigation Details
- NHTSA has opened an investigation into FSD, prompted by user complaints and media reports, focusing on traffic-safety violations while FSD is engaged [2]
- The investigation covers approximately 2,882,566 Tesla vehicles equipped with FSD and could lead to a recall if the issues are confirmed [2][10]

Types of Violations
- The investigation highlights two main types of violations:
  1. Ignoring red lights, with 18 complaints confirmed, including 4 incidents resulting in injuries [2][3]
  2. Incorrect lane usage, such as entering oncoming traffic lanes or ignoring road signs, with another 18 complaints reported [3][10]

Incident Reports
- A total of 58 reports of FSD-related traffic-safety violations have been documented, resulting in 23 injuries [3][8]
- Notably, a testing agency found that while FSD performed well in initial tests, issues arose after extended use, leading to dangerous situations [6][11]

System Evaluation
- NHTSA's review will assess the FSD system's ability to warn users of upcoming actions, its response times, and its recognition of traffic signals and lane markings [10]
- The investigation will also evaluate whether over-the-air updates affect FSD's compliance with traffic laws [10]

Historical Context
- This investigation is part of a series of ongoing inquiries into Tesla's FSD, with previous investigations addressing various incidents and compliance issues [13]
- The typical duration for such investigations is at least 18 months, indicating a slow regulatory response to rapidly evolving AI technology [15]

Future Implications
- The outcome of the investigation may not significantly impact Tesla, as the company has historically navigated regulatory challenges effectively [11][15]
- The evolving nature of AI technology poses challenges for traditional regulatory frameworks, which may struggle to keep pace with advancements in systems like FSD [15]
Autonomous Driving Ask-Me-Anything Q&A Roundup! The VLA vs. WA Route Debate?
自动驾驶之心· 2025-10-08 23:33
Core Insights
- The article discusses the current state and future prospects of autonomous driving technology, emphasizing the importance of AI and various modeling approaches in achieving higher levels of automation [4][6][9]

Group 1: Industry Development
- The autonomous driving industry is rapidly evolving, with significant advancements expected in the next few years, particularly in AI and related fields [4]
- Companies like Waymo and Tesla are leading the way in achieving Level 4 (L4) automation, while Level 5 (L5) may take at least five more years to realize [4][6]
- The integration of Vision-Language-Action (VLA) models is seen as key to enhancing decision-making capabilities in autonomous vehicles, addressing long-tail problems that pure end-to-end models may struggle with [6][9]

Group 2: Technical Approaches
- The article outlines different modeling approaches in autonomous driving, including end-to-end models and the emerging VLA paradigm, which combines language processing with visual data to improve reasoning and decision-making [5][9]
- The effectiveness of current autonomous driving systems is still limited, with many challenges remaining in achieving full compliance with traffic regulations and safety standards [10][14]
- The discussion highlights the importance of data and cloud-computing capabilities in narrowing the performance gap between domestic companies and leaders like Tesla [14][15]

Group 3: Talent and Education
- There is a recognized talent gap in the autonomous driving sector, with a strong recommendation for students to pursue AI and computer science to prepare for future opportunities in the industry [4][6]
- The article suggests that practical experience at larger autonomous driving companies may provide better training and growth opportunities than smaller robotics firms [16][20]
A History of the Factional Disputes in Autonomous Driving
36Kr· 2025-09-28 02:50
Core Insights
- The commercialization of autonomous driving is accelerating globally, with companies like Waymo and Baidu Apollo significantly increasing their fleets and service offerings [1][2]
- Despite the apparent maturity of the technology, there are still unresolved debates over sensor solutions and system architectures that will shape the future of autonomous driving [3][4]

Sensor Solutions
- There are two main camps in the sensor debate: pure vision and multi-sensor fusion, each with its own advantages and challenges [4][9]
- The pure-vision approach, championed by Tesla, relies on cameras and deep-learning algorithms, offering lower costs and scalability, but struggles in adverse weather conditions [7][9]
- Multi-sensor fusion, favored by companies like Waymo and NIO, emphasizes safety through redundancy, combining various sensors to enhance reliability [9][10]

Sensor Types
- LiDAR is known for its high precision in creating 3D point clouds but comes at high cost, making it less accessible for mass commercialization [11][13]
- 4D millimeter-wave radar offers advantages in adverse weather conditions but lacks the resolution of LiDAR, leading to a complementary relationship between the two technologies [13][15]

Algorithmic Approaches
- The industry is divided between modular and end-to-end algorithm designs, with the latter gaining traction for its potential to optimize performance without information loss (see the sketch after this list) [16][18]
- End-to-end models, while promising, face challenges around traceability and safety, leading to the emergence of hybrid approaches that seek to balance performance and explainability [18][22]

AI Models
- The debate continues between Visual Language Models (VLM) and Visual Language Action models (VLA), with VLM focusing on interpretability and VLA on performance optimization [19][21]
- VLM is currently more widely adopted among major companies due to its maturity and lower training costs, while VLA is explored by companies like Tesla and Geely for its advanced reasoning capabilities [25][26]

Industry Trends
- The ongoing technological debates are leading to a convergence of ideas, with sensor technologies and algorithmic approaches increasingly integrating to enhance the capabilities of autonomous driving systems [25][26]
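A minimal sketch of the modular vs. end-to-end contrast referenced above. The stage names and interfaces are hypothetical and passed in as callables; this reflects the general industry pattern, not any specific company's stack:

```python
# Schematic contrast between the two algorithm designs. Stage names and
# interfaces are hypothetical; real implementations are passed in as callables.
from typing import Any, Callable

def modular_pipeline(raw_sensors: Any,
                     detect: Callable, predict: Callable, plan: Callable) -> Any:
    """Modular design: hand-defined interfaces between stages. Each stage can be
    tested and certified in isolation, but anything the interface cannot express
    (e.g. perception uncertainty) is lost between stages."""
    objects = detect(raw_sensors)    # perception -> object list
    futures = predict(objects)       # prediction -> forecast tracks
    return plan(objects, futures)    # planning   -> trajectory

def end_to_end(raw_sensors: Any, net: Callable) -> Any:
    """End-to-end design: one trained network maps sensors straight to a
    trajectory, avoiding interface information loss at the cost of making
    individual decisions harder to trace and audit."""
    return net(raw_sensors)
```

The hybrid approaches the article mentions typically keep the modular skeleton but learn the stages jointly, trading some auditability back for performance.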
Why Has Embodied Intelligence Become the Next Battleground for Smart-Driving Companies?
雷峰网· 2025-09-26 04:17
Core Viewpoint
- Embodied intelligence is emerging as the next battleground for smart-driving entrepreneurs, with significant investments and developments in the sector [2][4]

Market Overview
- The global embodied-intelligence market is on the verge of explosion, with China's market expected to reach 5.295 billion yuan by 2025, accounting for approximately 27% of the global market [3][21]
- The humanoid-robot market is projected to reach 8.239 billion yuan, representing about 50% of the global market [3]

Industry Trends
- Several smart-driving companies, including Horizon Robotics and Zhixing Technology, are strategically investing in embodied intelligence through mergers, acquisitions, and subsidiary establishment to seize the historical opportunity [4]
- The influx of talent from the smart-driving sector into embodied intelligence has been notable since 2022, with many professionals making the transition in 2023 [13]

Technological Integration
- The integration of smart driving and embodied intelligence rests on the concept of "embodied cognition," in which intelligent behavior is formed through continuous interaction with the physical environment [6]
- The technical pathways of the two fields are highly aligned, with smart-driving vehicles functioning as embodied intelligent agents through multi-sensor perception, algorithmic decision-making, and control systems [6]

Technical Framework
- The technical layers of smart-driving applications and their migration to embodied intelligence include:
  - Perception layer: multi-sensor fusion for environmental modeling and object recognition [7]
  - Decision layer: path planning and behavior prediction for task planning and interaction strategies [7]
  - Control layer: vehicle dynamics control for motion control and execution [7]
  - Simulation layer: virtual-scene testing for skill learning and adaptive training [7]

Investment and Growth Potential
- The embodied-intelligence market is expected to sustain annual growth above 40%, providing a valuable channel for smart-driving companies facing growth bottlenecks [21]
- The dual development pattern of humanoid and specialized robots allows smart-driving companies to leverage their technological strengths for market entry [22]

Profitability Insights
- Gross margins for embodied-intelligence products are generally higher than for smart-driving solutions, with professional service robots achieving margins over 50%, compared to 15-25% for autonomous-driving kits [23][25]
- This margin difference arises from the stronger differentiation and lower marginal costs of embodied-intelligence products, allowing rapid market entry and reduced development costs [25]

Future Outlook
- The boundaries between smart driving and embodied intelligence are increasingly blurring, with companies like Tesla viewing autonomous vehicles as "wheeled robots" and developing humanoid robots based on similar AI architectures [26]
- Early movers in this transition are likely to secure advantageous positions in the future intelligent-machine ecosystem [26]
Zebra Zhixing's Si Luo: Smart Cockpits Are Undergoing a Paradigm Reconstruction, with End-to-End Plus Active Perception as the Key to Breaking Through
China Economic Net· 2025-09-22 09:07
Core Insights
- The core argument presented by the CTO of Zebra Zhixing is that smart cockpits are becoming a crucial entry point for user experience and the Internet-AI ecosystem in smart vehicles, representing a golden track with both technological depth and commercial value [3][4]

Industry Overview
- Smart cars are identified as a significant testing ground for Physical AI, with the potential value of AI in physical spaces being more substantial than in digital realms [3]
- The smart cockpit is characterized by three core features: high complexity, high safety requirements, and high commercial value, with Zebra Zhixing having collaborated on over 8 million vehicles to validate the feasibility of large-scale technology applications [3]

Technical Architecture
- The smart cockpit's five-layer integration architecture includes:
  1. Chip and computing-power layer, centered on companies like NVIDIA and Qualcomm
  2. System layer, led by companies such as Zebra Zhixing and Huawei, providing efficient system-level services
  3. Large-model layer, integrating general and vehicle-specific models to address multi-modal processing and data privacy
  4. Intelligent-agent layer, responsible for central decision-making and service-module coordination
  5. Platform-service layer, enabling AI-native services through natural-language interaction [4]

Development Phases
- The development of smart cockpits is categorized into three phases:
  1. "Verification period" (2024 to early 2025), focused on whether large models can be integrated into vehicles
  2. "Application period" (2025), emphasizing the implementation of intelligent-agent systems for practical service delivery
  3. "Reconstruction period" (current to 2026), in which the industry shifts from traditional assembly-line architectures to end-to-end models [4][5]

Interaction Experience
- The transition from "passive response" to "active perception" in smart cockpits is highlighted: intelligent assistants proactively identify user needs through sensory inputs, evolving from mere tools into supportive partners [5]
- Zebra Zhixing aims to drive the smart cockpit toward a trillion-scale commercial market, positioning it as a core hub in the Physical AI ecosystem [5]
Jensen Huang Visits the UK with Trump: A $2.6 Billion Bet on British AI, with Smart-Driving Company Wayve Potentially Getting an Additional $500 Million
Sohu Finance· 2025-09-20 09:57
Core Insights
- NVIDIA CEO Jensen Huang announced a £2 billion (approximately $2.6 billion) investment in the UK to catalyze the AI startup ecosystem and accelerate the creation of new companies and jobs in the AI sector [1]
- Wayve, a UK-based autonomous driving startup, is expected to secure one-fifth of this investment, with NVIDIA evaluating a $500 million investment in its upcoming funding round [1][2]
- Wayve's upcoming Gen 3 hardware platform will be built on NVIDIA's DRIVE AGX Thor in-vehicle computing platform [1]

Company Overview
- Wayve was founded in 2017 with the mission of reimagining autonomous mobility using embodied AI [3]
- The company has developed a distinctive technology path focused on embodied AI and end-to-end deep-learning models, setting it apart from mainstream autonomous driving companies [3][8]
- Wayve is the first company in the world to deploy an end-to-end deep-learning driving system on public roads [3]

Technology and Innovation
- Embodied AI allows an AI system to learn tasks through direct interaction with the physical environment, in contrast with traditional systems that rely on manually coded rules [8]
- Wayve's end-to-end model, referred to as AV2.0, integrates deep neural networks with reinforcement learning, processing raw sensor data to output vehicle control commands (see the sketch after this list) [8][10]
- To address the explainability challenges of end-to-end models, Wayve developed the LINGO-2 model, which uses visual and language inputs to predict driving behavior and explain its actions [10][12]

Data and Training
- Wayve has created the GAIA-2 world model, a video-generation model designed for autonomous driving that generates realistic driving scenarios from structured inputs [14][15]
- GAIA-2 is trained on a large dataset covering varied geographies and driving conditions, allowing effective training without extensive real-world driving data [16][17]
- The model's ability to simulate edge cases enhances training efficiency and scalability [18]

Strategic Partnerships
- Wayve's technology does not rely on high-definition maps and is hardware-agnostic, allowing compatibility with various sensor suites and vehicle platforms [20]
- The company has established partnerships with Nissan and Uber to test its autonomous driving technology [20]

Leadership and Team
- Wayve's leadership team includes experienced professionals from leading companies in the autonomous driving sector, strengthening its strategic direction and technological capabilities [25][26]
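To ground the "raw sensor data in, control commands out" idea, here is a deliberately tiny PyTorch sketch of a generic end-to-end driving policy. It is a toy under stated assumptions, not Wayve's AV2.0: the layer sizes, the two-value control head, and the name TinyDrivingPolicy are all invented for illustration.

```python
# Minimal sketch of the generic end-to-end idea: camera pixels in, controls out.
# NOT Wayve's actual architecture; all sizes and heads are illustrative.
import torch
import torch.nn as nn

class TinyDrivingPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(          # compress camera frames to features
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.control_head = nn.Linear(32, 2)   # hypothetical [steering, acceleration]

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, H, W) raw camera input -- no HD map, no object list
        return torch.tanh(self.control_head(self.encoder(frames)))

# Usage: one forward pass from pixels straight to control, no hand-built stages.
policy = TinyDrivingPolicy()
controls = policy(torch.randn(1, 3, 224, 224))   # -> tensor of shape (1, 2)
```

The hardware-agnostic, map-free property the article attributes to Wayve follows naturally from this shape: nothing in the interface assumes a particular sensor suite or an HD map, only pixels in and controls out.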
Robots Crossing the "Three Gates": Realities and Trends as Experienced by Embodied-Intelligence Innovators
Xinhua Net· 2025-09-15 08:08
Core Insights
- The humanoid-robot industry is experiencing a dichotomy of rapid advancements in capabilities and significant challenges in commercial viability, with a notable gap between technological achievements and actual orders received [1][5][41]
- Investment in humanoid robotics has surged, with over 20 companies in the sector moving toward IPOs, marking a pivotal year for the mass production of humanoid robots [1][12]
- The development of embodied intelligence is at a crossroads, requiring a balance between technological innovation and practical application in real-world scenarios [1][18]

Group 1: Industry Developments
- The first city-level operational humanoid-robot demonstration zone was established in Beijing, featuring a robot-operated unmanned supermarket, a significant step toward integrating humanoid robots into daily life [5]
- Companies like Beijing Galaxy General Robotics are leading the deployment of humanoid robots in sectors including industrial and retail applications, with plans to open 100 smart pharmacies nationwide [12][41]
- The industry is witnessing a shift from merely showcasing capabilities to practical applications that can generate revenue and sustain growth [1][41]

Group 2: Technological Challenges
- The primary challenge for humanoid robots lies in operating autonomously without remote control, which is contingent on advanced models that can generalize across different scenarios [7][13]
- Data quality and diversity are critical to enhancing capability, with a focus on high-quality synthetic data to train models effectively [15][33]
- Current models are not fully mature, and the industry is still grappling with the need for a unified approach to model architecture that can handle the complexities of the physical world [27][34]

Group 3: Market Dynamics
- The humanoid-robot market faces a "chicken or egg" dilemma: a lack of orders hampers technological iteration, while insufficient technology prevents securing orders [41]
- Costs remain high, with individual units exceeding 100,000 yuan, making the robots less competitive than traditional labor in industrial settings [46][47]
- The focus is shifting toward household applications as the ultimate goal for humanoid robots, on the belief that their true value lies in versatility and the ability to create new ecosystems [47]