End-to-End Models
Does FSD Get Dumb With Heavy Use? Severe Hallucinations of Wrong-Way Driving and Red-Light Running; Tesla Under Investigation After 50-Plus Accidents
36Kr· 2025-10-10 07:57
Huh? Does AI also lose IQ after working long hours? NHTSA (the U.S. National Highway Traffic Safety Administration) spared no face and promptly opened an investigation into FSD.

| | ODI RESUME |
| --- | --- |
| U.S. Department of Transportation | Investigation: PE25012 |
| | Prompted By: VOQs, Standing General Order (SGO) reports and media reports |
| National Highway Traffic Safety Administration | Date Opened: 10/07/2025 |
| | Reviewer: Scott Simmons; Investigator: Thomas Haugh |
| | Approver: Tanya Topka |
| | Subject: Traffic safety violations while Full Self Driving ("FSD") is engaged |
| | MANUFACT ... |
Autonomous Driving Ask Me Anything Q&A Roundup! The VLA vs. WA Route Debate?
自动驾驶之心· 2025-10-08 23:33
Riding this wave of Ask Me Anything on Xiaohongshu, we learned a great deal from several industry leaders. 自动驾驶之心 has compiled some of the Q&As from the autonomous-driving AMA to share with everyone! The full AMA has been archived in the 自动驾驶之心 Knowledge Planet, where we will keep collecting the experts' answers; you are welcome to join 4,000 members in discussing the frontier of autonomous driving.

Chen Long of Xiaomi: Chen Long (@陳龍龖龘), Principal Scientist for Autonomous Driving and Robotics at Xiaomi EV

Q1: Hello, Dr. Chen! I am an incoming freshman at Tongji University (major undecided; I can choose any engineering major) and want to work on autonomous driving. Given industry trends and the talent gap, which major would you recommend?
A1: Autonomous driving may be largely solved within four years, but AI as a direction is a safe bet, so pick an AI major if one is offered; otherwise choose computer science.
Q2: What is it like working at Wayve? It seems full of cutting-edge tech; I have been following it since my student days.
A2: Wayve does think quite far ahead; for end-to-end, world models, VLA, and other self-driving models, it was essentially the industry pioneer.
Q3: Do you believe fully autonomous driving is achievable, and if so, how many years away is it?
A3: Definitely. L4 has in fact already been achieved by Waymo and Apollo Go; L5 will probably take at least another five years.
Q4: Right now the industry ...
A History of Autonomous Driving's Factional Disputes
36Kr· 2025-09-28 02:50
Commercial deployment of autonomous driving is accelerating worldwide.

As of May 2025, Waymo operated 1,500 robotaxis across San Francisco, Los Angeles, Phoenix, and Austin, completing more than 250,000 paid rides per week; Baidu Apollo has deployed more than 1,000 driverless vehicles globally, delivered over 11 million rides in total, and logged more than 170 million kilometers of safe driving.

Large-scale deployment seems to imply the technology is mature, but it is not: many factional disputes over autonomous driving have yet to reach consensus.

For example, on sensors, how should one choose between pure vision and multi-sensor fusion? On system architecture, should one adopt modular design or embrace the emerging end-to-end architecture? Going a step further, for understanding the world, is VLA or VLM superior?

These unresolved controversies are steering autonomous driving toward a future that is not yet fully determined. To understand these divergent technical routes is to understand where autonomous driving came from, where it is headed, and how the technology evolves.

The battle of the eyes: pure vision vs. multi-sensor fusion

Everything begins with "seeing." How a car perceives the world is the bedrock of autonomous driving. On this question, two camps have faced off for years, and neither side has rested to this day.

The story traces back to a challenge race in the Mojave Desert in 2004. At the time, DARPA (the U.S. Defense Advanced Research Projects Agency) put up a $2 million prize that drew dozens of top universities and research institutions, attempting to answer the question of how to make a vehicle perc ...
Why Has Embodied Intelligence Become the Next Battleground for Smart-Driving Companies?
雷峰网· 2025-09-26 04:17
Core Viewpoint - Embodied intelligence is emerging as the next battleground for smart driving entrepreneurs, with significant investments and developments in the sector [2][4]. Market Overview - The global embodied intelligence market is on the verge of explosion, with China's market expected to reach 5.295 billion yuan by 2025, accounting for approximately 27% of the global market [3][21]. - The humanoid robot market is projected to reach 8.239 billion yuan, representing about 50% of the global market [3]. Industry Trends - Several smart driving companies, including Horizon Robotics and Zhixing Technology, are strategically investing in embodied intelligence through mergers, acquisitions, and subsidiary establishments to seize historical opportunities [4]. - The influx of talent from the smart driving sector into embodied intelligence has been notable since 2022, with many professionals making the transition in 2023 [13]. Technological Integration - The integration of smart driving and embodied intelligence is based on the concept of "embodied cognition," where intelligent behavior is formed through continuous interaction with the physical environment [6]. - The technical pathways for both fields are highly aligned, with smart driving vehicles functioning as embodied intelligent agents through multi-sensor perception, algorithmic decision-making, and control systems [6]. Technical Framework - The technical layers of smart driving applications and their migration to embodied intelligence include: - Perception Layer: Multi-sensor fusion for environmental modeling and object recognition [7]. - Decision Layer: Path planning and behavior prediction for task planning and interaction strategies [7]. - Control Layer: Vehicle dynamics control for motion control and execution [7]. - Simulation Layer: Virtual scene testing for skill learning and adaptive training [7]. Investment and Growth Potential - The embodied intelligence market is expected to maintain a growth rate of over 40% annually, providing a valuable channel for smart driving companies facing growth bottlenecks [21]. - The dual development pattern of humanoid and specialized robots allows smart driving companies to leverage their technological strengths for market entry [22]. Profitability Insights - The gross profit margins for embodied intelligence products are generally higher than those for smart driving solutions, with professional service robots achieving margins over 50%, compared to 15-25% for autonomous driving kits [23][25]. - This profit difference arises from the stronger differentiation and lower marginal costs of embodied intelligence products, allowing for rapid market entry and reduced development costs [25]. Future Outlook - The boundaries between smart driving and embodied intelligence are increasingly blurring, with companies like Tesla viewing autonomous vehicles as "wheeled robots" and developing humanoid robots based on similar AI architectures [26]. - Early movers in this transition are likely to secure advantageous positions in the future intelligent machine ecosystem [26].
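The layer mapping in the "Technical Framework" section above is essentially an architectural claim: perception, decision, and control form a contract that can be rebound from a vehicle to a robot. Below is a minimal Python sketch of that idea; every interface and field name is our own illustrative assumption, not any company's actual stack.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class WorldState:
    objects: list    # detected obstacles for a car, graspable items for a robot
    ego_pose: tuple  # (x, y, heading) for a vehicle; joint state for a manipulator

class PerceptionLayer(Protocol):
    def perceive(self, sensor_frames: dict) -> WorldState: ...

class DecisionLayer(Protocol):
    def plan(self, state: WorldState, goal: str) -> list: ...  # waypoints or sub-tasks

class ControlLayer(Protocol):
    def execute(self, plan_step) -> None: ...  # steering/throttle or joint torques

def run_agent(perception: PerceptionLayer, decision: DecisionLayer,
              control: ControlLayer, sensor_frames: dict, goal: str) -> None:
    """One tick of the shared perceive-plan-act loop.

    A smart-driving stack binds these interfaces to sensor fusion, path
    planning, and vehicle dynamics; an embodied-intelligence stack rebinds
    them to scene understanding, task planning, and motion control.
    """
    state = perception.perceive(sensor_frames)
    for step in decision.plan(state, goal):
        control.execute(step)
```

The point of the interface-based design is that `run_agent` is indifferent to whether the bound implementations drive a car or a humanoid, which is precisely the migration path the summary describes.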
Zebra Zhixing's Si Luo: Smart Cockpits Are Undergoing a Paradigm Restructuring; End-to-End Plus Active Perception Is the Key to a Breakthrough
Zhong Guo Jing Ji Wang· 2025-09-22 09:07
Core Insights - The core argument presented by the CTO of Zebra Zhixing is that smart cockpits are becoming a crucial entry point for user experience and the Internet AI ecosystem in smart vehicles, representing a golden track with both technological depth and commercial value [3][4]. Industry Overview - Smart cars are identified as a significant testing ground for Physical AI, with the potential for AI value in physical spaces being more substantial than in digital realms [3]. - The smart cockpit is characterized by three core features: high complexity, high safety, and high commercial value, with Zebra Zhixing having collaborated on over 8 million vehicles to validate the feasibility of large-scale technology applications [3]. Technical Architecture - The smart cockpit's five-layer integration architecture includes: 1. Chip and computing power layer, centered around companies like NVIDIA and Qualcomm. 2. System layer, led by companies such as Zebra Zhixing and Huawei, providing efficient system-level services. 3. Large model layer, integrating general and vehicle-specific models to address multi-modal processing and data privacy. 4. Intelligent agent layer, responsible for central decision-making and service module coordination. 5. Platform service layer, enabling AI-native services through natural language interaction [4]. Development Phases - The development of smart cockpits is categorized into three phases: 1. "Verification Period" (2024 to early 2025) focusing on whether large models can be integrated into vehicles. 2. "Application Period" (2025) emphasizing the implementation of intelligent agent systems for practical service delivery. 3. "Reconstruction Period" (current to 2026) where the industry shifts from traditional assembly line architectures to end-to-end models [4][5]. Interaction Experience - The transition from a "passive response" to "active perception" in smart cockpits is highlighted, where intelligent assistants can proactively identify user needs through sensory inputs, evolving from mere tools to supportive partners [5]. - Zebra Zhixing aims to drive the smart cockpit towards a trillion-level commercial market, positioning it as a core hub in the Physical AI ecosystem [5].
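To make the five-layer stack above concrete, here is a minimal, hypothetical sketch of the intelligent agent layer's role as a dispatcher between the model layer and the service modules. Every class name and the keyword routing are illustrative assumptions, not Zebra Zhixing's actual API.

```python
# The five layers summarized above, top to bottom, as a plain reference map.
COCKPIT_STACK = {
    "compute":  "chip and computing power layer (e.g., NVIDIA, Qualcomm silicon)",
    "system":   "system layer providing efficient system-level services",
    "model":    "general plus vehicle-specific large models",
    "agent":    "central decision-making, coordinates service modules",
    "services": "AI-native services reached via natural language",
}

class CockpitAgent:
    """Agent layer: routes a user utterance to a service module."""

    def __init__(self, services: dict):
        self.services = services  # module name -> callable service

    def handle(self, utterance: str) -> str:
        # A real agent layer would call the model layer for intent parsing;
        # simple keyword matching stands in for that here.
        for name, service in self.services.items():
            if name in utterance.lower():
                return service(utterance)
        return "No matching service module."

agent = CockpitAgent({
    "navigation": lambda u: "Route planned.",
    "climate":    lambda u: "Cabin temperature adjusted.",
})
print(agent.handle("Set the climate to 22 degrees"))  # -> Cabin temperature adjusted.
```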
Jensen Huang Accompanies Trump on a UK Visit: A $2.6 Billion Bet on British AI, with Smart-Driving Company Wayve Possibly Getting an Extra $500 Million
Sou Hu Cai Jing· 2025-09-20 09:57
Core Insights - NVIDIA's CEO Jensen Huang announced a £2 billion (approximately $2.6 billion) investment in the UK to catalyze the AI startup ecosystem and accelerate the creation of new companies and jobs in the AI sector [1] - Wayve, a UK-based autonomous driving startup, is expected to secure one-fifth of this investment, with NVIDIA evaluating a $500 million investment in its upcoming funding round [1][2] - Wayve's upcoming Gen 3 hardware platform will be built on NVIDIA's DRIVE AGX Thor in-vehicle computing platform [1] Company Overview - Wayve was founded in 2017 with the mission to reimagine autonomous mobility using embodied AI [3] - The company has developed a unique technology path focused on embodied AI and end-to-end deep learning models, distinguishing itself from mainstream autonomous driving companies [3][8] - Wayve is the first company in the world to deploy an end-to-end deep learning driving system on public roads [3] Technology and Innovation - Embodied AI allows an AI system to learn tasks through direct interaction with the physical environment, contrasting with traditional systems that rely on manually coded rules [8] - Wayve's end-to-end model, referred to as AV2.0, integrates deep neural networks with reinforcement learning, processing raw sensor data to output vehicle control commands [8][10] - To address the challenges of explainability in end-to-end models, Wayve developed the LINGO-2 model, which uses visual and language inputs to predict driving behavior and explain actions [10][12] Data and Training - Wayve has created the GAIA-2 world model, a video generation model designed for autonomous driving, which generates realistic driving scenarios based on structured inputs [14][15] - GAIA-2 is trained on a large dataset covering various geographical and driving conditions, allowing for effective training without extensive real-world driving data [16][17] - The model's ability to simulate edge cases enhances training efficiency and scalability [18] Strategic Partnerships - Wayve's technology does not rely on high-definition maps and is hardware-agnostic, allowing compatibility with various sensor suites and vehicle platforms [20] - The company has established partnerships with Nissan and Uber to test its autonomous driving technology [20] Leadership and Team - Wayve's leadership team includes experienced professionals from leading companies in the autonomous driving sector, enhancing its strategic direction and technological capabilities [25][26]
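The AV2.0 description above (raw sensor data in, vehicle control commands out, with learning replacing hand-coded rule modules) can be illustrated with a toy network. The sketch below is our own minimal stand-in under that assumption, not Wayve's actual architecture.

```python
import torch
import torch.nn as nn

class EndToEndDrivingPolicy(nn.Module):
    """Toy end-to-end policy: one camera frame in, control commands out."""

    def __init__(self):
        super().__init__()
        # Visual encoder: compresses a camera frame into a feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Policy head: maps features straight to control commands,
        # with no hand-coded perception/planning modules in between.
        self.head = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 2),  # [steering, acceleration]
        )

    def forward(self, camera: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(camera))

policy = EndToEndDrivingPolicy()
frame = torch.randn(1, 3, 128, 256)  # one RGB camera frame
steer, accel = policy(frame)[0]
print(f"steering={steer:.3f}, acceleration={accel:.3f}")
```

In a full system the same differentiable pipeline is what lets reinforcement learning or imitation losses update perception and control jointly, which is the core of the end-to-end argument.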
Robots Crossing the "Three Gates": Realities and Trends Experienced Firsthand by Embodied-Intelligence Innovators
Xin Hua Wang· 2025-09-15 08:08
Core Insights - The humanoid robot industry is experiencing a dichotomy of rapid advancements in capabilities and significant challenges in commercial viability, with a notable gap between technological achievements and actual orders received [1][5][41] - Investment in humanoid robotics has surged, with over 20 companies in the sector moving towards IPOs, marking a pivotal year for mass production in humanoid robots [1][12] - The development of embodied intelligence is at a crossroads, requiring a balance between technological innovation and practical application in real-world scenarios [1][18] Group 1: Industry Developments - The first city-level operational humanoid robot demonstration zone was established in Beijing, featuring a robot-operated unmanned supermarket, indicating a significant step towards integrating humanoid robots into daily life [5] - Companies like Beijing Galaxy General Robotics are leading the way in deploying humanoid robots in various sectors, including industrial and retail applications, with plans to open 100 smart pharmacies nationwide [12][41] - The industry is witnessing a shift from merely showcasing capabilities to focusing on practical applications that can generate revenue and sustain growth [1][41] Group 2: Technological Challenges - The primary challenge for humanoid robots lies in their ability to operate autonomously without remote control, which is contingent on the development of advanced models that can generalize across different scenarios [7][13] - Data quality and diversity are critical for enhancing the capabilities of humanoid robots, with a focus on using high-quality synthetic data to train models effectively [15][33] - The current models used in humanoid robotics are not fully mature, and the industry is still grappling with the need for a unified approach to model architecture that can handle the complexities of the physical world [27][34] Group 3: Market Dynamics - The humanoid robot market is characterized by a "chicken or egg" dilemma, where the lack of orders hampers technological iteration, while insufficient technology prevents securing orders [41] - The cost of humanoid robots remains high, with individual units exceeding 100,000 yuan, making them less competitive compared to traditional labor in industrial settings [46][47] - The focus is shifting towards household applications as the ultimate goal for humanoid robots, with the belief that their true value lies in versatility and the ability to create new ecosystems [47]
Before π0.5 Went Open Source, China Also Open-Sourced a Powerful End-to-End Unified Foundation Model! It Offers Strong Generalization and Long-Horizon Manipulation
具身智能之心· 2025-09-11 02:07
Core Viewpoint - The article discusses the release of π0.5 and WALL-OSS, highlighting their advancements in embodied intelligence and the significance of these models in the robotics industry, particularly in enhancing task execution in complex environments [1][3][5]. Group 1: Model Capabilities - π0.5 demonstrates enhanced generalization capabilities through heterogeneous task collaborative training, enabling robots to perform long-term, fine-grained operations in new household environments [3][5]. - WALL-OSS achieves embodied perception through large-scale multimodal pre-training, allowing seamless integration of instruction reasoning, sub-goal decomposition, and fine-grained action synthesis within a single differentiable framework [8][18]. - The model exhibits high success rates in complex long-term manipulation tasks, showcasing robust instruction-following abilities and understanding of complex scenarios, surpassing existing baseline models [8][18][28]. Group 2: Training and Data - The training process for WALL-OSS involves discrete, continuous, and joint phases, requiring only RTX 4090-level computational power for training and inference deployment [14][15]. - A multi-source dataset centered on embodied tasks was constructed, addressing the lack of large-scale, aligned VLA supervision and current visual language models' spatial understanding gaps [20][22]. - The dataset includes thousands of hours of data, focusing on both short-range operation tasks and long-range reasoning tasks, ensuring comprehensive training for the model [20][22][24]. Group 3: Experimental Analysis - Experimental analysis on embodied visual question answering and six robotic operation tasks focused on language instruction understanding, reasoning, and generalization, as well as planning and execution of long-term, multi-stage tasks [25][31]. - WALL-OSS significantly outperformed its original baseline model in object grounding, scene captioning, and action planning tasks, demonstrating its enhanced scene understanding capabilities [27][28]. - The model's ability to follow novel instructions without task-specific fine-tuning was validated, achieving 85% average task progress on known object instructions and 61% on novel object instructions [29][31]. Group 4: Industry Impact - The advancements in WALL-OSS and π0.5 are positioned to address existing limitations in visual language models and embodied understanding, paving the way for more capable and versatile robotic systems [5][8][20]. - The company, established in December 2023, focuses on developing a general embodied intelligence model using real-world data, aiming to create robots with fine operational capabilities [39]. - The recent completion of a nearly 1 billion yuan A+ round of financing indicates strong investor confidence in the company's direction and potential impact on the industry [39].
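The phrase "single differentiable framework" above implies one shared multimodal backbone feeding both a reasoning (sub-goal) head and a fine-grained action head, so that both objectives train the same trunk. Below is a toy sketch of that pattern; all shapes, layer choices, and names are illustrative assumptions rather than the released WALL-OSS model.

```python
import torch
import torch.nn as nn

class UnifiedVLA(nn.Module):
    """Toy vision-language-action model with a shared differentiable trunk."""

    def __init__(self, d_model=256, vocab=1000, action_dim=7, chunk=8):
        super().__init__()
        self.vision_proj = nn.Linear(512, d_model)   # vision features -> shared space
        self.text_embed = nn.Embedding(vocab, d_model)
        trunk_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.trunk = nn.TransformerEncoder(trunk_layer, num_layers=2)
        self.subgoal_head = nn.Linear(d_model, vocab)               # sub-goal token
        self.action_head = nn.Linear(d_model, action_dim * chunk)   # action chunk
        self.action_dim, self.chunk = action_dim, chunk

    def forward(self, vision_feats, instruction_ids):
        # Fuse visual patches and instruction tokens in one sequence.
        tokens = torch.cat(
            [self.vision_proj(vision_feats), self.text_embed(instruction_ids)], dim=1)
        h = self.trunk(tokens).mean(dim=1)       # pooled multimodal state
        subgoal_logits = self.subgoal_head(h)    # instruction reasoning / decomposition
        actions = self.action_head(h).view(-1, self.chunk, self.action_dim)
        return subgoal_logits, actions           # both heads share gradients

model = UnifiedVLA()
vision = torch.randn(1, 16, 512)                # 16 visual patch features
instr = torch.randint(0, 1000, (1, 12))         # tokenized instruction
subgoal_logits, actions = model(vision, instr)
print(subgoal_logits.shape, actions.shape)      # (1, 1000) (1, 8, 7)
```

Because a language loss on `subgoal_logits` and a regression loss on `actions` backpropagate through the same trunk, sub-goal decomposition and action synthesis are optimized jointly rather than stitched together from separate modules.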
Dissecting Huawei Qiankun ADS 4: In the World-Model Melee, How Does the Top Student Clear the Levels?
Core Viewpoint - The article discusses the evolution of autonomous driving technology, emphasizing the shift from traditional end-to-end models to world models that enable vehicles to understand and predict their environment more effectively [2][4][8]. Group 1: World Model Development - The world model allows vehicles to possess predictive capabilities, moving beyond mere reactive responses to real-time stimuli [2][3]. - Huawei's ADS 4 system, launched in April 2025, represents a significant advancement in high-level driving assistance, relying on the self-developed WEWA architecture [3][4]. - By 2025, several tech companies, including XPeng and SenseTime, are expected to adopt world models as a crucial step towards achieving fully autonomous driving [4][8]. Group 2: Challenges in Autonomous Driving - The industry has recognized that traditional end-to-end models, which rely heavily on human driving data, often lead to suboptimal decision-making and do not truly understand physical laws [6][7]. - Research indicates that low-precision training can limit the effectiveness of models, highlighting the need for improved generalization capabilities in real-world scenarios [7]. Group 3: Competitive Landscape - Huawei's market share in the domestic pre-installed auxiliary driving domain is reported at 79.0%, maintaining its position as a leading supplier [9]. - The company differentiates itself by focusing on a more fundamental approach to driving, emphasizing spatial reasoning over merely following trends [9][10]. Group 4: Technological Innovations - Huawei's world model architecture integrates a cloud-based world engine and a vehicle-side behavior model, enhancing real-time reasoning and decision-making capabilities [12][14]. - The company has developed a unique approach to generating training scenarios, focusing on extreme cases that are often difficult to capture in real-world data [13][14]. Group 5: Implementation and Future Prospects - Huawei's intelligent driving system has been deployed in over 1 million vehicles across various manufacturers, facilitating rapid feedback and continuous improvement of the system [15]. - The integration of a large-scale real vehicle fleet supports the evolution of the driving system, paving the way for higher levels of autonomous driving capabilities [15].
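The "unique approach to generating training scenarios" above centers on synthesizing extreme corner cases that real fleets rarely record. The following sketch shows the general shape of such a cloud-side scenario generator; the schema and event list are purely illustrative assumptions, not the WEWA world engine.

```python
import random

# Rare events a cloud-side world engine might synthesize for training,
# precisely because real-world logs almost never contain them.
CORNER_CASES = [
    "wrong-way vehicle on a ramp",
    "pedestrian emerging between parked cars at night",
    "tire debris dropped by a truck at highway speed",
]

def generate_scenario(seed: int) -> dict:
    """Produce one synthetic training scenario with randomized conditions."""
    rng = random.Random(seed)  # seeded, so every scenario is reproducible
    return {
        "event": rng.choice(CORNER_CASES),
        "weather": rng.choice(["clear", "rain", "fog"]),
        "ego_speed_kmh": rng.randint(30, 120),
        "time_of_day": rng.choice(["day", "dusk", "night"]),
    }

# A production engine would emit millions of these; three show the idea.
for s in (generate_scenario(i) for i in range(3)):
    print(s)
```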
Dissecting Huawei Qiankun ADS 4: In the World-Model Melee, How Does the "Top Student" Clear the Levels?
Core Insights - The article discusses the evolution of autonomous driving technology, emphasizing the transition from traditional models to world models that enable vehicles to predict and understand their environment rather than merely reacting to it [2][4][5]. Group 1: World Model Concept - The world model provides vehicles with the ability to anticipate and reason about their surroundings, moving beyond simple reactive capabilities [4][11]. - This model integrates vast amounts of multimodal data, including real-world driving scenarios and traffic rules, to create a dynamic and inferential digital representation of the traffic world [2][4]. - Companies like Huawei, XPeng, and SenseTime are recognizing the world model as essential for achieving true autonomous driving by 2025 [4][12]. Group 2: Technological Advancements - Huawei's ADS 4 system, launched in April 2025, marks a significant advancement in high-level driving assistance, relying on its self-developed WEWA architecture [4][12]. - The WEWA architecture consists of a cloud-based world engine (WE) for data training and scenario generation, and a vehicle-based world behavior model (WA) for real-time environmental reasoning and decision-making [4][12][21]. - The world model addresses the limitations of traditional end-to-end models, which often mimic human behavior without understanding the underlying physics of driving [6][11]. Group 3: Market Position and Competition - Huawei's market share in the domestic pre-installed advanced driving domain is reported at 79.0%, maintaining its position as a leading supplier [12][14]. - The company has successfully deployed its driving system in over 1 million vehicles across various manufacturers, enhancing its data collection and model training capabilities [24][25]. - The competitive landscape is shifting, with other companies like NIO and XPeng also exploring world models, but Huawei's approach remains distinct due to its focus on specialized behavior models rather than language-based models [18][19][22].
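The vehicle-side half of WEWA, the world behavior model (WA), is described above as reasoning about futures rather than reacting to the current frame alone. A toy predict-then-act loop below illustrates that idea; the dynamics stand-in, the numbers, and the scoring are entirely our own assumptions, not Huawei's implementation.

```python
def predict_next_state(state: dict, action: str) -> dict:
    """Stand-in for a learned dynamics model: imagine the state 1 s ahead."""
    gap = state["gap_m"] - state["closing_speed_ms"]  # gap shrinks if we coast
    if action == "brake":
        gap += 4.0   # braking widens the imagined gap
    elif action == "lane_change":
        gap += 8.0   # adjacent lane assumed clear in this toy example
    return {"gap_m": gap, "closing_speed_ms": state["closing_speed_ms"]}

def score(state: dict) -> float:
    """Safety score of an imagined future: a bigger gap is better."""
    return state["gap_m"]

def choose_action(state: dict, candidates=("keep", "brake", "lane_change")) -> str:
    # Predict-then-act: roll each candidate forward in imagination,
    # then commit to the action whose predicted future scores safest.
    return max(candidates, key=lambda a: score(predict_next_state(state, a)))

current = {"gap_m": 12.0, "closing_speed_ms": 6.0}
print(choose_action(current))  # -> "lane_change" under these toy numbers
```

Replacing the hand-written `predict_next_state` with a learned world model is what distinguishes this loop from a purely reactive end-to-end policy: the decision is made against predicted consequences, not just the current observation.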