End-to-End

Li Auto's VLA Can Be Compared to DeepSeek's MoE
理想TOP2· 2025-06-08 04:24
Core Viewpoint
- The article discusses advances in the VLA (Vision-Language-Action) model and compares it with DeepSeek's MoE (Mixture of Experts), highlighting the distinctive approaches and improvements in model architecture and training.
Group 1: VLA and MoE Comparison
- Both VLA and MoE are previously proposed concepts that are now being fully realized in new domains, with significant innovations and positive outcomes [2]
- DeepSeek's MoE improves on traditional models by increasing the number of specialized experts and raising parameter utilization through Fine-Grained Expert Segmentation and Shared Expert Isolation [2]
Group 2: Key Technical Challenges for VLA
- The VLA needs to address six critical technical points, including model design and training, 3D spatial understanding, and real-time inference [4]
- The VLA base model is designed around sparsity, which expands parameter capacity without significantly increasing inference load [6]
Group 3: Model Training and Efficiency
- Training incorporates a large amount of 3D data and driving-related information while reducing the proportion of historical data [7]
- The model is designed to learn human thought processes, using both fast and slow reasoning to balance parameter scale against real-time performance [8]
Group 4: Diffusion and Trajectory Generation
- Diffusion is used to decode action tokens into driving trajectories, improving the model's ability to predict complex traffic scenarios [9]
- An ODE sampler accelerates diffusion generation, allowing stable trajectories to be generated in just 2-3 steps [11]
Group 5: Reinforcement Learning and Model Training
- The system aims to surpass human driving capability through reinforcement learning, addressing earlier limitations in training environments and information transfer [12]
- The model is now trainable end to end, and the ability to generate realistic 3D environments for training has improved [12]
Group 6: Positioning Against Competitors
- The company is no longer seen as merely following Tesla in autonomous driving, especially since the introduction of V12, which marked a shift in its approach [13]
- The VLM (Vision Language Model) system consists of a fast system and a slow system; the fast system is comparable to Tesla's capabilities, while the slow system is a distinct approach adopted because of resource constraints [14]
Group 7: Evolution of VLM to VLA
- Moving from VLM to VLA is viewed as a natural evolution, indicating that the company is innovating on its own insights rather than imitating competitors [15]
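The two MoE ideas named in Group 1, many fine-grained routed experts plus a small set of always-active shared experts, can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the names `moe_forward` and `make_expert`, the toy experts that only scale their input, the dot-product router, and the renormalization of the top-k gate weights are not taken from the article or from DeepSeek's code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input element by `scale`."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, routed_experts, shared_experts, top_k):
    """Sparse MoE layer: all shared experts plus the top_k routed experts."""
    # Router score per routed expert: dot product of x with that expert's vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Keep only the top_k routed experts; the rest are never evaluated.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # assumption: renormalize kept gates

    out = [0.0] * len(x)
    # Shared experts bypass the router and always contribute.
    for expert in shared_experts:
        out = [o + v for o, v in zip(out, expert(x))]
    # Routed experts contribute in proportion to their gate weight.
    for i in chosen:
        gate = probs[i] / norm
        out = [o + gate * v for o, v in zip(out, routed_experts[i](x))]
    return out, chosen


# Toy configuration: 4 routed experts, 1 shared expert, 2 routed experts per token.
x = [1.0, 2.0]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
routed_experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
shared_experts = [make_expert(10.0)]

out, chosen = moe_forward(x, router_weights, routed_experts, shared_experts, top_k=2)
active = len(shared_experts) + len(chosen)
total = len(shared_experts) + len(routed_experts)
print(f"routed experts used: {chosen}, active experts: {active} of {total}")
```

In this toy setup only three of the five experts run for the token, which is the sense in which sparsity adds parameter capacity without a matching increase in per-token compute.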
2025 China Advanced Intelligent Driver Assistance Technology Insights: Compute Leap, Data Closed Loop, VLA, and World Models
EqualOcean· 2025-06-05 05:42
Investment Rating
- The report does not explicitly state an investment rating for the industry
Core Insights
- The report highlights the evolution of advanced driver assistance systems (ADAS) in China, focusing on the expansion of operational design domains (ODD), technological equity, safety concerns, and supportive policies [4][21][23]
- It emphasizes the need for algorithm, data, and computing power upgrades to address safety shortcomings in high-level ADAS technologies [23][66]
- The report discusses the transition from modular to end-to-end architectures in vehicle algorithms, aiming for human-like driving capabilities [66][68]
Summary by Sections
1. Market Background
- The expansion of high-level ADAS ODD is noted, with a focus on technological inclusivity and addressing accident anxiety through safety redundancies [4][21]
- Policy support is highlighted as crucial for rational promotion of ADAS technologies [4][21]
2. Technology Insights
- The report decodes the underlying logic of data, algorithms, and computing power in high-level ADAS [4][28]
- It discusses the computing power landscape, noting the shift towards higher TOPS (trillions of operations per second) capabilities in vehicle and cloud computing [42][44]
- Data challenges, including collection and positioning technologies, are identified as critical areas for development [4][28]
3. Competitive Analysis
- The competitive landscape is analyzed, detailing the tiered structure of companies and their development strategies [29][30]
- The report outlines various collaboration models among automotive manufacturers and technology providers, emphasizing the balance between self-research and external sourcing [83]
4. Trend Insights
- The report notes the commercialization progress of passenger vehicle L3 systems, indicating a growing market for advanced ADAS [31][32]
- It highlights the importance of continuous upgrades and iterations in ADAS functionalities to meet evolving consumer expectations and safety standards [82][83]
Xiaomi's Driver Assistance Team Gains Another Senior Hire as Former FAW Nanjing CTO Chen Guang Joins | 36Kr Exclusive
36Kr· 2025-05-30 04:50
Core Insights
- Xiaomi has appointed Chen Guang, former CTO of FAW Nanjing Research Institute, as the head of perception for its autonomous driving division, indicating a strategic move to enhance its capabilities in this area [1][2]
- The company is focusing on developing an "end-to-end" autonomous driving solution, which integrates perception, prediction, and planning into a unified deep learning model, differentiating itself from traditional rule-based approaches [1][4]
- Xiaomi's autonomous driving team has grown to 1,200 members, reflecting its commitment to building a robust workforce in this competitive sector [3]
Company Developments
- Chen Guang's previous experience includes leading the development of FAW Hongqi's third-generation L4-level fully autonomous Robotaxi, showcasing his expertise in the field [1]
- Xiaomi's autonomous driving team is divided into two main groups: "end-to-end" algorithms and technology research, with a focus on advancing the functionality of their systems [1][2]
- The company is also exploring the development of a next-generation VLA (Vision-Language-Action) model, which is expected to be launched within the year [5]
Industry Context
- The autonomous driving industry is witnessing a shift from rule-based systems to "end-to-end" solutions, as exemplified by Tesla's Full Self-Driving (FSD) approach, which Xiaomi is now adopting [4]
- Xiaomi's push for advanced autonomous driving technology comes amid significant challenges, including a recent traffic accident involving one of its vehicles, which has raised safety concerns and public scrutiny [5]
- The company aims to alleviate safety doubts by equipping its second vehicle, the YU7, with high-performance hardware, including a 4nm NVIDIA Thor chip and multiple sensors, to enhance its autonomous driving capabilities [5]
The Fig Leaf of Intelligent Driving Is Pulled Away
Hu Xiu· 2025-05-26 02:47
Core Insights
- The automotive industry is transitioning towards more advanced autonomous driving technologies, moving beyond the simplistic "end-to-end" models that have been prevalent [2][3][25]
- Companies are exploring new architectures and models, such as VLA and world models, to address the limitations of current systems and enhance safety and reliability in autonomous driving [4][14][25]
Group 1: Industry Trends
- Major players like Huawei, Li Auto, and Xpeng are developing unique software architectures to improve autonomous driving capabilities, indicating a shift towards more complex systems [4][5][14]
- The introduction of new terminologies and models reflects a diversification in approaches to autonomous driving, with no clear standard emerging [4][25]
- The industry is witnessing a split in technological pathways, with some companies focusing on L3 capabilities while others remain at L2, leading to a potential widening of the technology gap [25][26]
Group 2: Data Challenges
- The demand for high-quality data is critical for training large models in the new phase of autonomous driving, but companies face challenges in acquiring and annotating sufficient real-world data [15][22]
- Companies are increasingly turning to simulation and AI-generated data to overcome data scarcity, with some suggesting that simulated data may become more important than real-world data in the future [22][23]
Group 3: Competitive Landscape
- The competition is intensifying as companies with self-developed capabilities advance towards more complex technologies, while others may rely on suppliers, leading to a concentration of orders among a few capable suppliers [26][27]
- The shift towards L3 capabilities will require companies to focus not only on technology but also on operational aspects, as the responsibility for safety and maintenance will shift from users to manufacturers [25][26]
How AI Became Li Auto's No. 1 Project
晚点LatePost· 2025-05-23 07:41
Core Viewpoint
- The article discusses Li Auto's strategic focus on artificial intelligence (AI) and its evolution from a vehicle-centric AI assistant to a multi-platform intelligent application, emphasizing the importance of AI in future competitiveness [4][5][6].
Group 1: Strategic Meetings and AI Prioritization
- Li Auto holds biannual closed-door strategy meetings to discuss future directions, with significant participation from top executives and industry leaders [3].
- Following a strategic meeting, Li Auto adjusted its AI-related business priorities, emphasizing the strategic importance of intelligent driving over other AI applications [4][5].
- The company aims to become a global leader in AI by 2030, with a clear focus on enhancing its AI capabilities and applications [5][6].
Group 2: Development of AI Capabilities
- Li Auto has transitioned its AI assistant, "Li Xiang," from a vehicle-only application to a multi-platform tool, including mobile and web applications [7].
- The company has invested in self-developed algorithms, achieving a full switch to in-house technology for its AI functionalities by March 2023 [7][8].
- The introduction of the multi-modal cognitive model, Mind GPT 1.0, marks a significant advancement in Li Auto's AI capabilities [7].
Group 3: Intelligent Driving and Technological Advancements
- Li Auto's intelligent driving system, AD Max, was launched to address product shortcomings and enhance competitive positioning in the market [10][11].
- The company has initiated a large-scale recruitment drive for its intelligent driving team, reflecting its commitment to advancing this technology [10].
- The shift towards an "end-to-end" model for intelligent driving aims to streamline processes and improve system performance through better data utilization [10][11].
Group 4: Organizational Changes and AI Integration
- Li Auto established an AI Technical Committee to integrate AI capabilities across various business lines, enhancing collaboration and execution [15][16].
- The committee includes leaders from key departments, ensuring that AI is a core focus in strategic decision-making [16][17].
- The company aims to develop a foundational model that serves as a core capability for all AI projects, positioning itself as a leader in the automotive AI landscape [17].
How AI Became Li Auto's No. 1 Project
晚点Auto· 2025-05-22 07:16
Core Viewpoint
- The article discusses Li Auto's strategic shift towards AI and intelligent driving, emphasizing the importance of AI in the company's long-term competitiveness and product development [3][10][12].
Group 1: AI Strategy and Development
- Li Auto held a strategic meeting in October 2022, where the priority of AI-related business was adjusted, emphasizing the strategic importance of intelligent driving [3][5].
- The company aims to become a global leader in AI by 2030, with significant investments in AI talent and technology [5][10].
- Li Auto's AI assistant, "Li Xiang," has evolved from a car-mounted system to a multi-platform application, indicating a broader vision for AI applications beyond the vehicle [7][8].
Group 2: Intelligent Driving Focus
- Intelligent driving was designated as the company's primary strategy in 2023, with plans to heavily invest in this area to compete with major players like Huawei [10][12].
- The company has expanded its intelligent driving team significantly, with over 50 new positions created in late 2023, reflecting a strong commitment to this technology [10][11].
- Li Auto is transitioning its intelligent driving technology from a modular approach to an "end-to-end" model, which is expected to enhance performance and user experience [11][12].
Group 3: Organizational Changes and AI Integration
- An AI Technical Committee was established to integrate AI capabilities across various business lines, indicating a strategic focus on AI as a core business direction [14][15].
- The committee includes leaders from product development and research departments, ensuring that AI applications are aligned with the company's overall strategy [15][16].
- Li Auto's foundational model for AI is seen as a critical component for future developments, with aspirations to rank among the top three in the industry [17][18].
From VLM to VLA: How Far Is Intelligent Driving from Crossing "L2.9999"?
机器之心· 2025-05-18 02:38
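机器之心PRO · Member Newsletter, Week 20
This week's issue examines 2 AI & Robotics industry developments worth a close read.
1. From VLM to VLA: How far is intelligent driving from crossing "L2.9999"?
Why do automakers insist on keeping their intelligent driving marketing "stuck" at L2.999...? Why has end-to-end become the mainstream narrative? When automakers talk about end-to-end, what are they actually talking about? End-to-end intelligent driving is "better in words than in practice": what are the bottlenecks? Why is Tesla widely regarded as leading the market? From VLM to VLA and on to world models, how is autonomous driving technology evolving? ...
2. In the race for the Agent market, is Microsoft really betting on "emotional intelligence"?
Will emotional intelligence become the core competitiveness of Microsoft's AI? How do Microsoft's office Agents differ from its AI companions? Can voice interaction make Copilot stand out and move beyond being a mere tool? How does the "AI personality engineering" proposed by Suleyman create a differentiated advantage? Which two trends in AI compute are under way? ...
The full issue contains 2 feature analyses and 29 AI & Robotics news briefs: 12 on technology, 7 domestic, and 10 international. The issue totals 23,569 characters ...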
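Dialogue on Future Mobility | SenseAuto CEO Wang Xiaogang: The Car Is the Best Carrier for AI; World Models and Simulation Learning Can Break Through Tesla-Style Data Barriers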
Mei Ri Jing Ji Xin Wen· 2025-05-16 04:00
Core Insights
- The automotive industry is transitioning from hardware-focused competition to cognitive capabilities, with a shift towards "software-defined vehicles" and "cognitive reshaping of mobility" [1]
- The evolution of smart cockpits is described in three stages: from a "Q&A tool" to an "all-around assistant," and finally to a "family member" with memory and empathy [1][8]
- The penetration rate of L2-level assisted driving in new cars in China reached 65% in Q1 2025, but challenges such as price wars and self-research trends among car manufacturers are emerging [1]
Company Strategy
- The company positions itself as an AI infrastructure and cloud service provider, deeply integrating with car manufacturers' data and R&D systems [3][19]
- The focus is on the automotive sector as the strongest driver for AI development, leveraging multi-modal large models and world models to enhance capabilities [4][5]
- The company aims to provide cloud services and foundational infrastructure for autonomous driving, shifting the R&D focus from vehicle-based to cloud-based solutions [22]
Technology and Innovation
- The company utilizes a combination of "world models + reinforcement learning" to overcome data limitations and reduce hardware dependency while ensuring system safety [1][10]
- The approach to autonomous driving emphasizes simulation and reconstruction of failure scenarios to improve safety and model generalization [16]
- The company believes that lidar is a temporary requirement and can be replaced as model algorithms and data iterations improve [12][13]
Collaboration with Automakers
- The relationship with automakers is described as a "Tai Chi" model, emphasizing mutual dependence and collaboration rather than a clear-cut supplier-client dynamic [3][18]
- The company has already integrated its products into seven vehicle models and plans to expand its offerings with more affordable solutions [17]
- Data ownership remains with car manufacturers, and the company ensures data privacy through desensitization techniques [21]
Future Outlook
- The company aims to lead in the rapidly evolving field of general artificial intelligence, enhancing user experiences in the automotive sector over the next 3 to 5 years [24]
- The focus will be on developing a platform that supports the AI ecosystem, ensuring that advanced technologies find suitable applications and feedback loops [24]
TMT Industry Monthly Report: Alibaba Expands AI Investment; VLA Model May Reshape the Intelligent Driving Competitive Landscape
HONGTA SECURITIES· 2025-03-06 12:12
Investment Rating
- The investment rating for the communication industry is "Outperform the Market" [1].
Core Insights
- The report highlights significant investments in AI infrastructure by leading companies, with Alibaba announcing a plan to invest 380 billion yuan (approximately 54.5 billion USD) over the next three years, which surpasses its total investment in the past decade [20][24].
- The AI computing power demand is rapidly increasing, with the domestic AI computing scale expected to reach 725.3 EFLOPS in 2024, a year-on-year growth of 74.1%, and projected to reach 2781.9 EFLOPS by 2028 [21][24].
- The report discusses the emergence of the Vision-Language-Action (VLA) model in the autonomous driving sector, which integrates visual input, language reasoning, and action output into a single framework, enhancing the performance of intelligent driving systems [26][30].
Summary by Sections
1. Market Review
- From February 5 to February 28, 2025, the CSI 300 index rose by 1.91%, with the communication industry also increasing by 1.91%, while the computer industry surged by 16.31% [6][13].
- The communication sector experienced significant volatility, benefiting from operators' increased investment in computing power, leading to strong stock performance for companies like China Unicom and China Telecom [6][13].
2. Communication Industry
- Major companies are expanding their AI investments, with Tencent, Baidu, and Alibaba expected to increase their capital expenditures by 19.1% in 2025, reaching 15.42 billion USD [20][24].
- The report notes that the construction of intelligent computing centers is set to accelerate, with over 458 projects announced in the public bidding market for 2024 [24][25].
3. Computer Industry
- The VLA model represents a new direction in autonomous driving technology, improving the ability to process complex traffic scenarios and enhancing decision-making capabilities [26][30].
- The global autonomous driving market is projected to grow from 207.4 billion USD in 2024 to 273.8 billion USD in 2025, with the Chinese market expected to reach 399.3 billion yuan in 2024 [31][32].
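The growth figures quoted in this summary can be cross-checked with a few lines of arithmetic. The sketch below is a reader-side check, not part of the report: the helper name `cagr`, the assumption of smooth compound growth, and the choice of four annual periods between 2024 and 2028 are all assumptions; only the input numbers come from the summary.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values (hypothetical helper)."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the summary.
compute_2024 = 725.3    # EFLOPS, domestic AI computing scale in 2024
compute_2028 = 2781.9   # EFLOPS, projection for 2028
yoy_2024 = 0.741        # stated year-on-year growth for 2024
market_2024 = 207.4     # billion USD, global autonomous driving market
market_2025 = 273.8     # billion USD, projection for 2025

# Implied 2023 base, if 725.3 EFLOPS is 74.1% above the prior year.
implied_2023 = compute_2024 / (1.0 + yoy_2024)

# Implied average annual growth from 2024 to 2028 (assumes 4 compounding periods).
implied_cagr = cagr(compute_2024, compute_2028, years=4)

# Implied one-year growth of the global autonomous driving market.
market_growth = market_2025 / market_2024 - 1.0

print(f"implied 2023 compute: {implied_2023:.1f} EFLOPS")
print(f"implied 2024-2028 compute CAGR: {implied_cagr:.1%}")
print(f"implied 2024-2025 market growth: {market_growth:.1%}")
```

Under these assumptions the projection implies compute growth slowing from 74.1% in 2024 to roughly 40% per year on average through 2028, while the market forecast implies about 32% growth in a single year.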
LatePost Exclusive | Horizon Robotics Reorganizes Again; Su Qing to Lead Rollout of Its Advanced Intelligent Driving Solution
晚点LatePost· 2025-01-18 15:28
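The following article is from 晚点Auto, by the 晚点 team.
"Saturn V" is the project codename for Horizon's advanced intelligent driving R&D. The project is mainly responsible for developing the advanced intelligent driving solution Horizon SuperDrive on Horizon's Journey 6P, and the team is led by Su Qing. Su Qing was previously head of the Intelligent Driving Product Department at Huawei's automotive BU; he is strong in engineering and team management and led the development of ADS 1.0, Huawei's first-generation intelligent driving product. He joined Horizon in October 2022.
This Monday, Horizon held its "Intelligent Driving Technology Imagination Day" event in Shanghai, which was also Su Qing's first public appearance since leaving Huawei in 2022.
Su Qing said at the event that there is no "silver bullet" in autonomous driving R&D. Only by building a solid engineering team, enduring more hardship than others, accumulating more experience, staying more persistent, absorbing the latest technology promptly, and understanding the limits of that technology can Horizon bring advanced intelligent driving innovations into production step by step.
"The success or failure of the advanced intelligent driving solution will determine Horizon's voice in the domestic market."
By Zhao Yu | Edited by Gong Fangyi
We have learned exclusively that Horizon, a Chinese intelligent driving solution provider, has recently made several organizational adjustments, mainly involving the Intelligent Vehicle Business Unit and the "Saturn V" project team. After the adjustments, management authority over Horizon's advanced intelligent driving R&D and engineering teams ...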