MindVLA
The first year of L3 autonomous driving mass production: one step closer to the L4 dream?
Xin Lang Cai Jing· 2025-12-17 06:30
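By 极智GeeTech. The Ministry of Industry and Information Technology (MIIT) recently approved commercial operation of L3 autonomous driving for the first time. The two models that passed the L3 admission application are the Changan Deepal SL03 and the Arcfox Alpha S6, marking the first time China has allowed a vehicle's system to take over the driving task under specific conditions. It is foreseeable that 2026 will truly become the "first year of mass production" for L3 autonomous driving.
Notably, the approval clarifies the division of rights and responsibilities for L3: when a vehicle drives itself on a designated road section at no more than 80 km/h and an accident occurs while the system is active, the carmaker may bear primary responsibility. The admission rules also require that the sensing equipment of L3 vehicles be "factory-installed in mass production"; retrofitted vehicles cannot qualify for the pilot, which safeguards technical stability at the source.
The industry generally regards L3 as an important transition from "assisted driving" to "fully autonomous driving". The subsequent L4 level is expected to bring a bigger breakthrough: within a fixed area, the vehicle can operate entirely without human intervention and become truly driverless.
Behind this small step lies a decade of global technology competition. Germany passed its Autonomous Driving Act as early as 2021, stipulating that the carmaker bears responsibility for accidents while the L3 system is active and requiring vehicles to carry a "black box" that records operating data. Mercedes-Benz's Drive Pilot system then launched on German highways, becoming the world's first commercial L3 product. By comparison, although China's admission started somewhat later, it went straight to the core issue of liability, skipped the old "testing" route, and instead directly launched ...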
Using Li Auto as an example: tracing the evolution of the autonomous driving "brain" - VLA architecture explained
自动驾驶之心· 2025-12-07 02:05
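Author | 我要吃鸡腿 Editor | 自动驾驶之心 Original link: https://zhuanlan.zhihu.com/p/1965839552158623077
In the fast-iterating field of autonomous driving, technical paradigms change at a dizzying pace. Two years ago the industry talked of nothing but BEV (bird's-eye view); last year "end-to-end" became the new technical high ground. Yet each paradigm, while solving old problems, seems to create new challenges.
Traditional "end-to-end" autonomous driving, i.e. the VA (Vision-Action) model, exposes a deep contradiction: it is like a highly skilled but taciturn "veteran driver". Relying on "intuition" trained from massive data, it can pull off impressively smooth maneuvers in complex road conditions. But when you, sitting in the passenger seat, ask it after your heart skips a beat, "Why did you suddenly slow down just now?", it has no answer.
This is the "black box" problem: the system can "get it right", but we do not know "why it got it right". This inability to explain or communicate creates a serious crisis of trust.
The evolution of the three autonomous driving paradigms. (a) ...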
Li Xiang: Tesla's V14 also uses the same technology as VLA
自动驾驶之心· 2025-10-19 23:32
Core Insights - The article discusses the five stages of artificial intelligence (AI) as defined by OpenAI, emphasizing the importance of each stage in the development and application of AI technologies [17][18].
Group 1: Stages of AI Development
- The first stage is Chatbots, which serve as a foundational model that compresses human knowledge, akin to a person completing their education [19][4].
- The second stage is Reasoners, which utilize supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to perform continuous reasoning tasks, similar to advanced academic training [20][21].
- The third stage is Agents, where AI begins to perform tasks autonomously, requiring a high level of professionalism and reliability, comparable to a person in a specialized job [22][23].
- The fourth stage is Innovators, focusing on the ability to generate and solve problems through real-world training and feedback, which is essential for enhancing the capabilities of AI [25][26].
- The fifth stage is Organizations, which manage multiple agents and innovations to prevent chaos, similar to how businesses manage human resources [27][28].
Group 2: Computational Needs
- The demand for inference compute is expected to increase 100-fold in the next five years, while training compute needs may expand 10-fold [10][29].
- The article highlights the necessity for both edge computing and cloud-based processing to support the various stages of AI development [28][29].
Group 3: Li Auto Applications
- Li Auto is developing its own reasoning models (MindVLA/MindGPT) and agents (Driver Agent/Lixiang Tongxue Agent) to enhance its autonomous driving capabilities [31][33].
- By 2026, the company plans to equip its autonomous vehicles with self-developed advanced edge chips for deeper integration with AI [12][33].
Group 4: Training and Skill Development
- Effective training for AI involves enhancing three key abilities: information processing, problem formulation and solving, and resource allocation [39][40][41].
- The article emphasizes that successful AI applications require extensive training, akin to the 10,000 hours of practice needed for mastery in a profession [36][42].
Li Xiang: Tesla's V14 also uses the same technology as VLA | October 18, 2025, Bilibili illustrated, condensed edition
理想TOP2· 2025-10-18 16:03
Core Viewpoint - The article discusses the five stages of artificial intelligence (AI) as defined by OpenAI, emphasizing the importance of each stage in the development and application of AI technologies [10][11].
Group 1: Stages of AI
- The first stage is Chatbots, which serve as a foundational model that compresses human knowledge, akin to a person completing their education [2][14].
- The second stage is Reasoners, which utilize supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to perform continuous reasoning tasks, similar to advanced academic training [3][16].
- The third stage is Agents, where AI begins to perform tasks autonomously, requiring a high level of reliability and professionalism, comparable to a person in a specialized job [4][17].
- The fourth stage is Innovators, focusing on generating and solving problems through reinforcement training, necessitating a world model for effective training [5][19].
- The fifth stage is Organizations, which manage multiple agents and innovations to prevent chaos, similar to corporate management [4][21].
Group 2: Computational Needs
- The demand for inference compute is expected to increase 100-fold, while training compute needs may expand 10-fold over the next five years [7][23].
- The article highlights the necessity for both edge and cloud computing to support the various stages of AI development, particularly in the Agent and Innovator phases [6][22].
Group 3: Li Auto's Self-Developed Technologies
- Li Auto is developing its own reasoning models (MindVLA/MindGPT), agents (Driver Agent/Lixiang Tongxue Agent), and world models to enhance its AI capabilities [8][24].
- By 2026, the company plans to equip its autonomous driving technology with self-developed advanced edge chips for deeper integration with AI [9][26].
Group 4: Training and Skill Development
- The article emphasizes the importance of training in three key areas: information processing ability, problem formulation and solving ability, and resource allocation ability [33][36].
- It suggests that effective training requires real-world experience and feedback, akin to the 10,000-hour rule for mastering a profession [29][30].
A recent work that the head of Li Auto's foundation model team is very pleased with: RuscaRL
理想TOP2· 2025-10-03 09:55
Core Viewpoint - The article discusses the importance of reinforcement learning (RL) in enhancing the intelligence of large models, emphasizing the need for effective interaction between models and their environments to obtain high-quality feedback [1][2].
Summary by Sections
Section 1: Importance of Reinforcement Learning
- The article highlights that RL is crucial for the advancement of large model intelligence, with a focus on how to enable models to interact with broader environments to achieve capability generalization [1][8].
- It mentions various RL techniques such as RLHF (Reinforcement Learning from Human Feedback), RLAIF (Reinforcement Learning from AI Feedback), and RLVR (Reinforcement Learning with Verifiable Rewards) as key areas of exploration [1][8].
Section 2: RuscaRL Framework
- The RuscaRL framework is introduced as a solution to the exploration bottleneck in RL, drawing on scaffolding theory from educational psychology to enhance the reasoning capabilities of large language models (LLMs) [12][13].
- The framework employs explicit scaffolding and verifiable rewards to guide model training and improve response quality [13][15].
Section 3: Mechanisms of RuscaRL
- **Explicit Scaffolding**: This mechanism provides structured guidance through rubrics, helping models generate diverse and high-quality responses while gradually reducing external support as the model's capabilities improve [14].
- **Verifiable Rewards**: RuscaRL designs rewards based on rubrics, allowing for stable and reliable feedback during training, which enhances exploration diversity and ensures knowledge consistency across tasks [15][16].
Section 4: Future Implications
- The article suggests that both MindGPT and MindVLA, which target the digital and physical worlds respectively, could benefit from the advancements made through RuscaRL, indicating a promising future for self-evolving models [9][10].
- It emphasizes that the current challenges in RL are not just algorithmic but also involve systemic integration of algorithms and infrastructure, highlighting the need for innovative approaches in building capabilities [9].
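The two mechanisms in Section 3 can be illustrated with a small sketch. This is not RuscaRL's actual implementation: the `Criterion` type, the weighted-fraction reward, the linear decay schedule, and the prompt layout are all assumptions made for illustration, and in a real system a grader model, not a list of booleans, would judge whether each criterion is met.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One rubric item (hypothetical structure, not from the paper)."""
    description: str
    weight: float  # assumed positive


def rubric_reward(criteria, satisfied):
    """Reward = weighted fraction of rubric criteria judged as satisfied."""
    total = sum(c.weight for c in criteria)
    earned = sum(c.weight for c, ok in zip(criteria, satisfied) if ok)
    return earned / total


def scaffold_fraction(step, total_steps):
    """Share of the rubric shown to the model; assumed to decay linearly to 0."""
    return max(0.0, 1.0 - step / total_steps)


def build_prompt(question, criteria, step, total_steps):
    """Attach the first part of the rubric as explicit guidance, less over time."""
    n_hints = round(scaffold_fraction(step, total_steps) * len(criteria))
    hints = [c.description for c in criteria[:n_hints]]
    if not hints:
        return question
    return question + "\nChecklist:\n" + "\n".join(f"- {h}" for h in hints)
```

Early in training the prompt carries the whole checklist, and by the final step the model sees only the question, while the reward is still computed against the full rubric.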
An analysis of Li Auto's efficient MoE + Sparse Attention architecture
自动驾驶之心· 2025-08-26 23:32
Core Viewpoint - The article discusses the advanced technologies used in Li Auto's autonomous driving solutions, specifically focusing on the "MoE + Sparse Attention" efficient structure that enhances the performance and efficiency of large models in 3D spatial understanding and reasoning [3][6].
Group 1: Introduction to Technologies
- The article introduces a series of posts that delve deeper into the advanced technologies involved in Li Auto's VLM and VLA solutions, which were only briefly discussed in previous articles [3].
- The focus is on the "MoE + Sparse Attention" structure, which is crucial for improving the efficiency and performance of large models [3][6].
Group 2: Sparse Attention
- Sparse Attention limits the complexity of the attention mechanism by focusing only on key input parts, rather than computing globally, which is particularly beneficial in 3D scenarios [6][10].
- The structure combines local attention and strided attention to create a sparse yet effective attention mechanism, ensuring that each token can quickly propagate information while maintaining local modeling capabilities [10][11].
Group 3: MoE (Mixture of Experts)
- MoE architecture divides computations into multiple expert sub-networks, allowing only a subset of experts to be activated for each input, thus enhancing computational efficiency without significantly increasing inference costs [22][24].
- The article outlines the core components of MoE, including the Gate module for selecting experts, the Experts module as independent networks, and the Dispatcher for optimizing computation [24][25].
Group 4: Implementation and Communication
- The article provides insights into the implementation of MoE using DeepSpeed, highlighting its flexibility and efficiency in handling large models [27][29].
- It discusses the communication mechanisms required for efficient data distribution across multiple GPUs, emphasizing the importance of the all-to-all communication strategy in distributed training [34][37].
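The two building blocks summarized above can be sketched in a few lines of NumPy. This is a minimal single-device illustration and not Li Auto's or DeepSpeed's code: the function names, the non-causal mask, the `tanh` experts, and the top-k softmax gate are all assumptions chosen to keep the example short. The mask combines a local window with a strided pattern, and the MoE layer shows the Gate, Experts, and Dispatcher roles.

```python
import numpy as np


def sparse_attention_mask(seq_len, window, stride):
    """Boolean mask; entry [i, j] is True if query i may attend to key j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    local = np.abs(i - j) <= window      # local attention: nearby tokens
    strided = (i - j) % stride == 0      # strided attention: every stride-th token
    return local | strided


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_forward(x, gate_w, expert_ws, top_k=2):
    """x: (tokens, d), gate_w: (d, n_experts), expert_ws: (n_experts, d, d)."""
    # Gate: score every expert per token and keep only the top_k of them.
    logits = x @ gate_w
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    weights = softmax(top_logits, axis=-1)

    out = np.zeros_like(x)
    for e in range(expert_ws.shape[0]):
        # Dispatcher: collect the tokens routed to expert e.
        token_ids, slot = np.nonzero(top_idx == e)
        if token_ids.size == 0:
            continue  # this expert stays idle for the batch
        # Expert: an independent sub-network (here one tanh layer).
        expert_out = np.tanh(x[token_ids] @ expert_ws[e])
        out[token_ids] += weights[token_ids, slot][:, None] * expert_out
    return out, top_idx
```

In distributed training the per-expert loop is replaced by an all-to-all exchange that ships each token to the GPU hosting its expert and ships the result back; the routing logic stays the same.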
Can the Li i8 carry Li Xiang's "pure electric dream"?
Xin Lang Cai Jing· 2025-08-02 01:34
Core Viewpoint - The launch of Li Auto's second pure electric model, the Li i8, represents a significant step in the company's pursuit of its "pure electric dream," with a focus on enhanced performance and advanced technology [1][3].
Group 1: Product Launch and Features
- The Li i8 officially went on sale on July 29 in three versions priced between 321,800 yuan and 369,800 yuan, approximately 30,000 yuan lower than the earlier pre-sale price [1][3].
- The i8 offers a longer pure electric range and a lower drag coefficient, and introduces the MindVLA autonomous driving architecture, which has been in development for years [3][14].
- The i8 measures 5085mm in length, 1960mm in width, and 1740mm in height, with a wheelbase of 3050mm, providing a spacious and comfortable interior [9][11].
Group 2: Competitive Landscape
- The i8 enters a competitive market segment for six-seat pure electric SUVs, facing rivals such as the Aito M8, Onvo L90, and Tesla Model Y L [4][29].
- The pricing strategy of the i8 is not aggressive, which means it must rely on its overall strength to attract consumers [4][30].
- The market for pure electric models priced above 300,000 yuan is limited, with fewer than 80,000 units sold in the first four months of 2025, indicating a challenging environment for the i8 [29][30].
Group 3: Strategic Adjustments and Organizational Changes
- Following the underperformance of the MEGA model, Li Auto made significant organizational adjustments, merging sales and service teams into a new smart vehicle group to enhance product development [3][21].
- The company has invested approximately 2 billion yuan in design changes for the i8, emphasizing low drag and brand recognition [25][27].
- Li Auto's internal discussions led to a clearer product line strategy, distinguishing the i series from the MEGA brand and focusing on the pure electric SUV market [21][25].
Group 4: Technological Innovations
- The i8 is equipped with a self-developed silicon carbide drive motor, achieving a noise level of just 3.5 decibels at high speeds [14].
- The vehicle's dual motor system delivers a combined power of 400 kW (approximately 544 horsepower) and a maximum torque of 660 Nm, with a 0-100 km/h acceleration time of 4.5 seconds [14][15].
- The MindVLA system, a new vision-language-action (VLA) model, allows the i8 to adapt to driving conditions in real time, enhancing the driving experience [16][18].
Competition heats up as the six-seat pure electric SUV battle begins
Zheng Quan Shi Bao Wang· 2025-07-23 03:29
Core Insights - The competition among six-seat pure electric SUVs is intensifying, with models like AITO M8, Tesla Model Y L, and Li Auto i8 showcasing unique selling points to capture market share [1][2][3]
Group 1: Technology and Features
- AITO M8 features the latest HUAWEI ADS4 intelligent driving system, equipped with advanced sensors including a 192-line LiDAR and multiple radar systems, enhancing safety and driving assistance [1]
- Tesla Model Y L is recognized for its Autopilot system, which offers extensive driving assistance features, although it faces challenges in fully utilizing its hardware in the domestic market [1][2]
- Li Auto i8 is expected to incorporate the next-generation MindVLA driving architecture and NVIDIA's Drive AGX Thor-U chip for advanced data processing and decision-making [2]
Group 2: Space and Comfort
- AITO M8 offers a spacious design with dimensions of 5190/1999/1795mm and a wheelbase of 3105mm, providing both five-seat and six-seat configurations, along with a 110L front trunk for added convenience [3]
- Tesla Model Y L emphasizes minimalist design with a large storage compartment in the center console, facilitating organized storage [3]
- Li Auto i8 optimizes space through chassis layout and a multi-layer trunk design, ensuring ample legroom and storage options [3]
Group 3: Performance and Range
- AITO M8 is built on Huawei's 800V high-voltage battery platform, featuring a 100 kWh battery from CATL, with a maximum CLTC range of 705 km and efficient charging capabilities [3]
- Tesla Model Y L offers various range options across different versions, supported by an extensive charging network for both urban commuting and long-distance travel [4]
What exactly is the "Action" in VLA? On diffusion: from image generation to end-to-end trajectory planning
自动驾驶之心· 2025-07-19 10:19
Core Viewpoint - The article discusses the principles and applications of diffusion models in the context of autonomous driving, highlighting their advantages over generative adversarial networks (GANs) and detailing specific use cases in the industry.
Group 1: Diffusion Model Principles
- Diffusion models are generative models built around denoising; they learn and simulate data distributions through a forward diffusion process and a reverse generation process [2][4].
- The forward diffusion process gradually adds noise to samples from the initial data distribution, while the reverse generation process removes the noise step by step to recover the original data [5][6].
- The models typically use a Markov chain to describe the state transitions during the noise addition and removal processes [8].
Group 2: Comparison with Generative Adversarial Networks
- Both diffusion models and GANs generate samples starting from random noise, but they differ in their core mechanisms: diffusion models rely on probabilistic modeling of a gradual noising and denoising process, while GANs use adversarial training between a generator and a discriminator [20][27].
- Diffusion models are generally more stable during training and produce higher quality samples, especially at high resolutions, compared to GANs, which can suffer from mode collapse and require training two competing networks [27][28].
Group 3: Applications in Autonomous Driving
- Diffusion models are applied in various areas of autonomous driving, including synthetic data generation, scene prediction, perception enhancement, and path planning [29].
- They can generate realistic driving scene data to address the challenges of data scarcity and high annotation costs, particularly for rare scenarios like extreme weather [30][31].
- In scene prediction, diffusion models can forecast dynamic changes in driving environments and generate potential behaviors of traffic participants [33].
- For perception tasks, diffusion models enhance data quality by denoising bird's-eye view (BEV) images and improving sensor data consistency [34][35].
- In path planning, diffusion models support multimodal path generation, enhancing safety and adaptability in complex driving conditions [36].
Group 4: Notable Industry Implementations
- Companies like Haomo Technology and Horizon Robotics are developing advanced algorithms based on diffusion models for real-world applications, achieving state-of-the-art performance in various driving scenarios [47][48].
- The integration of diffusion models with large language models (LLMs) and other technologies is expected to drive further innovations in the autonomous driving sector [46].
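The forward and reverse processes described in Group 1 can be made concrete with a short sketch applied to a trajectory of 2D waypoints. It assumes the standard DDPM formulation with a linear noise schedule, which the summary does not confirm the article uses; the function names and the waypoint layout are illustrative only. In a real planner, a trained network would predict the noise, conditioned on the driving scene.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_t (schedule values are an assumption)."""
    return np.linspace(beta_start, beta_end, num_steps)


def cumulative_alpha(betas):
    """alpha_bar_t = prod_{s<=t} (1 - beta_s): how much signal survives at step t."""
    return np.cumprod(1.0 - betas)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) in one shot."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


def predict_x0(xt, t, alpha_bar, predicted_noise):
    """Invert the forward step given a noise estimate (from a trained denoiser)."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * predicted_noise) / np.sqrt(alpha_bar[t])
```

Because the Markov chain of Gaussian steps has a closed form, training can jump straight to any step `t` instead of simulating the chain. Sampling different initial noise at inference time yields different plausible trajectories, which is what makes multimodal path generation possible.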
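Auto industry investment strategy for April: additional tariffs may reshape the automotive supply chain; watch the Shanghai Auto Show and the earnings season [Guosen Auto]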
车中旭霞· 2025-04-10 14:48
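Key industry news
1. Industry developments
The United States imposes high tariffs on many countries; China has announced a package of countermeasures combining retaliatory tariffs and non-tariff barriers
On March 26, US President Trump signed a proclamation at the White House imposing a 25% tariff on imported automobiles. The measure took effect on April 3. Auto parts that comply with the USMCA are temporarily exempt until a process is established for taxing the non-US-content portion of such parts. On April 2, Trump followed through by signing two executive orders on "reciprocal tariffs" at the White House; under this list, Chinese products will face an additional 34% tariff. Stacked on the earlier 20% fentanyl-related tariff, the rate has risen to 54%. Meanwhile, other countries, including China, are actively formulating and adjusting their responses.
Core views
Monthly production and sales: According to preliminary CPCA statistics, the narrow-sense passenger vehicle retail market in March was about 1.85 million units, +9.1% year on year and +33.7% month on month; new energy vehicle retail sales are expected to reach 1 million units, with penetration recovering to 54.1%. Insurance registration data show 1.6801 million domestic passenger vehicle registrations in March (March 3 to March 30), +15.0% year on year, of which 887,800 were new energy passenger vehicles, +32.8% year on year. Wholesale data show February vehicle production and sales of 2.103 million and 2.129 million units, -14.1% and -12.2% month on month and +39.6% and +34.4% year on year; new energy vehicle production and sales reached 888,000 and 892,000 units, year on year +91 ...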