MindVLA
Li Xiang: Tesla V14 Also Uses the Same Technology as VLA
自动驾驶之心· 2025-10-19 23:32
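Edited by 理想TOP2 | Reposted from: Li Xiang: Tesla V14 Also Uses the Same Technology as VLA | Condensed illustrated version of the October 18, 2025 Bilibili video
Condensed version: The video runs 21 min 24 s in total, of which 10 min 51 s is spent on Li Xiang's understanding of the five stages defined by OpenAI, with many analogies. He believes OpenAI has done very well at AI applications, models, and the definition of standards.
Chatbots: Backed by a foundation model whose function is to compress humanity's known digital knowledge. This is like a person going through school up to university graduation and laying a knowledge foundation.
Reasoners: Have chain-of-thought ability and can carry out continuous thinking and tasks, relying mainly on SFT and RLHF training. This is like a person attending graduate school or learning under a mentor and gaining experience.
Agents: AI truly starts to work and can use tools to complete long tasks. This places very high demands on the AI's professionalism and reliability (it needs to score 80 to 90 to pass), like a person becoming competent in a professional role.
Innovators: To solve the agents' professionalism problem, reinforcement training is carried out by posing and solving problems. This requires a world model and RLAIF (reinforcement learning from AI feedback) to simulate training in a real environment ...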
Li Xiang: Tesla V14 Also Uses the Same Technology as VLA | Condensed illustrated version of the October 18, 2025 Bilibili video
理想TOP2· 2025-10-18 16:03
Core Viewpoint - The article discusses the five stages of artificial intelligence (AI) as defined by OpenAI, emphasizing the importance of each stage in the development and application of AI technologies [10][11].
Group 1: Stages of AI
- The first stage is Chatbots, backed by a foundation model that compresses human knowledge, akin to a person completing their education [2][14].
- The second stage is Reasoners, which are trained mainly with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to perform continuous reasoning tasks, similar to advanced academic training [3][16].
- The third stage is Agents, where AI begins to perform tasks autonomously, requiring a high level of reliability and professionalism, comparable to a person in a specialized job [4][17].
- The fourth stage is Innovators, focusing on generating and solving problems through reinforcement training, which requires a world model for effective training [5][19].
- The fifth stage is Organizations, which manage multiple agents and innovators to prevent chaos, similar to corporate management [4][21].
Group 2: Compute Needs
- Demand for inference compute is expected to increase 100-fold, while training compute needs may grow 10-fold over the next five years [7][23].
- The article highlights the need for both edge and cloud computing to support the various stages of AI development, particularly in the Agent and Innovator phases [6][22].
Group 3: Li Auto's Self-Developed Technologies
- The company is developing its own reasoning models (MindVLA/MindGPT), agents (Driver Agent/Li Xiang Tong Xue Agent), and world models to enhance its AI capabilities [8][24].
- By 2026, the company plans to equip its autonomous driving technology with self-developed advanced edge chips for deeper integration with AI [9][26].
Group 4: Training and Skill Development
- The article emphasizes the importance of training in three key areas: information processing ability, problem formulation and solving ability, and resource allocation ability [33][36].
- It suggests that effective training requires real-world experience and feedback, akin to the 10,000-hour rule for mastering a profession [29][30].
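To put the five-year compute projection in the summary into yearly terms, the short sketch below converts a total multiplier into the constant annual growth factor that would compound to it. The assumption of constant compound growth is ours for illustration; the summary only gives the five-year totals, and the function name `implied_annual_growth` is hypothetical.

```python
def implied_annual_growth(total_multiplier: float, years: int) -> float:
    """Constant yearly factor that compounds to total_multiplier after `years` years."""
    return total_multiplier ** (1.0 / years)


# Five-year totals quoted in the summary: 100x inference, 10x training.
inference_yearly = implied_annual_growth(100, 5)  # roughly 2.51x per year
training_yearly = implied_annual_growth(10, 5)    # roughly 1.58x per year

print(f"inference compute: {inference_yearly:.2f}x per year")
print(f"training compute:  {training_yearly:.2f}x per year")
```

Under that assumption, inference demand would have to grow about two and a half times every year, while training demand would grow by a little under 60% a year.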
A Recent Work the Head of Li Auto's Foundation Model Team Is Very Pleased With: RuscaRL
理想TOP2· 2025-10-03 09:55
Core Viewpoint - The article discusses the importance of reinforcement learning (RL) in enhancing the intelligence of large models, emphasizing the need for effective interaction between models and their environments to obtain high-quality feedback [1][2].
Summary by Sections
Section 1: Importance of Reinforcement Learning
- The article highlights that RL is crucial for the advancement of large model intelligence, with a focus on how to enable models to interact with broader environments to achieve capability generalization [1][8].
- It mentions various RL techniques such as RLHF (Reinforcement Learning from Human Feedback), RLAIF (Reinforcement Learning from AI Feedback), and RLVR (Reinforcement Learning with Verifiable Rewards) as key areas of exploration [1][8].
Section 2: RuscaRL Framework
- The RuscaRL framework is introduced as a solution to the exploration bottleneck in RL, drawing on scaffolding theory from educational psychology to enhance the reasoning capabilities of large language models (LLMs) [12][13].
- The framework employs explicit scaffolding and verifiable rewards to guide model training and improve response quality [13][15].
Section 3: Mechanisms of RuscaRL
- **Explicit Scaffolding**: This mechanism provides structured guidance through rubrics, helping models generate diverse and high-quality responses while gradually reducing external support as the model's capabilities improve [14].
- **Verifiable Rewards**: RuscaRL designs rewards based on rubrics, allowing for stable and reliable feedback during training, which enhances exploration diversity and ensures knowledge consistency across tasks [15][16].
Section 4: Future Implications
- The article suggests that both MindGPT and MindVLA, which target the digital and physical worlds respectively, could benefit from the advances made through RuscaRL, indicating a promising future for self-evolving models [9][10].
- It emphasizes that the current challenges in RL are not just algorithmic but also involve systemic integration of algorithms and infrastructure, highlighting the need for innovative approaches in building capabilities [9].
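The two mechanisms described in Section 3 can be illustrated with a small sketch. This is not RuscaRL's actual implementation: the `Criterion` type, the weighted-fraction reward, the linear decay schedule, and the sample rubric are all assumptions made for illustration, and the per-criterion pass/fail judgments (which a grader would produce in practice) are simply taken as input.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    description: str  # what this rubric item asks the response to do
    weight: float     # relative importance (hypothetical scale)


def rubric_reward(passed: list[bool], rubric: list[Criterion]) -> float:
    """Weighted fraction of rubric criteria judged satisfied, in [0, 1]."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c, ok in zip(rubric, passed) if ok)
    return earned / total if total > 0 else 0.0


def scaffold_ratio(step: int, total_steps: int) -> float:
    """Share of rubric items revealed as hints; linear decay is an assumption."""
    return max(0.0, 1.0 - step / total_steps)


def visible_hints(rubric: list[Criterion], step: int, total_steps: int) -> list[str]:
    """Rubric descriptions shown to the model at this training step."""
    k = int(scaffold_ratio(step, total_steps) * len(rubric))
    return [c.description for c in rubric[:k]]


# Hypothetical rubric and grader judgments for a single response.
rubric = [
    Criterion("states the main conclusion", 2.0),
    Criterion("cites supporting evidence", 1.0),
    Criterion("lists next steps", 1.0),
]
reward = rubric_reward([True, False, True], rubric)  # (2.0 + 1.0) / 4.0
hints_early = visible_hints(rubric, step=0, total_steps=100)
hints_late = visible_hints(rubric, step=100, total_steps=100)
print(reward, len(hints_early), len(hints_late))
```

The point of the sketch is the shape of the idea: guidance is generous early in training and withdrawn as training proceeds, while the reward stays tied to checkable rubric items throughout.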
An Analysis of Li Auto's Efficient MoE + Sparse Attention Architecture
自动驾驶之心· 2025-08-26 23:32
Core Viewpoint - The article discusses the advanced technologies used in Li Auto's autonomous driving solutions, specifically focusing on the "MoE + Sparse Attention" efficient structure that enhances the performance and efficiency of large models in 3D spatial understanding and reasoning [3][6].
Group 1: Introduction to Technologies
- The article introduces a series of posts that delve deeper into the advanced technologies involved in Li Auto's VLM and VLA solutions, which were only briefly discussed in previous articles [3].
- The focus is on the "MoE + Sparse Attention" structure, which is crucial for improving the efficiency and performance of large models [3][6].
Group 2: Sparse Attention
- Sparse Attention limits the complexity of the attention mechanism by focusing only on key input parts, rather than computing globally, which is particularly beneficial in 3D scenarios [6][10].
- The structure combines local attention and strided attention to create a sparse yet effective attention mechanism, ensuring that each token can quickly propagate information while maintaining local modeling capabilities [10][11].
Group 3: MoE (Mixture of Experts)
- MoE architecture divides computations into multiple expert sub-networks, allowing only a subset of experts to be activated for each input, thus enhancing computational efficiency without significantly increasing inference costs [22][24].
- The article outlines the core components of MoE, including the Gate module for selecting experts, the Experts module as independent networks, and the Dispatcher for optimizing computation [24][25].
Group 4: Implementation and Communication
- The article provides insights into the implementation of MoE using DeepSpeed, highlighting its flexibility and efficiency in handling large models [27][29].
- It discusses the communication mechanisms required for efficient data distribution across multiple GPUs, emphasizing the importance of the all-to-all communication strategy in distributed training [34][37].
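The local-plus-strided attention pattern and the gate-then-experts flow summarized above can be sketched in plain Python. This is an illustrative toy, not Li Auto's code and not the DeepSpeed API: the causal masking, the function names, the scalar "experts", and the parameter values are all assumptions, and the Dispatcher and multi-GPU all-to-all step are left out entirely.

```python
import math


def sparse_attention_mask(n: int, window: int, stride: int) -> list[list[bool]]:
    """mask[i][j] is True when query i may attend to key j (causal, assumed).

    Local part: the `window` most recent positions, including i itself.
    Strided part: earlier positions j where (i - j) is a multiple of `stride`.
    """
    return [
        [j <= i and ((i - j) < window or (i - j) % stride == 0) for j in range(n)]
        for i in range(n)
    ]


def top_k_gate(logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    chosen = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]
    peak = max(logits[e] for e in chosen)  # subtract the max for numerical stability
    exp = {e: math.exp(logits[e] - peak) for e in chosen}
    norm = sum(exp.values())
    return {e: v / norm for e, v in exp.items()}


def moe_forward(x: float, gate_logits: list[float], experts, k: int = 2) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[e](x) for e, w in weights.items())


# Sparse attention: which keys can the last of 8 tokens see?
mask = sparse_attention_mask(n=8, window=2, stride=4)
allowed_last = [j for j, ok in enumerate(mask[7]) if ok]
print(allowed_last)

# MoE: three toy experts; the gate strongly disfavours the third.
experts = [lambda v: v + 1.0, lambda v: 2.0 * v, lambda v: -v]
print(moe_forward(3.0, gate_logits=[0.0, 0.0, -5.0], experts=experts, k=2))
```

With these values the last token attends to three positions instead of all eight, and only two of the three experts are evaluated, which is where the compute saving in both techniques comes from.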
Can the Li i8 Carry Li Xiang's "Pure Electric Dream"?
Xin Lang Cai Jing· 2025-08-02 01:34
Core Viewpoint - The launch of Li Auto's second pure electric model, the Li i8, represents a significant step in the company's pursuit of its "pure electric dream," with a focus on enhanced performance and advanced technology [1][3].
Group 1: Product Launch and Features
- The Li i8 went on sale on July 29 in three versions priced between 321,800 yuan and 369,800 yuan, approximately 30,000 yuan lower than the earlier pre-sale price [1][3].
- The i8 features a longer pure electric range, a lower drag coefficient, and the introduction of the MindVLA autonomous driving architecture, which has been in development for years [3][14].
- The i8 measures 5085mm in length, 1960mm in width, and 1740mm in height, with a wheelbase of 3050mm, providing a spacious interior [9][11].
Group 2: Competitive Landscape
- The i8 enters a competitive market segment for six-seat pure electric SUVs, facing rivals such as the Aito M8, Leapmotor L90, and Tesla Model Y L [4][29].
- The pricing strategy of the i8 is not aggressive, which means it must rely on its overall strength to attract consumers [4][30].
- The market for pure electric models priced above 300,000 yuan is limited, with fewer than 80,000 units sold in the first four months of 2025, indicating a challenging environment for the i8 [29][30].
Group 3: Strategic Adjustments and Organizational Changes
- Following the underperformance of the MEGA model, Li Auto made significant organizational adjustments, merging sales and service teams into a new smart vehicle group to enhance product development [3][21].
- The company has invested approximately 2 billion yuan in design changes for the i8, emphasizing low drag and brand recognition [25][27].
- Li Auto's internal discussions led to a clearer product line strategy, distinguishing the i series from the MEGA brand and focusing on the pure electric SUV market [21][25].
Group 4: Technological Innovations
- The i8 is equipped with a self-developed silicon carbide drive motor, achieving a noise level of just 3.5 decibels at high speeds [14].
- The vehicle's dual motor system delivers a combined power of 400 kW (approximately 544 horsepower) and a maximum torque of 660 Nm, with a 0-100 km/h acceleration time of 4.5 seconds [14][15].
- The MindVLA system, a new vision-language-action (VLA) model, allows the i8 to adapt to driving conditions in real time, enhancing the driving experience [16][18].
Competition Reaches Fever Pitch: The Battle of the Six-Seat Pure Electric SUVs Begins
Zheng Quan Shi Bao Wang· 2025-07-23 03:29
Core Insights - The competition among six-seat pure electric SUVs is intensifying, with models like AITO M8, Tesla Model Y L, and Li Auto i8 showcasing unique selling points to capture market share [1][2][3]
Group 1: Technology and Features
- AITO M8 features the latest HUAWEI ADS4 intelligent driving system, equipped with advanced sensors including a 192-line LiDAR and multiple radar systems, enhancing safety and driving assistance [1]
- Tesla Model Y L is recognized for its Autopilot system, which offers extensive driving assistance features, although it faces challenges in fully utilizing its hardware in the domestic market [1][2]
- Li Auto i8 is expected to incorporate the next-generation MindVLA driving architecture and NVIDIA's Drive AGX Thor-U chip for advanced data processing and decision-making [2]
Group 2: Space and Comfort
- AITO M8 offers a spacious design with dimensions of 5190/1999/1795mm and a wheelbase of 3105mm, providing both five-seat and six-seat configurations, along with a 110L front trunk for added convenience [3]
- Tesla Model Y L emphasizes minimalist design with a large storage compartment in the center console, facilitating organized storage [3]
- Li Auto i8 optimizes space through chassis layout and a multi-layer trunk design, ensuring ample legroom and storage options [3]
Group 3: Performance and Range
- AITO M8 is built on Huawei's 800V high-voltage battery platform, featuring a 100 kWh battery from CATL, with a maximum CLTC range of 705 km and efficient charging capabilities [3]
- Tesla Model Y L offers various range options across different versions, supported by an extensive charging network for both urban commuting and long-distance travel [4]
What Exactly Is the "Action" in VLA? On Diffusion: From Image Generation to End-to-End Trajectory Planning
自动驾驶之心· 2025-07-19 10:19
Core Viewpoint - The article discusses the principles and applications of diffusion models in the context of autonomous driving, highlighting their advantages over generative adversarial networks (GANs) and detailing specific use cases in the industry.
Group 1: Diffusion Model Principles
- Diffusion models are generative models built around denoising: they learn and simulate data distributions through a forward diffusion process and a reverse generation process [2][4].
- The forward diffusion process gradually adds noise to samples from the data distribution, while the reverse generation process removes the noise step by step to recover the original data [5][6].
- The models typically use a Markov chain to describe the state transitions during the noise addition and removal processes [8].
Group 2: Comparison with Generative Adversarial Networks
- Both diffusion models and GANs generate samples starting from random noise, but they differ in their core mechanisms: diffusion models rely on probabilistic modeling of a step-by-step denoising process, while GANs use adversarial training between a generator and a discriminator [20][27].
- Diffusion models are generally more stable during training and produce higher quality samples, especially at high resolutions, compared to GANs, which can suffer from mode collapse and require training multiple networks [27][28].
Group 3: Applications in Autonomous Driving
- Diffusion models are applied in various areas of autonomous driving, including synthetic data generation, scene prediction, perception enhancement, and path planning [29].
- They can generate realistic driving scene data to address the challenges of data scarcity and high annotation costs, particularly for rare scenarios like extreme weather [30][31].
- In scene prediction, diffusion models can forecast dynamic changes in driving environments and generate potential behaviors of traffic participants [33].
- For perception tasks, diffusion models enhance data quality by denoising bird's-eye view (BEV) images and improving sensor data consistency [34][35].
- In path planning, diffusion models support multimodal path generation, enhancing safety and adaptability in complex driving conditions [36].
Group 4: Notable Industry Implementations
- Companies like Haomo Technology and Horizon Robotics are developing advanced algorithms based on diffusion models for real-world applications, achieving state-of-the-art performance in various driving scenarios [47][48].
- The integration of diffusion models with large language models (LLMs) and other technologies is expected to drive further innovations in the autonomous driving sector [46].
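The forward (noising) process described in Group 1 can be made concrete with a small sketch applied to a one-dimensional trajectory. It assumes a DDPM-style formulation with a linear noise schedule, which the summary does not specify; the schedule values, function names, and waypoints are illustrative. No trained denoiser is included: `recover_x0` only shows the algebraic identity that a noise-predicting network's output would be plugged into.

```python
import math
import random


def alpha_bars(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> list[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule (num_steps >= 2)."""
    out, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0: list[float], alpha_bar: float, rng: random.Random):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, noise)."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, noise)], noise


def recover_x0(xt: list[float], noise: list[float], alpha_bar: float) -> list[float]:
    """Invert the closed form given the noise (a trained model would predict it)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(xt, noise)]


rng = random.Random(0)
trajectory = [0.0, 1.5, 3.0, 4.5, 6.0]  # hypothetical waypoints, in metres
bars = alpha_bars(1000)
xt, eps = forward_noise(trajectory, bars[499], rng)  # noised trajectory at step 500
restored = recover_x0(xt, eps, bars[499])
print([round(v, 6) for v in restored])
```

Because each step of the Markov chain adds Gaussian noise, the noised sample at any step can be drawn directly from the clean sample, which is why `forward_noise` needs no loop. Generation runs the other way: starting from pure noise, a learned model estimates the noise at each step and the trajectory is progressively cleaned up.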
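Auto Industry April Investment Strategy: Higher Tariffs May Reshape the Automotive Supply Chain; Watch Auto Shanghai and Earnings Season [Guosen Auto]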
车中旭霞· 2025-04-10 14:48
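Key Industry News
1. Industry developments
The United States has imposed high tariffs on many countries, and China has announced a package of countermeasures combining retaliatory tariffs and non-tariff barriers.
On March 26, US President Trump signed a proclamation at the White House announcing a 25% tariff on imported automobiles. The measure formally took effect on April 3. Auto parts that comply with the USMCA are temporarily exempt until a process is introduced to levy tariffs specifically on the non-US-value portion of such parts. On April 2, Trump followed through by signing two executive orders on "reciprocal tariffs" at the White House; under this list, Chinese products will be subject to an additional 34% tariff. Stacked on top of the earlier 20% fentanyl-related tariff, the rate has risen to 54%. Meanwhile, other countries, including China, are actively formulating and adjusting their response strategies.
Core Views
Monthly production and sales: According to preliminary CPCA statistics, the narrow-sense passenger vehicle retail market in March totaled about 1.85 million units, +9.1% year over year and +33.7% month over month, of which new energy vehicle retail is expected to reach 1 million units, with penetration recovering to 54.1%. Insurance registration data show that in March (3.3-3.30) domestic passenger vehicle registrations totaled 1.6801 million units, +15.0% year over year, and new energy passenger vehicle registrations were 887,800 units, +32.8% year over year. Wholesale data show that in February automobile production and sales were 2.103 million and 2.129 million units, -14.1% and -12.2% month over month and +39.6% and +34.4% year over year; new energy vehicle production and sales reached 888,000 and 892,000 units, year over year +91 ...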
Is VLA a Match for Tesla's V13?
36氪· 2025-04-08 11:05
Core Viewpoint - The entry of Tesla's Full Self-Driving (FSD) technology into the Chinese market has created a sense of urgency and anxiety among domestic autonomous driving companies, as they fear the potential competitive threat posed by Tesla's advanced AI capabilities [1][5][24].
Summary by Sections
Tesla FSD Performance
- Tesla's FSD has shown a mixed performance in China, with instances of both impressive driving capabilities and significant errors, highlighting the challenges of adapting to the complex driving environment in China [2][4].
- The underlying AI technology of Tesla is robust, allowing for smooth driving experiences in regular conditions, but it struggles with unique Chinese traffic scenarios due to a lack of localized data training [4][5].
VLA Model Introduction
- The VLA model has emerged as a promising solution to the shortcomings of the end-to-end model, integrating visual, linguistic, and action capabilities to enhance vehicle understanding of complex driving situations [8][9].
- VLA's ability to interpret traffic signs and pedestrian intentions positions it as a potential game-changer in the autonomous driving landscape, especially if it can effectively address the unique challenges of Chinese roads [8][12].
Competitive Landscape
- Four key players in the domestic market are actively developing VLA technology: Li Auto, Chery, Geely, and Yuanrong Qixing, each with distinct strategies and timelines for implementation [15][16].
- Li Auto's "MindVLA" aims for high accuracy in complex scenarios but faces challenges in managing dual systems, while Chery collaborates with major tech firms to enhance its capabilities [18][19].
- Yuanrong Qixing stands out for its aggressive development and production of VLA technology, positioning itself ahead of competitors in the market [19][21].
Future Outlook
- The competition in the autonomous driving sector is shifting from engineering capabilities to the foundational AI model capabilities, with the upcoming deployment of VLA-equipped vehicles expected to provide clarity on the competitive dynamics between Tesla's FSD and domestic technologies [24][25].
Li Auto (02015) - Voluntary Announcement: March 2025 Delivery Update
2025-04-01 08:30
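Hong Kong Exchanges and Clearing Limited and The Stock Exchange of Hong Kong Limited take no responsibility for the contents of this announcement, make no representation as to its accuracy or completeness, and expressly disclaim any liability whatsoever for any loss howsoever arising from or in reliance upon the whole or any part of the contents of this announcement.
Li Auto Inc. 理想汽車
(A company controlled through weighted voting rights and incorporated in the Cayman Islands with limited liability)
(Stock Code: 2015)
Voluntary Announcement: March 2025 Delivery Update
On April 1, 2025, Li Auto Inc. ("Li Auto" or the "Company") (Nasdaq: LI; HKEX: 2015), a leader in China's new energy vehicle market, announced that it delivered 36,674 vehicles in March 2025, up 26.5% year over year. Deliveries for the first quarter of 2025 totaled 92,864 vehicles, up 15.5% year over year. As of March 31, 2025, Li Auto's cumulative deliveries had reached 1,226,736 vehicles.
In the new energy vehicle market priced at RMB200,000 and above, Li Auto has been the best-selling Chinese automotive brand for 12 consecutive months. As an important cornerstone of Li Auto's rapid path to profitability and to revenue exceeding RMB100 billion, the Li L series is about to reach its one-millionth delivery milestone. The Li MEGA Ultra smart driving refresh edition is now open for reservations, and the Li MEGA will bring further surprises at Auto Shanghai 2025. In March, the Company announced that its self-developed automotive operating system, 理想星環 ...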