理想TOP2
Li Auto DrivingScene: Real-Time Reconstruction of Dynamic Driving Scenes from Two Frames
理想TOP2· 2025-11-02 09:08
Research Background and Challenges
- The safety and reliability of autonomous driving systems depend heavily on 4D dynamic scene reconstruction, that is, real-time, high-fidelity environmental perception in 3D space plus the time dimension.
- The industry faces two core contradictions. One is the limitation of static feedforward solutions, which assume "no dynamics in the scene" and therefore produce severe artifacts around moving targets such as vehicles and pedestrians, making them unsuitable for real driving scenarios [1].
Core Innovations
- Harbin Institute of Technology, together with Li Auto and other research teams, proposes three key designs that unify "real-time performance, high fidelity, and multi-task output" [2].
Related Work Overview
- Static driving scene reconstruction methods include DrivingForward, pixelSplat, MVSplat, and DepthSplat, all of which adapt poorly to dynamic environments [3].
Key Technical Solutions
- Two-stage training paradigm: a robust static scene prior is first learned from large-scale data, and only then is the dynamic module trained. This avoids the instability of end-to-end training and reduces the complexity of dynamic modeling [4].
- Hybrid shared architecture with a residual flow network: a shared depth encoder and a single-camera decoder predict only the non-rigid motion residuals of dynamic objects, which keeps scale consistent across views and keeps computation efficient [4].
- Vision-only online feedforward framework: it takes two consecutive panoramic images as input and outputs 3D Gaussian point clouds, depth maps, and scene flow in real time, meeting the online perception needs of autonomous driving without offline optimization or multi-modal sensors [4].
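The idea of predicting "only the non-rigid motion residuals" can be illustrated with a minimal sketch. It assumes, and this is an interpretation rather than something the summary states, that the total scene flow of a 3D point is the rigid flow induced by ego-motion plus a learned residual that is near zero for static background. The function names `rigid_flow` and `total_flow` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def rigid_flow(point: Vec3, rotation: List[List[float]], translation: Vec3) -> Vec3:
    """Flow that ego-motion alone induces on a static point: (R @ p + t) - p."""
    moved = tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return tuple(moved[i] - point[i] for i in range(3))


def total_flow(point: Vec3, rotation: List[List[float]],
               translation: Vec3, residual: Vec3) -> Vec3:
    """Assumed decomposition: rigid flow plus a predicted non-rigid residual."""
    rigid = rigid_flow(point, rotation, translation)
    return tuple(rigid[i] + residual[i] for i in range(3))


# Illustrative values only: no rotation, and the scene shifts 1 m toward the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ego_shift = (0.0, 0.0, -1.0)

# A static point needs no residual, so its flow is purely rigid.
static_flow = total_flow((2.0, 0.0, 10.0), identity, ego_shift, (0.0, 0.0, 0.0))

# A vehicle pulling away contributes its own motion through the residual.
moving_flow = total_flow((2.0, 0.0, 10.0), identity, ego_shift, (0.0, 0.0, 1.5))
```

Under this assumption the network only has to explain what the static prior cannot, which is consistent with the summary's claim that the design reduces the complexity of dynamic modeling.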
Experimental Validation and Results Analysis
- Quantitatively, the method significantly outperforms existing feedforward baselines: it reaches a PSNR of 28.76, which is 2.66 dB higher than Driv3R and 2.7 dB higher than DrivingForward, and an SSIM of 0.895, indicating better rendering fidelity [28].
- In efficiency, inference takes 0.21 seconds per frame, which is 38% faster than DrivingForward and 70% faster than Driv3R. Training takes approximately 5 days and uses 27.3 GB of VRAM, significantly less than Driv3R [30].
- Ablation studies confirm that the residual flow network, two-stage training, and flow distortion loss are all necessary, each playing a critical role in dynamic modeling and rendering quality [32][34].
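To put the reported PSNR gaps in perspective, the sketch below uses the standard PSNR definition, PSNR = 10 * log10(MAX^2 / MSE). It assumes both methods are scored against the same peak value, so a gap in dB maps directly to a ratio of mean squared errors. Because benchmarks often average PSNR per image, the ratio is only a rough guide. The helper names are illustrative.

```python
import math


def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_from_psnr_gap(gap_db: float) -> float:
    """How many times smaller the MSE is for a method that leads by gap_db."""
    return 10.0 ** (gap_db / 10.0)


# The 2.66 dB lead over Driv3R quoted above, expressed as an MSE ratio.
ratio_vs_driv3r = mse_ratio_from_psnr_gap(2.66)
```

Read this way, a lead of about 2.7 dB corresponds to a mean squared error roughly 1.8 times smaller, under the stated assumptions.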
A Deeper Analysis of Horizon HSD and Li Auto VLA After Talking with Several People
理想TOP2· 2025-11-02 09:08
Core Viewpoints
- The article presents eight key viewpoints on the performance and evaluation of autonomous driving technologies, focusing on experiences with Horizon's HSD and Li Auto's VLA systems [2].
Group 1: Performance Evaluation
- During a 1.5-hour test drive in Hangzhou, TOP2 found the Horizon HSD software experience significantly better than the current production version of VLA on Li Auto's L7 [2].
- The production version of Horizon's software may not perform as well as the engineering version experienced during the test [2].
- Evaluation of autonomous driving systems is limited by the number of test experiences: a few tests cannot be generalized to performance across different regions [3].
Group 2: Technical Architecture
- Horizon employs a VA-style end-to-end system, while Li Auto uses a VLA-style end-to-end system; the naming is a minor distinction [3][9].
- Given current computing power and bandwidth limitations, VA-style systems may have an advantage in user experience [6].
- Li Auto's decision to adopt a VLA-style system is seen as a courageous move, since it requires significant resources and presents various challenges [14].
Group 3: Market Dynamics
- The future landscape of autonomous driving operators is uncertain. The prevailing belief is that only a few companies will survive, particularly those able to develop their technology in-house [4].
- Companies without in-house autonomous driving research capabilities may struggle to adapt to the evolving smart vehicle industry [4].
- The article emphasizes that autonomous driving is not merely a selling point but a differentiating capability, one that can lead to high market concentration because of its low marginal cost [4].
Group 4: User Experience Insights
- Feedback from Horizon personnel indicated that their systems perform only averagely in extreme weather and complex scenarios, highlighting the need for comprehensive testing [5][6].
- Test drive experiences varied significantly with the vehicle model and its chip capabilities, indicating that performance can be inconsistent [7].
- The article suggests that the perception of Horizon's HSD performance may be overly positive because of selective testing locations and conditions [8].
More Details on How the MEGA Recall Decision Was Made
理想TOP2· 2025-11-01 04:42
Core Viewpoint
- The company acknowledges a significant incident involving battery thermal runaway and emphasizes the importance of safety and proactive measures in vehicle management [2][5].
Incident Analysis
- The company has delivered over 1.4 million vehicles without any thermal runaway incidents due to external factors, which it attributes to robust quality control and an AI-based quality warning system [2].
- The cloud system reported a battery insulation fault more than four hours before the incident, and the vehicle had already broken down because of battery issues [3].
- The failure to act immediately on these warnings is attributed to complacency, since the company had not previously encountered such issues [3].
Cause of the Incident
- The root cause of the insulation short circuit was found to be not the battery cells themselves but corrosion of the aluminum plate caused by inadequate coolant protection [4].
- The company recognizes the need for zero tolerance toward safety risks, even those perceived as low probability [4].
Recall Decision
- Company leaders reached a consensus to initiate a recall and replace the affected components, prioritizing safety over cost [5].
- The recall process was expedited, with preparations for producing new batteries and motor controllers already underway [5][6].
Production Capacity Challenges
- Current battery production capacity is 3,300 units per month, so suppliers need to ramp up production for the recall [6].
Leadership Involvement
- Notably, the company's founder did not participate in the recall decision meetings, indicating a unified commitment to safety among the leadership team [6].
User Communication
- The company sincerely apologizes to users affected by the incident and acknowledges the inconvenience caused [8].
An Analysis of Li Auto's 31,767 Deliveries in October 2025
理想TOP2· 2025-11-01 04:42
Core Insights
- The delivery figure of 31,767 units for October 2025 is considered low; model-by-model figures are expected to become clear by November 10 [1].
- The i8 faces production capacity issues linked to its lower-trim selection rate, which is only around 2% [2].
- L series orders are underperforming, with hypotheses including competition, product iteration speed, and economic conditions [7].
Group 1: Delivery and Production Issues
- The October 2025 delivery number is low, and model-specific data is expected to be available later [1].
- The i8's production capacity is constrained by its lower-trim selection rate, which is far below that of other models [2][5].
- The Sunwoda battery version of the i6 will not be delivered in 2025, further complicating production capacity [6].
Group 2: Model Configuration and Market Dynamics
- Trim distribution differs significantly across models, with the i8 showing a much lower share of lower-trim orders than the L series [5].
- The L series distribution shows varied consumer preferences, with certain configurations being more popular [3].
- The underperformance of L series orders may be linked to multiple factors, including competitive pressure and market conditions [7].
Group 3: Future Expectations and Strategic Decisions
- There is speculation that i6 order volume may exceed expectations, suggesting potential adjustments to production strategy [6].
- The company may consider a joint venture with Sunwoda for battery production to address future supply chain challenges [6].
- Overall, the article sees a need for stronger products and clearer communication of their value to improve market performance [7].
Values Led Li Auto to Recall the 2024 MEGA, and the Handling Style That Followed
理想TOP2· 2025-10-31 09:31
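On October 31, 2025, Li Auto stated on Weibo:
At the same time, we immediately launched an internal investigation and analysis, and re-examined the records of the cloud early-warning system and the dedicated verification data. The results show that, in Li MEGA 2024 vehicles from the same batch as the accident vehicle, the coolant of that batch has insufficient anti-corrosion performance. Under specific conditions this can cause the aluminum cooling plates of the traction battery and the front motor controller in the cooling circuit to corrode and leak, leading to warning lights, limited power, and failure to power on. In extreme cases it can cause thermal runaway of the traction battery, which is a safety hazard.
On the evening of October 23, 2025, a Li MEGA 2024 vehicle caught fire in Shanghai, drawing close attention from users, the media, and the public. We first offer our sincere apologies to the owner, and we understand the worries and concerns of our users.
After the incident, we contacted the owner immediately and actively cooperated with the relevant authorities in their investigation. Because the accident vehicle must be examined and tested jointly by the user, the fire department, and the relevant institutions, the process must follow strict procedures and takes a long time. As of now, no final technical conclusion has been reached.
Safety always comes first at Li Auto. Acting on the principle of being highly responsible for user safety and having zero tolerance for potential hazards, we have proactively filed a recall plan with the State Administration for Market Regulation to carry out safety inspection and replacement repairs on all Li MEGA 2024 vehicles from the same batch as the accident vehicle. We will spare no effort to find and eliminate every risk, to ensure that hazards are ...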
Li Xiang on How He Views Li Auto Being Valued as a Car Company
理想TOP2· 2025-10-30 06:34
Core Viewpoint
- The company positions itself as an artificial intelligence terminal company, but its performance still relies heavily on vehicle sales, which raises questions about how AI development relates to sales performance [1][2].
Group 1: AI Development and Business Model
- The company aims to achieve Level 4 autonomous driving and emphasizes that the true value of AI will be realized when users can do other things during their commutes [1].
- The company is exploring whether it can generate $100 billion in revenue with a significantly smaller workforce; successful AI implementation of this kind would validate its strategic direction [2].
Group 2: Product and Technology Strategy
- The company is diversifying into operating systems, chips, and foundation models, similar to Apple's early product ecosystem, and argues that this expansion is reasonable given its revenue scale [2].
- The company believes these technology investments will lead to substantial cost savings, making the diversification financially beneficial [2].
Group 3: Risk Factors and Organizational Capability
- The company identifies three critical factors that could lead to its failure: failing to understand user needs, lacking superior products and technology, and significant problems in organizational capability [3].
- These factors are interrelated, and all three must be assessed together for effective risk management and strategic planning [3].
Li Auto's Zhan Kun at ICCV'25: World Models, from Data Closed Loop to Training Closed Loop (Slides)
理想TOP2· 2025-10-28 15:18
Core Insights
- The article discusses the evolution of autonomous driving technology, emphasizing the transition from data closed-loop systems to training closed-loop systems, which focus on real-world utility and on evaluating progress [13][14].
Group 1: Data and Infrastructure
- The company has accumulated 1.5 billion kilometers of driving data, which is crucial for training autonomous systems [8].
- A closed-loop data system is in place, using over 200 triggers to collect training data, with clips ranging from 15 to 45 seconds [8].
- The data scaling law shows a significant increase in the number of clips used for training, with projections of up to 600 million clips by 2025 [10].
Group 2: Technology Stack
- The key technology stack for autonomous driving includes regional-scale simulation, synthetic data, reinforcement learning, and multimodal generation [18].
- The focus is on improving simulation quality through techniques such as scene reconstruction and traffic agent modeling [18][19].
- Simulation is moving from reconstruction to generation, using diffusion models to improve scene generation [19].
Group 3: Training and Evaluation
- The article emphasizes building a training closed loop that integrates various models, including VLA (Vision-Language-Action) and reinforcement learning [15].
- The evaluation environment and reward system are critical for assessing the performance of autonomous driving systems [14][35].
- Interactive agents are identified as a key challenge in the training closed loop, requiring accurate feedback and generalization ability [38][40].
Group 4: Future Directions
- The company is working on various projects to enhance both reconstruction and generation capabilities, with milestones set for 2024 and 2025 [21][24].
- Generated data is applied to scene editing, scene transfer, and scene generation, which are essential for improving the realism of simulations [27][33].
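As a purely illustrative sketch of the trigger-based data collection described above, the snippet below cuts a clip around a trigger event and clamps its length to the 15 to 45 second range mentioned in the summary. Everything else is an assumption made for illustration: the `Clip` type, the `cut_clip` helper, the trigger names, and the rule of preserving the before/after proportion when clamping are hypothetical and do not describe Li Auto's actual pipeline.

```python
from dataclasses import dataclass

MIN_CLIP_S = 15.0  # lower bound on clip length quoted in the summary
MAX_CLIP_S = 45.0  # upper bound on clip length quoted in the summary


@dataclass(frozen=True)
class Clip:
    trigger: str    # hypothetical trigger name, e.g. "hard_brake"
    start_s: float  # clip start, seconds from the beginning of the drive log
    end_s: float    # clip end, seconds from the beginning of the drive log

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def cut_clip(trigger: str, event_s: float, before_s: float, after_s: float) -> Clip:
    """Cut a window around a trigger event, clamped to 15-45 seconds."""
    requested = before_s + after_s
    length = min(max(requested, MIN_CLIP_S), MAX_CLIP_S)
    # Keep the requested before/after proportion when the length is clamped.
    frac_before = before_s / requested if requested > 0 else 0.5
    start = max(0.0, event_s - length * frac_before)
    return Clip(trigger, start, start + length)


# A short request is stretched to 15 s; a long one is trimmed to 45 s.
short_clip = cut_clip("hard_brake", 100.0, 5.0, 5.0)
long_clip = cut_clip("cut_in", 100.0, 40.0, 20.0)
```

In a real data closed loop the trigger definitions and context windows would come from the fleet's own rules; the point here is only that each trigger yields a bounded clip suitable for a training set.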
Horizon HSD Really Does Deserve Li Auto's Attention
理想TOP2· 2025-10-27 13:50
Core Viewpoint
- The article compares Horizon's HSD technology with Li Auto's VLA system, highlighting the strengths and weaknesses of both in autonomous driving capability and user experience [1][2].
Group 1: Performance Comparison
- As of October 2025, Horizon's HSD engineering vehicle demonstrated better assisted driving capabilities than Li Auto's L7 with VLA, although mass production vehicles may not perform as well as engineering prototypes [1].
- During a 1.5-hour test drive around West Lake in Hangzhou, the HSD vehicle was highly comfortable and smooth and needed no manual speed adjustments, in contrast to the frequent adjustments required with Li Auto's VLA [2].
- Feedback from multiple testers indicated that model A with Horizon's HSD performed well while model B was considered average, a gap attributed to differences in chip computing power and in how closely each carmaker collaborated with Horizon [2].
Group 2: Limitations and Challenges
- Horizon's team acknowledged that the HSD system performs poorly in extreme weather, non-standard scenarios, and complex situations, so it is not yet fully reliable for autonomous driving [3].
- The team also noted that the transition from assisted driving to full autonomy can sometimes produce a subpar experience, particularly in scenarios that require navigation adjustments [3].
- The integration of the HUD and vehicle interfaces is crucial to the overall driving experience, and some counterintuitive design choices could hurt user satisfaction [3].
Group 3: Community Engagement
- Readers are invited to join deeper discussions of Li Auto's operational status and long-term fundamentals, with a focus on practical business insights rather than technical topics [4].
How Does Li Auto Think About Breaking Down Departmental Walls?
理想TOP2· 2025-10-26 10:06
Core Viewpoint
- The article discusses how collaboration between departments within the company has evolved, from isolated data handling to a shared data language and co-creation, ultimately leading to a more efficient and integrated approach to problem-solving and product development [4][5][10].
Group 1: Challenges of Departmental Silos
- Departmental silos hinder effective communication and collaboration, leading to conflicting objectives and the lack of a unified approach to problem-solving [3].
- Dividing responsibilities among departments improves specialization but fragments the view of each issue, making it hard to establish a cross-departmental mechanism for addressing problems [3].
Group 2: Initial Collaboration and Data Sharing
- Collaboration between Li Auto's Lianshan team and the thermal management team began with tackling poor cloud signal data quality, which led to a common analytical framework [4].
- Moving from a "data request-result" model to a shared data language allowed both teams to hold meaningful discussions using the same data and metrics [4][5].
Group 3: Evolution of Collaborative Methods
- The collaboration evolved from merely sharing data to co-creating solutions, focusing on common goals and building trust through transparency [5][6].
- Automated testing processes eased the burden on engineers working under extreme conditions, showing the practical benefits of this collaborative approach [5].
Group 4: Productization of Collaboration
- Over three years, the company extended its collaborative model to supply chain and production line processes, developing AI-driven solutions that intercept quality issues at the source [9].
- A standardized, replicable methodology for data science projects has turned the collaboration into a sustainable and scalable productized approach [10].
Group 5: Achievements and Future Aspirations
- The company has accumulated significant results, including 83 data science projects, 3,545 warning models, and extensive monitoring capabilities across production lines and suppliers [10].
- The goal is to promote this collaborative model further, enabling seamless cooperation among people, AI, and departments to address real business challenges [11].
VLA/World Model/WA/End-to-End Is a Divergence in Messaging, Not in Technical Route
理想TOP2· 2025-10-25 05:21
Core Viewpoints
- Many people are unaware that there is no universally accepted definition of VLA, world model, or end-to-end [1].
- Leading autonomous driving companies have far more in common in their exploration of autonomous driving than the differences portrayed online suggest; the core divergence is in promotion, not in technical route [1][2].
- Language plays a significant role in autonomous driving, particularly in long reasoning, aligning with user values through interaction, and understanding the world [1].
- Those who believe that predicting the next token is more than just a probability distribution are more likely to accept that language can understand the world [1].
Group 1: VLA/World Model/End-to-End
- VLA, world model, and end-to-end approaches all need the ability to generate road video data that looks real, all take visual information as input, and all ultimately control vehicle actions [2].
- The distinction lies in whether language is involved, how deeply it participates, and what architectural form it takes; future language-related tokens could be an LLM's text tokens or photon tokens [2].
- The narrative that VLA and world models represent different technical routes is misleading, since both need to generate a world model and understand the physical world [4].
Group 2: End-to-End Definitions
- The definition of end-to-end is often debated, with some holding that it requires a core framework in which input and output are clearly defined [5].
- Tesla's approach takes visual input and outputs a trajectory rather than direct control signals, which raises questions about what its definition of end-to-end really is [5][6].
- Outputting precise trajectories is preferred over direct control signals, suggesting it is the more effective design [6].
Group 3: Tesla's Approach and Future Directions
- Tesla's history and style suggest that its definition of end-to-end is not universally accepted or exclusive [7].
- Long-term predictions indicate that AI model inputs and outputs may predominantly involve photons, which could significantly reduce computational loads [10].
- Li Auto's VLA model is defined as having visual or multimodal input, language participation, and an output that ultimately directs actions in a broad sense [11].
Group 4: Understanding Language and AI Potential
- Views on LLMs differ fundamentally, particularly over what predicting the next token means [12].
- Those who see predicting the next token as more than mere statistics are more inclined to recognize the potential of LLMs and AI [12][19].
- Predicting the next token well implies an understanding of the underlying reality that generates the token, which is a deeper question than it appears [18].
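For readers unfamiliar with what "predicting the next token" means mechanically, here is a minimal sketch of the final step of a language model: turning raw scores (logits) over candidate tokens into a probability distribution with a softmax. The tokens and scores are invented for illustration, and the sketch says nothing about whether such prediction amounts to understanding, which is the open question the article raises.

```python
import math
from typing import Dict


def next_token_distribution(logits: Dict[str, float]) -> Dict[str, float]:
    """Softmax over candidate tokens, shifted by the max logit for stability."""
    peak = max(logits.values())
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


# Invented scores for the token following "The light ahead turned".
example_logits = {"red": 3.0, "green": 2.0, "left": 0.5}
probabilities = next_token_distribution(example_logits)
most_likely = max(probabilities, key=probabilities.get)
```

The debate summarized above is about what a model must have internalized to make these scores accurate, not about this normalization step itself.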