How Are Diffusion Models Reshaping Autonomous Driving Trajectory Planning?
自动驾驶之心· 2025-09-11 23:33
Core Viewpoint - The article discusses the significance of Diffusion Models and their applications in various fields, particularly autonomous driving, emphasizing their ability to denoise and generate data effectively [1][2][11].
Summary by Sections
Introduction to Diffusion Models
- Diffusion Models are generative models built around denoising; they learn the data distribution through a forward diffusion process and a reverse generation process [2][4].
- The concept is illustrated with the analogy of ink dispersing in water: the forward process corresponds to the ink spreading out, and the model aims to recover the original data from the resulting noise [2].
Applications in Autonomous Driving
- In autonomous driving, Diffusion Models are used for data generation, scene prediction, perception enhancement, and path planning [11].
- They can handle both continuous and discrete noise, which makes them versatile across decision-making tasks [11].
Course Offering
- The article promotes a new course on end-to-end and VLA (Vision-Language-Action) algorithms in autonomous driving, developed in collaboration with top industry experts [14][17].
- The course aims to address the difficulty learners face in keeping up with rapid technological change and fragmented knowledge in the field [15][18].
Course Structure
- The course is organized into several chapters covering the history of end-to-end algorithms, background knowledge on VLA, and detailed discussions of methodologies including one-stage and two-stage end-to-end approaches [22][23][24].
- Special emphasis is placed on the use of Diffusion Models in multi-modal trajectory prediction, reflecting their growing importance in the industry [28].
Learning Outcomes
- Participants are expected to reach a level of understanding equivalent to one year of experience as an end-to-end autonomous driving algorithm engineer, mastering key frameworks and technologies [38][39].
- The course includes practical components that connect theory with application [19][36].
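
As a rough illustration of the forward diffusion process mentioned in the summary above, the sketch below adds Gaussian noise to a planned trajectory. The summary does not specify a formulation, so this assumes the standard DDPM closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with a linear noise schedule; the function names, the schedule values, and the ten-waypoint trajectory are hypothetical and chosen only for illustration.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_0..beta_{T-1}; the linear schedule is an assumption."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alpha(betas):
    """alpha_bar[t] = product of (1 - beta_s) for all s <= t."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def forward_diffuse(trajectory, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar[t])
    noise = math.sqrt(1.0 - alpha_bar[t])
    return [
        (signal * x + noise * rng.gauss(0.0, 1.0),
         signal * y + noise * rng.gauss(0.0, 1.0))
        for x, y in trajectory
    ]

rng = random.Random(0)
alpha_bar = cumulative_alpha(linear_beta_schedule(1000))

# Hypothetical clean plan: 10 waypoints (x forward, y left) in metres.
clean = [(2.0 * i, 0.0) for i in range(10)]

slightly_noisy = forward_diffuse(clean, 10, alpha_bar, rng)   # still close to the plan
almost_noise = forward_diffuse(clean, 999, alpha_bar, rng)    # nearly pure Gaussian noise
```

At a small step index the waypoints stay close to the clean plan, while at the last step almost no signal is left; a trained model would learn to undo this corruption one step at a time.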
2025: Who Are the Heads of Autonomous Driving at China's Intelligent-Driving Players?
自动驾驶之心· 2025-09-10 23:33
Core Viewpoint - The autonomous driving industry is undergoing a significant technological shift toward "end-to-end" solutions, driven by Tesla's lead and by advances in large-model technology. This shift is prompting domestic automakers to increase investment and restructure their teams, making "end-to-end" a mainstream production solution by 2024 [1].
Group 1: Key Figures in Autonomous Driving
- The article highlights key figures in China's autonomous driving sector, focusing on those who directly influence technology routes and team growth [1].
- Notable leaders include:
  - **Lang Xianpeng** from Li Auto, who has led advances in assisted driving technology, including the launch of full-scene NOA and the no-map NOA feature [5].
  - **Ye Hangjun** from Xiaomi, who has been pivotal in developing Xiaomi's end-to-end driving system and has overseen multiple cutting-edge projects [7][9].
  - **Ren Shaoqing** from NIO, who has contributed significantly to the development of urban NOA and emphasizes the importance of data in smart driving [11].
  - **Li Liyun** from XPeng, who has taken over leadership of smart driving and focuses on a pure-vision solution [14][15].
  - **Yang Dongsheng** from BYD, who led the development of the DM-i hybrid system and is pushing to integrate advanced driving systems across all BYD models [17][20].
  - **Su Jing** from Horizon Robotics, who is leading the development of end-to-end HSD solutions [21][22].
  - **Cao Xudong** from Momenta, who has developed a data-driven strategy for autonomous driving and is focusing on end-to-end large models [25][26].
Group 2: Technological Trends and Innovations
- The article discusses the technological evolution of the sector, emphasizing the transition to end-to-end architectures and the emergence of large models, world models, and VLM solutions [1][53].
- Companies are adopting various strategies:
  - Li Auto is focusing on E2E and VLA systems [5].
  - Xiaomi is investing heavily in end-to-end technology, with significant output [9].
  - NIO is pursuing a world behavior model approach [11].
  - XPeng is committed to a pure-vision strategy [15].
  - BYD is integrating advanced driving systems across its entire lineup [20].
  - Momenta is pursuing a dual strategy of L2 and L4 development to strengthen its market position [26].
Group 3: Future Outlook
- The article concludes that these leaders are crucial in shaping the future of smart driving in China, with a shared goal of creating systems that are safe, reliable, and tailored to local conditions [51][53].
- Ongoing competition and collaboration among them will drive the industry toward more intelligent and user-friendly solutions [51].
Is There a "Pure-Blood VLA" in Autonomous Driving? A Rundown of What VLMs Can Actually Do in Autonomous Driving
自动驾驶之心· 2025-09-06 16:05
Core Viewpoint - The article discusses the challenges and methodologies involved in building datasets for autonomous driving, focusing on the VLA (Vision-Language-Action) model and its applications in trajectory prediction and scene understanding [1].
Dataset Handling
- Datasets differ in their number of cameras; the VLM handles this by processing a variable number of image tokens, without needing an explicit camera count [2]
- Output trajectories are expressed in the vehicle's current coordinate system, with predictions given as relative (x, y) values rather than image coordinates; mapping them onto images requires additional camera parameters [6]
- The VLA model generally adheres to the required output format, but occasional discrepancies occur and are corrected with Python code that normalizes the format [8][9]
Trajectory Prediction
- VLA trajectory prediction differs from traditional methods by incorporating scene understanding through QA training, which improves the model's ability to predict the trajectories of dynamic objects such as vehicles and pedestrians [11]
- Dataset construction faced challenges such as data quality issues and inconsistent coordinate formats, which were addressed through rigorous data cleaning and standardization [14][15]
Data Alignment and Structure
- Data alignment is achieved by converting the various dataset formats into a unified relative displacement in the vehicle's coordinate system, organized in a QA format that covers trajectory prediction and dynamic object forecasting [18]
- The input consists of images and trajectory points from the previous 1.5 seconds, used to predict trajectory points over the next 5 seconds, adhering to the SANA standard [20]
Community and Resources
- The "Autonomous Driving Heart Knowledge Planet" community focuses on cutting-edge autonomous driving technologies, covering nearly 40 technical directions and fostering collaboration between industry and academia [22][24]
- The community offers a comprehensive learning platform, including video tutorials, Q&A sessions, and job opportunities in the autonomous driving sector [28][29]
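
The data-alignment and format-normalization steps described in this summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's actual pipeline: the function names `to_ego_frame` and `parse_waypoints`, the axis convention (x forward, y to the left, yaw counter-clockwise from the global x-axis), and the sample answer string are all hypothetical.

```python
import math
import re

def to_ego_frame(points, ego_xy, ego_yaw):
    """Express global (x, y) points relative to the ego pose.

    ego_yaw is the heading in radians, counter-clockwise from the global
    x-axis; the result uses x forward and y to the left (assumed convention).
    """
    cos_yaw, sin_yaw = math.cos(ego_yaw), math.sin(ego_yaw)
    result = []
    for gx, gy in points:
        dx, dy = gx - ego_xy[0], gy - ego_xy[1]
        # Rotate the offset by -yaw to move from the global to the ego frame.
        result.append((cos_yaw * dx + sin_yaw * dy, -sin_yaw * dx + cos_yaw * dy))
    return result

NUMBER = r"[-+]?\d+(?:\.\d+)?"
PAIR = re.compile(rf"({NUMBER})\s*,\s*({NUMBER})")

def parse_waypoints(answer):
    """Extract (x, y) pairs from model text, tolerating bracket and spacing variations."""
    return [(float(x), float(y)) for x, y in PAIR.findall(answer)]

# Ego vehicle at (10, 5), heading along the global +y axis.
future_global = [(10.0, 7.0), (9.0, 9.0)]
future_ego = to_ego_frame(future_global, ego_xy=(10.0, 5.0), ego_yaw=math.pi / 2)

# Hypothetical model answer with inconsistent brackets and spacing.
waypoints = parse_waypoints("Trajectory: [(2.0, 0.0), (4.00,1.0 ) ; [6.1 , -0.3]")
```

Here the two global points become roughly (2, 0) and (4, 1) in the ego frame, and the irregular answer string is normalized into a clean list of float pairs. Projecting such ego-frame points onto an image would additionally need camera intrinsics and extrinsics, which are outside this sketch.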
On Diffusion Models: From Image Generation to End-to-End Trajectory Planning
自动驾驶之心· 2025-09-06 11:59
Core Viewpoint - The article discusses the significance of Diffusion Models and their applications in various fields, particularly autonomous driving, emphasizing their ability to denoise and generate data effectively [1][2][11].
Summary by Sections
Introduction to Diffusion Models
- Diffusion Models are generative models built around denoising, where the noise follows a specific distribution. The model learns to recover the original data from noise through a forward diffusion process and a reverse generation process [1][2].
Applications in Autonomous Driving
- In autonomous driving, Diffusion Models are used for data generation, scene prediction, perception enhancement, and path planning. They can handle both continuous and discrete noise, which makes them versatile across decision-making tasks [11].
Course Overview
- The article promotes a new course titled "End-to-End and VLA Autonomous Driving," developed in collaboration with top algorithm experts. The course aims to provide in-depth knowledge of end-to-end algorithms and VLA technology [15][22].
Course Structure
- The course is organized into several chapters, covering topics such as:
  - A comprehensive understanding of end-to-end autonomous driving [18]
  - In-depth background knowledge, including large language models, BEV perception, and Diffusion Model theory [21][28]
  - Two-stage and one-stage end-to-end methods, including the latest advances in the field [29][36]
Learning Outcomes
- Participants are expected to gain a solid understanding of the end-to-end technology framework, including one-stage and two-stage methods, world models, and Diffusion Models. The course also aims to deepen knowledge of key technologies such as BEV perception and reinforcement learning [41][43].
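
To make the reverse generation process from this summary concrete, here is a toy sketch of iterative denoising. It assumes the standard DDPM posterior-mean update, which the summary itself does not spell out, and it replaces the trained noise-prediction network with a hypothetical `oracle_noise` function that already knows the clean target; the sampling-noise term is dropped so the run is deterministic. All names and values are illustrative.

```python
import math
import random

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bar) for an assumed linear noise schedule."""
    step = (beta_end - beta_start) / (num_steps - 1)
    betas = [beta_start + i * step for i in range(num_steps)]
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return betas, alpha_bar

def oracle_noise(x_t, x0, abar_t):
    """Stand-in for a trained network eps_theta(x_t, t): the exact noise in x_t."""
    return (x_t - math.sqrt(abar_t) * x0) / math.sqrt(1.0 - abar_t)

def reverse_denoise(x_T, x0, betas, alpha_bar):
    """Apply the DDPM posterior-mean update from t = T-1 down to t = 0.

    The stochastic term is omitted, so this is a deterministic illustration.
    """
    x = x_T
    for t in reversed(range(len(betas))):
        eps = oracle_noise(x, x0, alpha_bar[t])
        x = (x - betas[t] / math.sqrt(1.0 - alpha_bar[t]) * eps) / math.sqrt(1.0 - betas[t])
    return x

rng = random.Random(0)
betas, alpha_bar = make_schedule(1000)

# Hypothetical target: lateral offsets (metres) of 5 waypoints in a lane change.
target = [0.0, 0.4, 1.2, 2.4, 3.5]
start = [rng.gauss(0.0, 1.0) for _ in target]  # one standard-normal sample per waypoint
recovered = [reverse_denoise(n, x0, betas, alpha_bar) for n, x0 in zip(start, target)]
```

With a perfect noise estimate the loop walks each random starting value back to its target offset. In a real planner the oracle would be a learned network conditioned on the scene, and its errors are what make sampling non-trivial.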
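Jinqiu Capital (锦秋基金) Portfolio Company DiGua Robotics (地瓜机器人): From VGGT to the Data Closed Loop, Breakthroughs and Explorations in Embodied Intelligence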
锦秋集· 2025-09-03 04:30
Core Viewpoint - The article discusses the transition from autonomous driving technology to robotics, highlighting the challenges and opportunities in the robotics industry, particularly in embodied intelligence, and the potential impact of new models such as VGGT on 3D perception and robotics applications [5][7][60].
Group 1: Industry Trends
- The robotics industry is at a pivotal moment, with significant technological advances and a shift toward embodied intelligence, which is seen as the next frontier for AI [5][7].
- The article emphasizes the differences between the two sectors: autonomous driving has reached a degree of standardization, while robotics is still exploring diverse hardware forms and algorithms [10][14].
- The VGGT model is introduced as a potential game-changer for 3D geometry, akin to how Transformers revolutionized natural language processing, indicating a shift toward unified solutions for 3D perception [6][67].
Group 2: Technological Migration
- The article highlights the migration of technology from autonomous driving to robotics, with companies such as DiGua Robotics drawing on experience from the autonomous driving sector to improve their robotics platforms [14][18].
- It discusses the challenge of hardware diversity in robotics, where the lack of standardization complicates data accumulation and algorithm development [10][14].
- It outlines the evolution of autonomous driving algorithms from modular approaches to end-to-end systems, which are now being adapted for robotics applications [25][27].
Group 3: VGGT and Its Implications
- VGGT is presented as a foundational model that could redefine 3D vision technology, offering a new paradigm for solving traditional geometric problems through large-scale data and models [55][67].
- The article discusses the potential for VGGT to replace expensive depth cameras with cheaper RGB cameras, which could significantly reduce the cost of robotics systems [64][66].
- It concludes that VGGT represents a significant advance in 3D vision, marking the entry of large models into geometric processing [67][68].
Top Tier 1 Supplier Bosch's End-to-End System Finally Reaches Mass Production, and It Is One-Stage!
自动驾驶之心· 2025-08-30 16:03
Core Viewpoint - The article discusses advances in autonomous driving technology, focusing on WePilot AiDrive, a new end-to-end ADAS solution developed by WeRide that aims to improve the driving experience and safety through advanced AI capabilities [5][9][10].
Group 1: WeRide's New Technology
- WeRide has launched a new end-to-end ADAS solution named WePilot AiDrive, which is set to enter mass production within the year [5].
- The system takes sensor data as input and outputs the vehicle trajectory from a single model, improving the efficiency and responsiveness of autonomous driving [10][24].
- The new system shows improved performance in complex driving scenarios, such as navigating urban villages and recognizing pedestrians in challenging lighting conditions [12][14][24].
Group 2: Comparison with Previous Systems
- The previous two-stage approach used separate perception and control models, which often led to information loss and a limited understanding of the driving environment [25][30].
- The new one-stage model learns the relationship between input data and output trajectories directly, significantly improving system performance [33].
- The move from a rule-based approach to a more integrated model aims to overcome the limitations of earlier systems, which struggled with generalization and adaptability [32][35].
Group 3: Market Implications
- The collaboration between WeRide and Bosch aims to make advanced driving capabilities available across vehicle price segments, not just in high-end models [41][44].
- Currently, less than 20% of vehicles in the Chinese market are equipped with advanced intelligent driving features, indicating significant growth potential for WeRide's technology [42].
- The goal is to push L2+ capabilities beyond the "value inflection point," making advanced driving technology more mainstream [44].
Huawei Firmly Rejects the VLA Route: Is WA the Ultimate Solution for Autonomous Driving?
自动驾驶之心· 2025-08-29 16:03
Core Viewpoint - Huawei's automotive business has achieved significant milestones, including 1 million vehicles equipped with its driving technology and over 100 million lidar units shipped, showcasing its long-term strategic vision in the automotive sector [3][4].
Group 1: Achievements and Strategy
- As of July, 1 million vehicles have been equipped with Huawei's QianKun intelligent driving system, and cumulative assisted-driving mileage has reached 4 billion kilometers [3].
- Huawei's automotive business has been investing since 2014, focusing on R&D rather than immediate commercialization, which has led to its current profitability [4][5].
- The company has launched 28 models in collaboration with various brands, indicating a strong market presence [3].
Group 2: Technology Approach
- Huawei prefers the World Action (WA) model over the Vision-Language-Action (VLA) model for achieving true autonomous driving, believing WA is the more direct and effective approach [5][13].
- The WA model processes inputs such as vision, sound, and touch directly, bypassing the need to convert data into language [5][14].
- Huawei has developed the WEWA model based on the WA architecture, which will be deployed in ADS 4.0 [6].
Group 3: Business Model and Pricing
- Huawei's CEO emphasizes that there is no such thing as a free service in the automotive industry; costs are often hidden or transferred [7][17].
- The company believes charging for assisted driving systems is justified by the ongoing cost of updates and maintenance throughout the vehicle's lifecycle [8][18].
- Huawei's lifecycle management approach ensures that users receive continuous upgrades, improving their experience over time [18].
Group 4: Future Plans
- Huawei aims to achieve L3 capabilities for highway driving and L4 pilot capabilities in urban areas by 2026, with large-scale commercial use planned for 2028 [11].
- The company is also working on transforming the intelligent cockpit into a "digital nanny," integrating AI to improve the user experience [11].
Group 5: Safety and Technology Enhancements
- Huawei's expanded sensor configurations, such as additional lidar units, are driven by a commitment to safety rather than by a desire to raise product prices [19][20].
- The company focuses on improving the precision of its systems to prevent accidents and improve user safety across driving scenarios [20][22].
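Auto Show Season · Industry Voices | VLA Set to Ship in Vehicles in September; He Xiaopeng on the Market-Cap Gap with Tesla: Intelligence Capabilities Not Yet Fully Reflected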
Mei Ri Jing Ji Xin Wen· 2025-08-28 15:18
Core Viewpoint - The launch of the new XPeng P7 is seen as a strategic move to regain a top position in the market for electric sedans priced above 200,000 yuan, with a focus on advanced technology and design [1][2].
Group 1: Product Launch and Market Positioning
- The new XPeng P7 was launched in four Ultra versions priced between 219,800 and 301,800 yuan, aiming to compete with models such as the Xiaomi SU7 and Tesla Model 3 [1][5].
- The sales target for the new P7 is a top-three position in the competitive 200,000 to 250,000 yuan electric sedan market, which has seen a 60% year-on-year sales increase [2][5].
- The P7 received over 10,000 pre-orders within 7 minutes of launch, indicating strong market interest [5].
Group 2: Production and Profitability
- XPeng is ramping up production capacity for the P7, with a focus on quality control and a rapid production pace, aiming for monthly sales of around 4,200 units to secure a top market position [5][6].
- The P7 is expected to raise the company's overall gross margin, with analysts suggesting it could act as a profitability accelerator for XPeng [5][6].
Group 3: Technological Advancements
- The P7 features three Turing AI chips that strengthen its intelligent driving capabilities, with a significant update expected by the end of the year [6][7].
- The company has invested nearly 5 billion yuan in VLA (Vision-Language-Action) technology, indicating a strong commitment to AI development in the automotive sector [10].
Group 4: Future Outlook
- XPeng's CEO expects the automotive industry to see higher profit margins in the AI era, with a potential shift from small to significant profits as technology and production capabilities improve [11].
- Despite current challenges, XPeng aims to close the valuation gap with leading competitors such as Tesla, and expects improved market recognition in the near future [11].
The Embodied Intelligence Heart Technology Exchange Group Has Been Established!
具身智能之心· 2025-08-28 08:36
Group 1
- The newly established Embodied Intelligence Heart Technology Exchange Group covers a range of advanced technologies, including VLA, VLN, teleoperation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1]
- Interested individuals can add the assistant on WeChat (AIDriver005) to join the community [2]
- To speed up admission to the group, include a note with your institution/school, name, and research direction [3]
自动驾驶之心 (Autonomous Driving Heart) Is Recruiting Business Partners! Model Deployment / VLA / End-to-End Tracks
自动驾驶之心· 2025-08-28 08:17
Core Viewpoint - The article announces the recruitment of business partners in the autonomous driving sector, highlighting the need for expertise in various advanced technologies and offering attractive incentives to candidates [2][3][5].
Group 1: Recruitment Details
- The company plans to recruit 10 outstanding partners for autonomous driving course development, research paper guidance, and hardware development [2].
- Candidates with expertise in large models, multimodal models, diffusion models, and other advanced technologies are particularly welcome [3].
- Preferred qualifications include a master's degree or higher from a university ranked in the QS top 200, with priority given to candidates with significant conference contributions [4].
Group 2: Incentives and Opportunities
- The company offers shared resources related to autonomous driving, including job recommendations, PhD opportunities, and study-abroad guidance [5].
- Attractive cash incentives are part of the compensation package for successful candidates [5].
- Opportunities to collaborate on entrepreneurial projects are also available [5].