自动驾驶之心
BigBite's Analysis: Tesla FSD Is an End-to-End Large Model
自动驾驶之心· 2026-01-27 09:40
Core Viewpoint - Tesla's Full Self-Driving (FSD) is fundamentally a large model that uses one large neural network architecture to achieve end-to-end driving, contrary to claims that it relies on numerous smaller models for various tasks [7][17].

Summary by Sections

FSD Model Architecture
- Tesla FSD is characterized as a large model, as confirmed by Ashok at ICCV: a massive neural network performs the computation from Photon In to Control Out [7][14].
- The architecture includes numerous model parameter files, which are primarily small task heads rather than independent models, indicating a more complex integration than previously assumed [6][10].

Parameter File Insights
- The discovery of hundreds of neural network parameter files led to skepticism about FSD being a large model; however, these files are largely associated with smaller tasks rather than the core end-to-end model [8][10].
- Parameter sizes grow significantly from HW3 to HW4, with HW4's B core reaching 7.5GB, indicating a substantial increase in model complexity and capability [8][12].

Memory and Bandwidth Considerations
- HW3's limited memory bandwidth of 68GB/s restricts the model size to approximately 1.8 billion parameters, while HW4's bandwidth of 384GB/s allows a theoretical capacity of around 10 billion parameters [12][13].
- A mixture-of-experts (MoE) architecture enables Tesla to optimize memory usage and enhance model performance without exceeding bandwidth limitations [13][16].

Technological Advancements
- The assertion that Tesla's technology is outdated is challenged by the argument that significant engineering innovations contribute to advancements, similar to the development of reusable rockets [17].
- The integration of advanced engineering practices and innovative architectures positions Tesla as a leader in the autonomous driving sector, countering claims of technological inferiority [17].
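The bandwidth figures in the summary can be sanity-checked with back-of-envelope arithmetic. The sketch below is not taken from the article: it assumes that every weight is read from memory once per inference pass, that weights are stored at 1 byte each (INT8), and that the model runs 36 passes per second. Under those assumptions the parameter ceiling is bandwidth divided by (bytes per parameter × passes per second). The MoE helper is likewise hypothetical: it treats all parameters as equal-size experts and ignores shared layers, and the 8-expert, 2-active configuration is purely illustrative.

```python
GB = 1e9  # decimal gigabyte, matching the GB/s figures quoted in the summary


def max_params(bandwidth_bytes_per_s: float,
               bytes_per_param: float = 1.0,    # assumption: INT8 weights
               passes_per_s: float = 36.0) -> float:  # assumption: 36 passes/s
    """Upper bound on the weights that can be streamed once per inference pass."""
    return bandwidth_bytes_per_s / (bytes_per_param * passes_per_s)


def moe_total_params(active_params: float,
                     num_experts: int,
                     experts_per_pass: int) -> float:
    """Total parameters when only `experts_per_pass` of `num_experts`
    equal-size experts are read per pass (hypothetical simplification)."""
    return active_params * num_experts / experts_per_pass


hw3 = max_params(68 * GB)
hw4 = max_params(384 * GB)
print(f"HW3 ceiling: {hw3 / 1e9:.2f}B parameters")  # 1.89B
print(f"HW4 ceiling: {hw4 / 1e9:.2f}B parameters")  # 10.67B
print(f"HW4, 8 experts, 2 active: {moe_total_params(hw4, 8, 2) / 1e9:.2f}B")  # 42.67B
```

With these assumed values the ceilings land near the roughly 1.8 billion and 10 billion parameters quoted above, and the last line shows why sparse expert activation lets total model size exceed the per-pass bandwidth budget.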
An Analysis of QCraft's L2/L4 Intelligent Driving Solutions: One-Stage, VLA, and World Models
自动驾驶之心· 2026-01-26 07:16
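On the 21st, QCraft's first urban NOA solution built on a single Journey 6M (J6M) officially shipped on the Li Auto L-series Smart Refresh Edition. On the 23rd, QCraft held a launch event; here is the technical part of it. Achieving one-stage end-to-end plus reinforcement learning on a single J6M is, frankly, impressive. Let's break down the overall network architecture together: The part above is a common OneModel architecture; what differs is the following: Safe RL (which adds rule-based checks) is then used to further optimize the ego vehicle's trajectory. Overall, the architecture is not complicated; the hard part is making it run on the J6M's 128 TOPS of compute. People immediately asked Zhu Ge (柱哥) whether this was real. DiffusionDrive and Flow Matching are algorithms that several companies have already validated as ready for mass production. Two more algorithms are worth recommending: Diffusion Planner and Flow Planner. Flow Planner is an improved version of Diffusion Planner and comes from Professor Zhan Xianyuan's team at Tsinghua AIR. QCraft also released several demos of difficult scenarios. The figure below shows the L2 system on a real vehicle: it handles severely misaligned roads and unprotected left turns at complex intersections well. Severely misaligned roads are a real test of static-perception fundamentals, not only roads/lanes ...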
More and More Companies Are Focusing on End-to-End...
自动驾驶之心· 2026-01-25 10:07
Core Viewpoint - The article emphasizes the shift in the autonomous driving industry towards end-to-end solutions, with both large and small companies accelerating their transformation to adopt these models [2][4].

Group 1: Data Requirements and Model Development
- Companies are exploring the data requirements for developing effective one-stage and two-stage models: about 2 million clips are sufficient for a decent two-stage model, while one-stage models require around 10 million clips [2][4].
- The necessity of simulation data (SD) for end-to-end models and potential pitfalls such as navigation failures are highlighted [4].

Group 2: Community and Knowledge Sharing
- The "Autonomous Driving Heart Knowledge Planet" community has been established to provide a comprehensive platform for learning and sharing knowledge in the autonomous driving field, currently hosting nearly 4,500 members with a goal of reaching 10,000 in two years [5][17].
- The community offers a variety of resources, including videos, articles, learning paths, and Q&A sessions, aimed at reducing the trial-and-error costs for newcomers [5][9].

Group 3: Technical Routes and Learning Resources
- The community has compiled over 40 technical routes covering various aspects of autonomous driving, including VLA benchmarks, multi-modal models, and data annotation practices [7][18].
- Regular discussions with industry experts are held to explore trends, technology directions, and production challenges in autonomous driving [7][9].

Group 4: Job Opportunities and Career Development
- The community facilitates job opportunities by connecting members with companies in the autonomous driving sector, providing insights into open positions and career paths [11][22].
- Members can receive guidance on research directions and job applications, enhancing their career prospects in the industry [11][91].
A Survey of Job Demand for GS Reconstruction in the Autonomous Driving Industry
自动驾驶之心· 2026-01-24 02:55
Core Viewpoint - The article discusses the growing demand for algorithm teams in the field of 3DGS (3D Gaussian Splatting) for autonomous driving, highlighting the need for skilled professionals and the development of a comprehensive training course to address this gap [2][3].

Group 1: Industry Demand and Job Roles
- Companies are looking to invest headcount (HC) in testing and closed-loop simulation in the autonomous driving sector, indicating a clear need for algorithm teams of 5 to 20 members to support optimization in closed-loop simulation [2][3].
- The demand for cloud data production is also noted, particularly for static road surface reconstruction, which requires a minimum team size of around 10 people to meet basic functional needs [3].

Group 2: 3DGS Development and Learning Path
- The article outlines a structured learning path for 3DGS, starting from static reconstruction, moving to dynamic reconstruction and surface reconstruction, and culminating in mixed scene reconstruction and feed-forward GS [3].
- A course titled "3DGS Theory and Algorithm Practical Tutorial" has been developed to provide a detailed roadmap for understanding 3DGS technology, covering principles and practical applications [3].

Group 3: Course Structure and Content
- The course consists of six chapters, covering topics such as background knowledge, principles and algorithms of 3DGS, technical explanations for autonomous driving, important research directions, and feed-forward 3DGS [6][8][9][10][11][12].
- Each chapter is designed to build upon the previous one, ensuring a comprehensive understanding of 3DGS and its applications in the industry [8][9][10][11][12].

Group 4: Target Audience and Prerequisites
- The course is aimed at individuals with a background in computer graphics, visual reconstruction, and related technologies, as well as those familiar with Python and PyTorch [17].
- Participants are expected to have a foundational understanding of probability theory and linear algebra, which are essential for mastering the 3DGS technology stack [17].
NVIDIA's Automotive Business Playbook
自动驾驶之心· 2026-01-24 02:55
Core Viewpoint - NVIDIA is transitioning from a hardware supplier to a comprehensive provider of autonomous driving solutions, focusing on a full-stack approach that includes cloud training, simulation, and in-vehicle inference capabilities [4][7].

Group 1: Three Pillars of Full-Stack Solutions
- NVIDIA's automotive strategy is built on three main components: DGX for AI model training, OVX for simulation, and AGX for in-vehicle inference [8][20].
- DGX serves as an AI model training factory, utilizing a supercomputing cluster of thousands of GPUs to process vast amounts of driving data [11][12].
- OVX creates a virtual world that mirrors real-world conditions, allowing for extensive testing of autonomous driving algorithms without the risks and costs associated with real-world testing [13][14][16].
- AGX represents NVIDIA's well-known in-vehicle computing chips, which have evolved to provide significantly higher processing power and have become standard in various flagship models [18][20].

Group 2: Business Model Evolution
- NVIDIA's revenue model has shifted from solely selling hardware to offering engineering services, which include deep involvement in automakers' production projects [21][23].
- The company charges a one-time engineering service fee, akin to a "coaching fee," to assist automakers in optimizing their algorithms on NVIDIA's platform [24][25].
- This service model fosters a win-win situation, enhancing automakers' capabilities while providing NVIDIA with valuable feedback for continuous product improvement [25].

Group 3: Open Source Strategy
- In early 2026, NVIDIA announced the open-sourcing of its Alpamayo series, which includes a large-scale reasoning model and a comprehensive simulation framework [28][29][30].
- This strategic move aims to lower industry barriers, expand the ecosystem, and establish NVIDIA as a leader in defining the next generation of autonomous driving technology [34][35].
- The open-source approach also serves to mitigate geopolitical risks by transforming core technologies into global public assets [34].

Group 4: Demand from the Chinese Market
- NVIDIA's accelerated pace in the automotive sector is largely driven by demand from the Chinese market, which is ahead of overseas automakers by two to three years in smart vehicle development [38][40].
- The rapid iteration and high expectations for functionality from Chinese automakers have prompted NVIDIA to develop specialized tools like TensorRT-LLM for Auto in record time [38][40].

Group 5: Competitive Landscape
- NVIDIA maintains confidence against competitors by emphasizing that the ultimate competition in smart driving lies in systemic engineering capabilities and a continuously evolving ecosystem [41][42].
- The company has built a comprehensive stack that includes chips, safety certifications, operating systems, middleware, and development tools, creating a high barrier to entry for competitors [42][44].
People with This Kind of Autonomous Driving Experience Are in High Demand in Embodied AI
自动驾驶之心· 2026-01-23 06:28
Core Insights
- The article emphasizes the growing interest in the embodied AI industry, particularly for professionals with experience in end-to-end and large model training, as the industry seeks individuals with backgrounds in imitation learning and reinforcement learning [2][4].
- It highlights the relatively low entry requirements for new graduates, which focus on algorithm proficiency and problem-solving skills, particularly as practiced on platforms like LeetCode [3].
- The article discusses the high risks associated with entering the industry, suggesting that potential returns must be predictable to justify involvement [4].

Group 1: Community and Resources
- The "Autonomous Driving Heart Knowledge Planet" community has been established to provide a comprehensive platform for learning and sharing knowledge in the autonomous driving field, aiming to grow from 4,000 to nearly 10,000 members in two years [9].
- The community offers a variety of resources, including videos, articles, learning paths, and Q&A sessions, to assist both beginners and advanced learners in navigating the complexities of autonomous driving technology [10][12].
- Members have access to over 40 technical routes and can engage with industry experts for insights on trends, technology directions, and production challenges [14][26].

Group 2: Learning and Career Development
- The community provides structured learning paths for newcomers, covering essential topics such as end-to-end autonomous driving, multi-modal large models, and various algorithms [20][24].
- There are opportunities for job referrals within the community, connecting members with positions in leading autonomous driving companies [21][95].
- Regular discussions and Q&A sessions allow members to seek advice on career choices, research directions, and industry trends [98][103].

Group 3: Technical Focus Areas
- The community has compiled extensive resources on various technical aspects of autonomous driving, including perception, simulation, planning, and control [26][47].
- Specific areas of focus include 3D object detection, world models, and the integration of multi-sensor data, which are critical for advancing autonomous driving technologies [43][51].
- The community also addresses emerging topics such as diffusion models and their applications in autonomous driving, providing members with insights into cutting-edge research [58].
A New Reorganization at XPeng? Vice President and Automotive Internet Center Head Wei Bin Goes on Leave
自动驾驶之心· 2026-01-22 09:07
Core Viewpoint - The article discusses recent developments at XPeng Motors, focusing on Vice President Wei Bin's leave and the ongoing organizational changes within the company, and highlights the importance of advancements in autonomous driving and AI capabilities in the automotive industry [3][5].

Group 1: Organizational Changes
- Wei Bin, the Vice President and head of the Internet Center at XPeng, is currently on leave, which may be related to the intense R&D iterations within the company [5].
- XPeng has experienced rapid organizational changes over the past six months, including leadership shifts in the autonomous driving center and the Internet Center [5].
- The company is undergoing a critical phase of restructuring, which reflects its strategic focus on enhancing core technologies and improving organizational efficiency [7].

Group 2: Technological Advancements
- Since joining XPeng in 2021, Wei Bin has been responsible for the Internet Center, which includes smart cockpit development and AI capabilities [6].
- Under Wei Bin's leadership, XPeng has shifted its smart cockpit system from a function-oriented approach to a core architecture upgrade focused on computing power and operating systems [6].
- The Internet Center plays a crucial role in building foundational work for the long-term evolution of AI capabilities in vehicles, emphasizing the integration of smart driving, smart cockpit, and vehicle platforms [6][7].

Group 3: Future Plans
- XPeng plans to launch mass production of humanoid robots by 2026, indicating a broader vision for integrating robotics into its product offerings [4].
- The company aims to deepen the integration of smart driving and smart cockpit technologies with the overall vehicle platform, reinforcing its commitment to becoming an "AI car" leader [7].
A Summary of 2025 Interviews with Several Autonomous Driving Companies
自动驾驶之心· 2026-01-22 09:07
Core Algorithm
- The industry has shifted towards end-to-end solutions and away from modular approaches, at least in public discourse [1].
- World models are now widely adopted: some companies use them to generate training data, while others incorporate them into end-to-end models to enhance performance [1][8].
- Opinions diverge on whether language is necessary in the driving model, as in VLA (vision-language-action) approaches, with some companies arguing that language is not essential for driving tasks [1][11].

Simulation and Infrastructure
- Closed-loop systems have evolved from data-driven loops to simulation testing and training loops [2].
- 3DGS is highlighted as a crucial technology for building simulation environments, as emphasized by Tesla at CVPR 2025 [5].
- Infrastructure is critical, with companies like Xiaomi and Li Auto noting its benefits for development efficiency [3][14].

Organizational Capability
- Organizational ability is vital, as large autonomous driving teams face significant management challenges [4].
- Team culture and collaboration are emphasized as essential for overcoming complex technical and management issues [5].

Technical Choices Comparison
- A comparison of various companies' technical choices reveals differing approaches to core technologies and to the role of world models and simulation tools [9].
- Companies like Li Auto advocate a training loop that evolves from imitation to self-learning, while NVIDIA emphasizes interpretability and reasoning in AI [9].

Key Non-Core Factors
- R&D infrastructure and engineering efficiency are crucial for the success of autonomous driving technologies [14].
- Simulation and synthetic data are becoming essential for addressing corner cases that real-world data cannot cover [14].
- The scale of computing power and chip adaptation is critical, as autonomous driving is not just a software issue but also a hardware challenge [15].

User Experience and Safety
- User experience and safety are paramount, with companies like Xiaomi stressing the importance of balancing advanced technology with user concerns [17].
- The need for a dual-stack safety mechanism is highlighted, ensuring that even aggressive end-to-end models have a fallback to traditional rule-based systems for safety [19].
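The dual-stack idea can be pictured as a small arbiter that accepts the end-to-end plan only when it passes a rule-based check and otherwise hands control to the rule-based plan. The sketch below is a minimal illustration under stated assumptions, not any company's implementation: the `Plan` type, the `is_safe` check, and the 3.0 m/s² acceleration limit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Plan:
    source: str                     # "e2e" or "rule" (hypothetical labels)
    accelerations: Sequence[float]  # planned longitudinal acceleration per step, m/s^2


def is_safe(plan: Plan, max_abs_accel: float = 3.0) -> bool:
    """Hypothetical rule: reject plans that ever exceed an acceleration limit."""
    return all(abs(a) <= max_abs_accel for a in plan.accelerations)


def arbitrate(e2e_plan: Plan,
              rule_plan: Plan,
              check: Callable[[Plan], bool] = is_safe) -> Plan:
    """Use the end-to-end plan if it passes the check; otherwise fall back."""
    return e2e_plan if check(e2e_plan) else rule_plan


fallback = Plan("rule", [-2.0, -2.0])
print(arbitrate(Plan("e2e", [0.5, -1.0]), fallback).source)  # e2e
print(arbitrate(Plan("e2e", [0.5, -6.0]), fallback).source)  # rule
```

Keeping the check independent of the learned model is the point of the design: the fallback path does not inherit the end-to-end model's failure modes.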
More and More People Have Been Asking About World Model Positions Lately...
自动驾驶之心· 2026-01-22 00:51
Core Viewpoint - The article emphasizes the growing demand for positions in the field of autonomous driving, particularly in the areas of world models, end-to-end systems, and VLA, highlighting the importance of practical experience and advanced knowledge in these domains [2][4].

Course Overview
- The course on world models in autonomous driving is being launched in collaboration with industry experts, focusing on various algorithms and applications, including Tesla's world model and the Marble project by Fei-Fei Li's team [2][4].
- The course aims to provide a comprehensive understanding of world models, covering their development history, current applications, and different approaches such as pure simulation, simulation + planning, and generative sensor input [7].

Course Structure
- **Chapter 1: Introduction to World Models** This chapter reviews the relationship between world models and end-to-end autonomous driving, discussing the evolution and current applications of world models, as well as various streams within the field [7].
- **Chapter 2: Background Knowledge of World Models** This chapter covers foundational knowledge related to world models, including scene representation, Transformer technology, and BEV perception, which are crucial for understanding subsequent chapters [8][12].
- **Chapter 3: General World Model Exploration** This chapter focuses on popular models such as Marble and Genie 3 and on the latest discussions around VLA + world model algorithms, providing insights into their core technologies and design philosophies [9].
- **Chapter 4: Video Generation-Based World Models** This chapter delves into video generation algorithms, starting with notable works like GAIA-1 & GAIA-2 and extending to recent advancements, ensuring a balance between classic and cutting-edge research [10].
- **Chapter 5: OCC-Based World Models** This chapter concentrates on OCC generation methods, discussing three major papers and a practical project and highlighting their applicability in trajectory planning and end-to-end systems [11].
- **Chapter 6: World Model Job Specialization** This chapter shares practical insights from the instructor's experience, addressing industry applications, pain points, and interview preparation for related positions [12].

Learning Outcomes
- The course is designed to elevate participants to a level equivalent to one year of experience as a world model algorithm engineer, covering key technologies and enabling practical application in projects [15].
An Intelligent Driving Algorithm Engineer's Job-Change Retrospective: Anxiety and Choices...
自动驾驶之心· 2026-01-22 00:51
Core Insights - The article discusses the evolving landscape of the AI and autonomous driving industry, highlighting the increasing complexity and competition in algorithm engineering roles, particularly as single-module algorithm positions are expected to decline [4][5][6].

Group 1: Industry Trends
- The year 2025 is anticipated to be a pivotal year for the AI industry, marked by rapid advancements in large models and a shift in autonomous driving technology from end-to-end solutions to more complex architectures [4].
- There is a growing concern among industry professionals about being left behind, as many engineers are still working on outdated solutions while only a few have the opportunity to engage in cutting-edge projects [5].

Group 2: Job Market Insights
- For algorithm engineers, potential job directions include L2 assisted driving, with opportunities primarily at traditional and new automotive manufacturers as well as suppliers. However, newer roles in VLA (vision-language-action) models may offer more core positions [9].
- L4 commercialization is expected to gain momentum in 2026, with companies expanding teams for applications such as robotaxis and autonomous logistics. These teams are generally smaller and have clearer business models [11].
- The concept of embodied intelligence is gaining traction, with numerous startups emerging. Current recruitment focuses on VLA, reinforcement learning, and motion control, while traditional perception algorithms are less in demand [14].

Group 3: Personal Career Decisions
- The author ultimately chose to pursue opportunities in the L4 direction, which, while not the most popular, aligns better with personal judgment and market trends [15].