End-to-End
Musk Announces: The Countdown to the Steering-Wheel-Free Era Has Officially Begun
老徐抓AI趋势· 2025-11-06 01:12
Core Insights
- Tesla is approaching a milestone in autonomous driving: the Cybercab, a vehicle with no steering wheel or pedals, is set to begin production in Q2 of next year, which the article reads as a paradigm shift for the automotive industry [2][5][17]
- The move from a rule-based system to an end-to-end AI learning model is a fundamental change in Tesla's approach to autonomous driving, one the article credits with improving safety and efficiency [10][11][12]

Group 1: Autonomous Driving Technology
- Tesla's autonomous driving system relies on an end-to-end AI model that learns from real-world driving data, totaling 60 billion miles according to the article, which allows it to recognize and react to complex driving scenarios [10][11]
- The recent FSD V12 version removed 330,000 lines of code and moved fully to a neural-network-based system, which has shown improved performance and more human-like driving behavior [11][12]
- Tesla's AI model is designed to be interpretable, so that users can understand the reasoning behind its decisions, which supports safety and regulatory compliance [12]

Group 2: Market Implications
- Removing the steering wheel signals a major shift in the automotive ecosystem and may affect the used car market, since vehicles without full autonomous capability could lose resale value [17][19]
- The article projects 2026 as a pivotal year for Tesla, with the potential for a stock-price increase similar to the 2019-2020 surge, driven by advances in autonomous technology [19][31]
- Tesla's ambitions extend beyond cars: it aims to apply its AI technology to other kinds of mobile machines, redefining human-machine relationships and potentially transforming multiple industries [20][22]
Sparring on the Eve of Their IPOs: A War of Words Worth More Than 9 Billion Yuan
虎嗅APP· 2025-11-04 13:34
Core Viewpoint
- The article covers the clash between two autonomous driving companies, Pony.ai (Xiaoma Zhixing) and WeRide (Wenyuan Zhixing), as they prepare for their upcoming listings in Hong Kong. The conflict centers on data scale and technological pathways, both of which are critical to valuation in the autonomous driving industry [6][11][20].

Group 1: Competitive Dynamics
- Pony.ai and WeRide are engaged in a public dispute over operational data and technology claims, with Pony.ai accusing WeRide of having zero orders and operating in only a limited number of cities [6][9].
- WeRide's CFO, Li Xuan, rejected Pony.ai's claims, saying that Pony.ai's actions go beyond normal competitive behavior and contain misleading statements [6][11].
- Both companies are competing for market share and technological leadership in the autonomous driving sector, with the total mileage driven by their fleets treated as a key performance indicator [11][12].

Group 2: Technological Focus
- The debate highlights the importance of the "end-to-end" approach, which is seen as the next generation of autonomous driving solutions and which requires significant restructuring of technical teams [13].
- WeRide says it has reached mass production with its "end-to-end" solution in collaboration with Bosch and Chery, and disputes Pony.ai's claim to a similar capability [12][13].
- The ability to keep pace with cutting-edge technology directly affects each company's image as an innovator and its market valuation [13][20].

Group 3: Financial Performance and Market Position
- Pony.ai reported a net loss of 681 million yuan in the first half of 2025, a year-on-year increase of approximately 75.07%, while WeRide's net loss was 792 million yuan, a decrease of 10.32% [18].
- As of the latest reports, Pony.ai's market capitalization is approximately $7.08 billion and WeRide's is around $3.41 billion, even though WeRide has the higher gross margin [19].
- Pony.ai plans to raise about 6.71 billion HKD (approximately $864 million) through its Hong Kong listing, to be spent on scaling and on research and development [19][20].
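The financial figures quoted above can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the numbers as quoted in this summary; the helper name `prior_period_value` and all variable names are illustrative assumptions, and the derived prior-year losses are implied values, not figures reported by either company.

```python
def prior_period_value(current: float, yoy_change_pct: float) -> float:
    """Back out the prior-period figure from the current figure and its YoY change (%)."""
    return current / (1.0 + yoy_change_pct / 100.0)


# Net losses quoted in the summary (million yuan, first half of 2025).
xiaoma_loss_h1_2025 = 681.0    # Xiaoma Zhixing (Pony.ai), up about 75.07% year on year
wenyuan_loss_h1_2025 = 792.0   # Wenyuan Zhixing (WeRide), down 10.32% year on year

# Implied first-half 2024 losses (hypothetical derived values, not reported figures).
xiaoma_loss_h1_2024 = prior_period_value(xiaoma_loss_h1_2025, 75.07)
wenyuan_loss_h1_2024 = prior_period_value(wenyuan_loss_h1_2025, -10.32)

# Exchange rate implied by "6.71 billion HKD (approximately $864 million)".
implied_hkd_per_usd = 6710.0 / 864.0

# Ratio of the two quoted market capitalizations (billion USD).
market_cap_ratio = 7.08 / 3.41

print(f"Implied H1 2024 losses: {xiaoma_loss_h1_2024:.0f} vs {wenyuan_loss_h1_2024:.0f} million yuan")
print(f"Implied HKD per USD: {implied_hkd_per_usd:.2f}")
print(f"Market cap ratio: {market_cap_ratio:.2f}")
```

Under these quoted figures, the implied first-half 2024 losses come out at roughly 389 million and 883 million yuan, and the HKD conversion is consistent with a rate of about 7.77 HKD per USD.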
End-to-End and VLA: Directions Still Worth Researching
自动驾驶之心· 2025-11-03 00:04
Core Viewpoint
- The article traces the evolution of autonomous driving technology from rule-based systems, to end-to-end models represented by companies such as Li Auto and XPeng, and now to the world model phase represented by NIO, and stresses that deep learning has been present throughout these changes [1].

Group 1: Course Introduction
- The course covers the development from modular production algorithms to end-to-end systems and now to VLA, focusing on core algorithms such as BEV perception, vision-language models (VLM), diffusion models, reinforcement learning, and world models [5].
- Participants gain a comprehensive understanding of the end-to-end technology framework and its key technologies, enabling them to reproduce mainstream algorithm frameworks such as diffusion models and VLA [5].
- According to student feedback, completing the course brings participants to roughly the level of an end-to-end autonomous driving algorithm engineer with one year of experience, which helps with internships and job recruitment [5].

Group 2: Instructor Profile
- The main instructor, Jason, holds an undergraduate degree from a C9 university and a PhD from a QS top 50 university, and has published multiple papers in CCF-A and CCF-B venues [6].
- He is currently an algorithm expert at a leading domestic manufacturer, working on the research and production of cutting-edge algorithms, with extensive experience in developing and delivering autonomous driving perception and end-to-end algorithms [6].

Group 3: Research Guidance
- The program aims to build practical skills and knowledge of cutting-edge topics, with a focus on helping students publish high-level papers to improve their academic prospects [8].
- The community includes over 300 instructors specializing in autonomous driving and embodied intelligence, and reports a manuscript acceptance rate of 96% over the past three years [8].

Group 4: Research Process
- The guidance process includes selecting research topics based on student interests, explaining key concepts, and providing essential foundational knowledge and recommended learning materials [11].
- Students learn how to read literature critically, conduct research, and write the various sections of a paper, including methods and experimental results, with continuous feedback and support throughout the process [11].
Calling All Hands! Seeking Autonomous Driving Enthusiasts Everywhere (Product / 4D Annotation / World Models, etc.)
自动驾驶之心· 2025-10-25 16:03
Core Viewpoint
- The article emphasizes the need for collaboration in the autonomous driving industry and invites professionals to take part in training, course development, and research support to drive industry progress [2].

Group 1: Collaboration and Opportunities
- The company is seeking partnerships with professionals in the autonomous driving field to strengthen its training and job guidance services [2].
- Collaborators are offered high compensation and access to extensive industry resources [3].
- The main focus areas for collaboration are autonomous driving product management, 4D annotation and the data loop, world models, VLA, large models for autonomous driving, reinforcement learning, and end-to-end systems [4].

Group 2: Training and Development
- The positions mainly involve business-facing training for enterprises, universities, and research institutions, and consumer-facing training for students and job seekers [5].
- Interested individuals are encouraged to reach out via WeChat for further consultation [6].
VLA / World Model / WA / End-to-End: A Divergence in Messaging, Not in Technical Route
理想TOP2· 2025-10-25 05:21
Core Viewpoints
- Many people are unaware that there is no universally accepted definition of VLA, world model, or end-to-end [1]
- Leading autonomous driving companies have far more in common in how they explore autonomous driving than the differences portrayed online suggest; at its core the divergence is one of messaging, not of technical route [1][2]
- Language plays a significant role in autonomous driving, particularly in long-chain reasoning, user interaction and value alignment, and understanding the world [1]
- Those who believe that predicting the next token is more than just a probability distribution are more likely to accept that language can be a route to understanding the world [1]

Group 1: VLA/World Model/End-to-End
- VLA, world model, and end-to-end approaches all need the ability to generate realistic-looking road video data, all take visual information as input, and all ultimately control vehicle actions [2]
- They differ in whether language is involved, how deeply it participates, and what architectural form it takes; future language-related tokens might be an LLM's text tokens or photon tokens [2]
- The narrative that VLA and world models are different technical routes is misleading, since both need to generate a world model and understand the physical world [4]

Group 2: End-to-End Definitions
- The definition of end-to-end is often debated; some hold that it requires a core framework in which input and output are clearly defined [5]
- Tesla's approach takes visual input and outputs a trajectory, not direct control signals, which raises the question of what its end-to-end definition actually covers [5][6]
- Outputting a precise trajectory is preferred over outputting direct control signals, and is presented as the more effective design [6]

Group 3: Tesla's Approach and Future Directions
- Given Tesla's history and style, its own definition of end-to-end should not be treated as the single universally accepted one [7]
- Long-term predictions indicate that AI model inputs and outputs may predominantly involve photons, which could significantly reduce computational load [10]
- The ideal VLA model is defined as one with visual or multimodal input, participation of language, and output that ultimately directs actions in a broad sense [11]

Group 4: Understanding Language and AI Potential
- Views on LLMs differ fundamentally, particularly over what it means to predict the next token [12]
- Those who see next-token prediction as more than mere statistics are more inclined to recognize the potential of LLMs and AI [12][19]
- Predicting the next token well implies some understanding of the underlying reality that generated the token, which makes the question deeper than it first appears [18]
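The Group 2 distinction between outputting a trajectory and outputting direct control signals can be made concrete with a small sketch. Here a planner (learned or otherwise) is assumed to emit waypoints in the vehicle frame, and a separate classical tracker, pure pursuit, turns them into a steering angle. This is an illustration of the interface only, not Tesla's or any vendor's implementation; the function name, wheelbase, and lookahead distance are all assumed values.

```python
import math


def pure_pursuit_steering(waypoints, wheelbase_m=2.9, lookahead_m=6.0):
    """Convert a planned trajectory into a steering angle in radians.

    waypoints: list of (x, y) points in the vehicle frame, x forward and
    y to the left, ordered from near to far. Positive result steers left.
    wheelbase_m and lookahead_m are illustrative assumptions.
    """
    # Pick the first waypoint at least lookahead_m away; fall back to the last one.
    target = waypoints[-1]
    for x, y in waypoints:
        if math.hypot(x, y) >= lookahead_m:
            target = (x, y)
            break

    tx, ty = target
    dist = math.hypot(tx, ty)
    if dist == 0.0:
        return 0.0

    # Pure pursuit: curvature of the arc through the origin and the target,
    # then the bicycle-model steering angle that produces that curvature.
    curvature = 2.0 * ty / (dist * dist)
    return math.atan(wheelbase_m * curvature)


# Hypothetical planner outputs: a straight path and a gentle left curve.
straight_path = [(2.0, 0.0), (4.0, 0.0), (6.0, 0.0), (8.0, 0.0)]
left_curve = [(2.0, 0.1), (4.0, 0.4), (6.0, 0.9), (8.0, 1.6)]
right_curve = [(x, -y) for x, y in left_curve]

steer_straight = pure_pursuit_steering(straight_path)
steer_left = pure_pursuit_steering(left_curve)
steer_right = pure_pursuit_steering(right_curve)
```

One reading of the preference described above is that a trajectory is an intermediate output that can be inspected, smoothed, or checked against constraints before a controller acts on it, whereas a raw steering or throttle command leaves no such checkpoint.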
自动驾驶之心 Is Recruiting Partners!
自动驾驶之心· 2025-10-24 16:03
Group 1
- The article announces the recruitment of 10 outstanding partners for the autonomous driving sector, focusing on course development, paper guidance, and hardware research [2]
- The main areas of expertise sought include large models, multimodal models, diffusion models, end-to-end systems, embodied interaction, joint prediction, SLAM, 3D object detection, world models, closed-loop simulation, and model deployment and quantization [3]
- Candidates from QS200 universities with a master's degree or higher are preferred, especially those with significant contributions to top conferences [4]

Group 2
- The compensation package includes shared resources for job seeking, doctoral studies, and overseas study recommendations, along with substantial cash incentives and opportunities to collaborate on entrepreneurial projects [5]
- Interested parties are encouraged to add the contact on WeChat for consultation, specifying "organization/company + autonomous driving cooperation inquiry" [6]
A User's Guide to Autonomous Driving Jargon: EV Startups Build Cars and Coin Words
36Kr · 2025-10-20 08:33
Core Insights
- The autonomous driving industry is in a battle for narrative control over next-generation technologies: companies such as Li Auto and XPeng are betting on VLA (Vision-Language-Action) as the future architecture, while Huawei criticizes it as a shortcut and promotes its own WA (World Action) architecture [1][2][3]
- The rapid emergence of jargon reflects a struggle over technological branding, as hardware becomes increasingly homogeneous and intelligent driving capability becomes the key differentiator [1][2][3]

Group 1: Evolution of Terminology
- Before 2022, the technical evolution of autonomous driving was defined mainly by Tesla and Waymo, and terms were objective descriptions of specific functions [3]
- Tesla's AI Day events in 2021 and 2022 strongly influenced the industry by introducing the BEV+Transformer architecture, which improved perception by integrating multiple camera inputs into a unified 3D view [3][4]
- The transition to an "end-to-end" paradigm began in 2022 and broke down the barrier between perception and planning, with Tesla's FSD Beta V12 showcasing a large neural network that handles both together [5][6]

Group 2: Technological Developments
- Chinese automakers quickly adopted Tesla's advances, with companies such as XPeng and NIO putting their own versions of the BEV+Transformer architecture into mass production [4][6]
- The industry is moving toward a more integrated approach: XPeng and Huawei have adopted multi-stage end-to-end systems, while NIO is restructuring to focus on end-to-end development [7][8]
- The arrival of VLA and world models in autonomous driving reflects a shift toward more sophisticated AI models that can understand and respond to complex driving scenarios [9][10][13]

Group 3: Competitive Landscape
- Competition in computing power is intensifying, with XPeng and Li Auto investing heavily in both in-vehicle and cloud computing and aiming to develop models with larger parameter counts for their systems [11][12][36]
- The VLA model, originally developed for robotics, is being adapted for autonomous driving, with companies such as DeepRoute.ai (Yuanrong Qixing) among the first to apply it [10][31]
- NIO and Huawei are taking a more aggressive approach by deploying world models directly in vehicles for real-time control, although the technology is still at an experimental stage [14][15]

Group 4: Future Directions
- The evolution of autonomous driving terminology reflects a broader exploration of technology, with each new term marking a step in the industry's progress [16]
- Ultimate success in the sector may depend on turning technological promises into tangible user experiences, not merely on introducing new concepts [16]
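The idea behind a bird's-eye-view (BEV) representation, in which observations from several cameras are placed into one shared top-down frame, can be sketched with plain geometry. The sketch below is not the learned BEV+Transformer architecture described above: it only shows the coordinate-frame idea, using ground-plane points that are assumed to be already detected by each camera. The function names, camera poses, and grid size are all hypothetical.

```python
import math


def camera_to_vehicle(point_cam, cam_x, cam_y, cam_yaw_rad):
    """Rotate and translate a ground-plane point from a camera frame into the vehicle frame."""
    px, py = point_cam
    c, s = math.cos(cam_yaw_rad), math.sin(cam_yaw_rad)
    return (cam_x + c * px - s * py, cam_y + s * px + c * py)


def rasterize_bev(points_vehicle, extent_m=20.0, cell_m=1.0):
    """Mark occupied cells in a square top-down grid centred on the vehicle."""
    size = int(2 * extent_m / cell_m)
    grid = [[0] * size for _ in range(size)]
    for x, y in points_vehicle:
        # Ignore points that fall outside the grid.
        if -extent_m <= x < extent_m and -extent_m <= y < extent_m:
            row = int((x + extent_m) / cell_m)
            col = int((y + extent_m) / cell_m)
            grid[row][col] = 1
    return grid


# Hypothetical rig: a front camera 2 m ahead of the vehicle origin facing forward,
# and a rear camera 1 m behind the origin facing backward.
front_obstacle = camera_to_vehicle((5.0, 0.0), cam_x=2.0, cam_y=0.0, cam_yaw_rad=0.0)
rear_obstacle = camera_to_vehicle((5.0, 0.0), cam_x=-1.0, cam_y=0.0, cam_yaw_rad=math.pi)

bev_grid = rasterize_bev([front_obstacle, rear_obstacle])
occupied_cells = sum(sum(row) for row in bev_grid)
```

In a learned BEV model the lifting from image pixels to the top-down grid is done by the network itself, typically with attention over image features, instead of by fixed transforms on pre-detected points as in this sketch.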
Why Do EV Startups Sell Cars in a Mouthful of Jargon?
Huxiu · 2025-10-20 07:22
Core Insights
- The autonomous driving industry is in a battle for narrative control over next-generation technologies: companies such as Li Auto and XPeng are betting on VLA (Vision-Language-Action) as the future architecture, while Huawei promotes its self-developed WA (World Action) architecture [1][2][20]
- The rapid emergence of jargon reflects a struggle over technological branding and user perception, as hardware and configurations become increasingly homogeneous [1][2][27]

Group 1: Evolution of Technology
- Before 2022, the evolution of autonomous driving technology was defined mainly by Tesla and Waymo, with terminology limited to objective descriptions of specific functions [3]
- Tesla's introduction of the BEV+Transformer architecture in 2021 marked a significant shift from rule-based systems to AI-driven approaches and improved perception capabilities [4][5][6]
- The transition to an end-to-end paradigm was catalyzed by Tesla's AI Day in 2022, which integrated perception and planning into a single neural network and significantly improved obstacle recognition [9][10]

Group 2: Adoption of New Models
- Chinese automakers quickly adopted Tesla's technology, with companies such as XPeng and NIO putting their own versions of the BEV+Transformer model into mass production [8][10]
- The industry is moving toward end-to-end systems; XPeng and Huawei initially adopted a multi-stage approach for safety reasons before moving to fully integrated models [10][12]
- Bringing VLA and world models into autonomous driving systems is a new frontier, with companies such as DeepRoute.ai (Yuanrong Qixing) and NIO among the first to apply these concepts [17][20]

Group 3: Competitive Landscape
- Companies compete not only on technology but also on computing power: XPeng and Li Auto have invested heavily in cloud computing and cite figures of 10 EFLOPS and over 13 EFLOPS respectively [18][19][55]
- The race for computing resources covers both in-vehicle and cloud platforms, with Tesla's Dojo and other companies' systems ramping up AI training capacity [18][57]
- The rapid evolution of VLA and world models is part of a broader trend in which companies use advanced AI techniques to improve their autonomous driving systems [20][46]

Group 4: Future Directions
- The world model concept, initially used for simulation, is now being applied to real-time vehicle control by companies such as NIO and Huawei, with the aim of more predictive and human-like driving [20][24][25]
- Terms such as VLA and world model reflect the industry's shift toward integrating language understanding and real-time decision-making into autonomous driving systems [46][59]
- Ultimate success in this competition may depend on a company's ability to turn technological promises into tangible user experiences, not merely marketing jargon [30][29]
How Are Industry and Academia Approaching End-to-End and VLA?
自动驾驶之心· 2025-10-17 00:03
Core Insights
- The article discusses the evolution of end-to-end algorithms in autonomous driving, from modular production algorithms to end-to-end models and now to Vision-Language-Action (VLA) models [1][3]
- It emphasizes the broad technology stack behind end-to-end algorithms, including BEV perception, vision-language models (VLM), diffusion models, reinforcement learning, and world models [3]

Summary by Sections

End-to-End Algorithms
- End-to-end algorithms fall into two main paradigms, single-stage and two-stage, with UniAD as a representative of the single-stage approach [1]
- The single-stage paradigm branches into several subfields, particularly those based on VLA, which have seen a surge in publications and industrial applications in recent years [1]

Courses Offered
- The article promotes two courses, "End-to-End and VLA Autonomous Driving Small Class" and "Practical Course on Autonomous Driving VLA and Large Models", aimed at helping people enter the field quickly and efficiently [3]
- The "Practical Course" focuses on VLA, covering topics from VLM as an autonomous driving interpreter to modular and integrated VLA, along with detailed theoretical foundations [3][12]

Instructor Team
- The instructor team includes experts from both academia and industry, with backgrounds in multimodal perception, autonomous driving VLA, and large model frameworks [8][11][14]
- Notable instructors have published numerous papers at top-tier conferences and have extensive research and practical experience in autonomous driving and large models [8][11][14]

Target Audience
- The courses are designed for people who have a foundational understanding of autonomous driving, are familiar with its basic modules, and know transformer models, reinforcement learning, and BEV perception [15][17]
Opening Several Autonomous Driving Technical Discussion Groups (World Models / End-to-End / VLA)
自动驾驶之心· 2025-10-13 23:33
Group 1
- The article announces a technical exchange group focused on autonomous driving technology, covering areas such as world models, end-to-end systems, and VLA [1]
- Interested individuals are invited to join the discussion by adding a designated assistant on WeChat, following the specific instructions for group entry [1]