End-to-End
Ren Shaoqing's Non-Consensus on Intelligent Driving: World Models, Long-Horizon Agents, and "Obsessive" Engineering
自动驾驶之心· 2025-10-11 16:03
Core Viewpoint - The article discusses NIO's approach to autonomous driving, emphasizing world models and reinforcement learning as the route to more capable self-driving AI [5][11][13].

Group 1: Company Background and Leadership
- NIO's intelligent driving work is led by Ren Shaoqing, a young technical leader with a strong background in AI and deep learning who co-founded the autonomous driving company Momenta before joining NIO [6][8].
- Ren Shaoqing took on the challenge of developing NIO's second-generation platform from scratch, focusing on building a robust data system to support autonomous driving capabilities [6][8].

Group 2: Technological Innovations
- NIO's approach combines high computing power, multiple sensors, and a new architecture based on world models and reinforcement learning, which is considered a harder but potentially more effective path [8][9].
- The world model aims to establish a high-bandwidth cognitive system that can understand and predict physical interactions in the real world, addressing the limitations of language models [20][25].

Group 3: Reinforcement Learning and Data Systems
- The company emphasizes reinforcement learning as the way to develop long-term planning capabilities for autonomous driving, moving beyond traditional imitation learning [7][60].
- NIO has developed a three-tier data system to improve data quality and training efficiency, which is crucial for building effective autonomous driving models [74][76].

Group 4: Market Position and Future Outlook
- NIO aims to lead the industry by integrating world models into its autonomous driving technology, positioning itself ahead of competitors who primarily rely on language models [66][67].
- The company is working toward open-set interaction, which lets users communicate with the vehicle in a more natural and flexible manner [36][39].
Ren Shaoqing's Non-Consensus on Intelligent Driving: World Models, Long-Horizon Agents, and "Obsessive" Engineering
晚点Auto· 2025-10-09 12:17
Core Viewpoint - The article discusses NIO's approach to autonomous driving, emphasizing world models and reinforcement learning as key components on the path toward artificial general intelligence (AGI) in automotive technology [4][9][26].

Group 1: NIO's Approach to Autonomous Driving
- NIO is positioning itself as an AI company, developing autonomous driving technology through a combination of high computing power, multiple sensors, and a new architecture based on world models and reinforcement learning [5][8][34].
- The company has established a three-layer data system to support its autonomous driving capabilities, which is considered one of the most advanced in the industry [36][54].
- NIO's strategy involves a shift from traditional end-to-end models to a more complex world model that integrates spatial and temporal understanding, aiming to improve the vehicle's ability to navigate real-world scenarios [10][13][26].

Group 2: Reinforcement Learning and World Models
- Reinforcement learning is viewed as essential for developing long-term decision-making capabilities in autonomous systems, moving beyond short-term imitation learning [7][29][33].
- The world model is defined as a high-bandwidth cognitive system that allows AI to understand and predict physical interactions in the environment, which is crucial for effective autonomous driving [10][16][26].
- NIO believes that integrating language models with world models will lead to a more comprehensive understanding of both concepts and physical realities, ultimately contributing to the development of AGI [13][28][33].

Group 3: Data Utilization and Training
- NIO trains its models on a combination of real-world driving data and simulated environments, including gaming data, to cover a wide range of driving scenarios [27][30].
- The company emphasizes large-scale, diverse datasets for training rather than relying solely on expert data, which may lack the complexity of real-world situations [28][30].
- NIO's approach to data collection and training is designed to improve the vehicle's performance in edge cases and overall safety [41][44].

Group 4: Future Developments and Industry Position
- NIO plans to introduce an open-set interaction system that allows more natural communication between users and the vehicle, moving beyond limited command sets [18][20].
- The company is committed to continued innovation and exploration in autonomous driving, even if that means facing initial skepticism from the industry [8][25][39].
- NIO's advances in autonomous driving technology are expected to position it ahead of competitors, particularly with the upcoming release of its open-set interaction capabilities [22][47].
Ren Shaoqing's Non-Consensus on Intelligent Driving: World Models, Long-Horizon Agents, and "Obsessive" Engineering
晚点LatePost· 2025-10-09 10:14
Core Viewpoint - The article emphasizes the challenging yet necessary path that NIO is taking in intelligent driving, focusing on world models and reinforcement learning to achieve advanced autonomous driving capabilities [2][4][6].

Group 1: Company Background and Leadership
- Ren Shaoqing, a prominent figure at NIO, has a strong academic background and has made significant contributions to deep learning, including co-authoring Faster R-CNN and ResNet [3][4].
- He co-founded the autonomous driving company Momenta before joining NIO, where he took on the challenge of building the second-generation platform from scratch [4][6].

Group 2: Technological Approach
- NIO's approach to intelligent driving combines high computing power, multiple sensors, and a new architecture based on world models and reinforcement learning [5][6].
- The company aims to move beyond traditional end-to-end models, which are limited in their ability to handle long-term decision-making, by focusing on world models that integrate spatial and temporal understanding [8][11].

Group 3: World Model Concept
- The world model is defined as a system that builds high-bandwidth cognitive capabilities from video and images, addressing the limitations of language models in understanding complex real-world scenarios [11][14].
- According to the article, NIO is the first company in China to propose the concept of world models, which includes understanding physical laws and predicting movement in three-dimensional space over time [12][24].

Group 4: Reinforcement Learning Importance
- The article argues that the intelligent driving industry has yet to fully recognize the significance of reinforcement learning, which is crucial for developing long-term planning capabilities in autonomous systems [5][24].
- NIO holds that traditional imitation learning is insufficient for complex driving scenarios that require extended memory and decision-making [30][31].

Group 5: Data Systems and Training
- NIO has developed a three-tier data system to ensure the quality and relevance of training data, emphasizing real-world data over expert data for training models [34][36].
- The company combines game data with real-world driving data to improve the model's understanding of temporal dynamics and decision-making [25][26].

Group 6: Future Directions and Innovations
- NIO plans to implement open-set instruction interaction, allowing users to communicate with the vehicle in a more natural and flexible manner, moving beyond limited command sets [16][18].
- The company is focused on continuous improvement, with plans to release new versions of its systems that improve user interaction and safety features [19][20].
How Are Academia and Industry Researching End-to-End and VLA? Master End-to-End Autonomous Driving in Three Months!
自动驾驶之心· 2025-10-09 04:00
Core Viewpoint - The article discusses the evolution and current state of end-to-end algorithms in autonomous driving, highlighting the emergence of several subfields, particularly those based on Vision-Language-Action (VLA) models, and the growing interest in these technologies in both academia and industry [1][3].

Summary by Sections

End-to-End Algorithms - End-to-end algorithms are central to current mass-production autonomous driving and involve a rich technology stack. There are two main paradigms: single-stage and two-stage. The single-stage approach, exemplified by UniAD, models vehicle trajectories directly from sensor inputs, while the two-stage approach outputs trajectories based on perception results [1].

VLA and Related Technologies - Development has progressed from modular production algorithms to end-to-end systems and now to VLA. Key technologies include BEV perception, Vision-Language Models (VLM), diffusion models, reinforcement learning, and world models. The article emphasizes that understanding these technologies is necessary to follow the cutting-edge directions in both academia and industry [3].

Courses Offered - The article promotes two courses that aim to help people learn end-to-end and VLA autonomous driving quickly and efficiently. The courses are designed for those new to large models and VLA, covering foundational theory and practical applications [3][10].

Course Content - The "VLA and Large Model Practical Course" focuses on VLA, starting from VLM as an interpreter for autonomous driving, and covers modular and integrated VLA as well as mainstream reasoning-enhanced VLA. It includes detailed theoretical foundations and practical assignments for building VLA models and datasets from scratch [3][10].

Instructor Team - The courses are led by experienced instructors from both academia and industry, with backgrounds in multi-modal perception, autonomous driving VLA, and large model frameworks. They have published numerous papers at top conferences and have substantial practical experience in the field [7][9][10].

Target Audience - The courses are aimed at people with a foundational understanding of autonomous driving who are familiar with its basic modules and have some knowledge of transformer models, reinforcement learning, and BEV perception. A background in probability theory, linear algebra, and programming in Python and PyTorch is also recommended [13].
自动驾驶之心 Is Recruiting Partners! 4D Annotation, World Models, Model Deployment, and More
自动驾驶之心· 2025-10-04 04:04
Group 1
- The article announces the recruitment of 10 outstanding partners in the autonomous driving sector, focusing on course development, paper guidance, and hardware research [2].
- The main areas of expertise sought include large models, multimodal models, diffusion models, end-to-end systems, embodied interaction, joint prediction, SLAM, 3D object detection, world models, closed-loop simulation, and model deployment and quantization [3].
- Candidates from universities ranked within the QS200 and holding a master's degree or higher are preferred, with priority given to those with significant conference contributions [4].

Group 2
- The compensation package includes resource sharing for job seeking, doctoral studies, and overseas study recommendations, along with substantial cash incentives and opportunities for entrepreneurial project collaboration [5].
- Interested parties are encouraged to get in touch via WeChat, specifying "organization/company + autonomous driving cooperation inquiry" [6].
Betting on "End-to-End": As AI Heads into the Physical World, Alibaba Cloud Accelerates the "Closed Loop"
第一财经· 2025-09-27 12:39
Core Viewpoint - The rise of AI is leading to a new era characterized by embodied intelligence and intelligent assisted driving, marking the start of competition in the Agentic AI era [1].

Group 1: End-to-End Transformation
- The transition from modular to end-to-end architectures in intelligent assisted driving is a significant shift, allowing rapid iteration and adaptation to complex scenarios [3][4].
- End-to-end architectures require exponential growth in data volume and computing power; mainstream intelligent driving companies currently need 10 PB to 30 PB of data to train a single model [4].
- The embodied intelligence sector faces even greater complexity, since machines must understand physical laws and execute intricate actions, which creates distinct challenges in data handling and computing needs [4].

Group 2: Cloud Infrastructure and AI Integration
- Companies are increasingly seeking a unified cloud AI infrastructure that integrates computing power, big data, and AI platforms to support embodied intelligence [5].
- Alibaba Cloud has upgraded its intelligent assisted driving solutions, achieving significant efficiency improvements such as a 2-3x increase in task management and scheduling capabilities [7].
- The integration of NVIDIA's tools into Alibaba Cloud's AI platform aims to support the development of embodied intelligence applications, covering both data processing and model training [9].

Group 3: High Demand and Future Outlook
- The demand for high availability and extreme communication performance in embodied intelligence is pushing cloud providers to innovate beyond traditional models, leading to a new "network-storage-computation integration" requirement [10].
- Alibaba Cloud is positioning itself as a leader in AI infrastructure, with ambitions to build a super AI cloud that can support the future needs of various industries [11][12].
- The future may see only a limited number of super cloud computing platforms, with Alibaba Cloud aiming to be a key player through substantial investments in AI infrastructure [11].
Imitation-Learning-Based End-to-End Driving Cannot Exceed Human Performance by Design
自动驾驶之心· 2025-09-24 06:35
Core Viewpoint - The article discusses the evolution of end-to-end (E2E) autonomous driving technology, emphasizing the transition from rule-based to data-driven approaches, and highlights the limitations of current models in handling complex scenarios. It introduces Vision-Language Models (VLM) and Vision-Language-Action (VLA) models as potential ways to extend the capabilities of autonomous driving systems [2][3].

Summary by Sections

Introduction to VLA - VLA represents a shift from merely imitating human behavior to understanding and interacting with the physical world, addressing the limitations of traditional E2E models in complex driving scenarios [2].

Challenges in Autonomous Driving - The VLA technology stack is still evolving, and the large number of emerging algorithms indicates that the field has not yet converged [3].

Course Overview - A course titled "Autonomous Driving VLA and Large Model Practical Course" is being prepared to cover VLA's origins, algorithms, and practical applications [5].

Learning Objectives - The course aims to provide a comprehensive understanding of VLA, covering topics such as dataset creation, model training, and performance improvement [5][17].

Course Structure - The course is organized into chapters, each focusing on a different aspect of VLA: algorithm introduction, foundational knowledge, VLM as an interpreter, modular and integrated VLA, reasoning enhancement, and practical assignments [20][26][31][34][36].

Instructor Background - The instructors have extensive experience in multimodal perception, autonomous driving, and large model frameworks [38].

Expected Outcomes - Participants are expected to gain a thorough understanding of current advances in VLA, master the core algorithms, and be able to apply that knowledge in practice [39][40].

Course Schedule - The course is set to begin on October 20, with a structured timeline for each chapter's release [43].
What Stage Has Autonomous Driving VLA Reached? Is It Still a Good Time to Do Research?
自动驾驶之心· 2025-09-22 08:04
Core Insights
- The article discusses the transition in intelligent driving technology from rule-driven to data-driven approaches, highlighting the emergence of VLA (Vision-Language-Action) as a more straightforward and effective method than traditional end-to-end systems [1][2].
- It emphasizes the challenges in the current VLA technology stack, including complex and fragmented knowledge that makes the field hard for newcomers to enter [2][3].
- A new practical course on VLA has been developed to address these challenges, providing a structured learning path for students interested in advanced autonomous driving topics [3][4][5].

Summary by Sections

Introduction to VLA - The article introduces VLA as a significant advance in autonomous driving, offering a cleaner approach than traditional end-to-end systems while also handling corner cases more effectively [1].

Challenges in Learning VLA - The article outlines the difficulties learners face in navigating the complex and fragmented VLA knowledge landscape, which includes a large number of algorithms and a lack of high-quality documentation [2].

Course Development - A new course titled "Autonomous Driving VLA Practical Course" has been created to give a comprehensive overview of the VLA technology stack and make the field easier for students to enter [3][4].

Course Features
- The course is designed to address key pain points, offering a quick entry into the subject through accessible language and examples [3].
- It aims to build a framework for understanding VLA research and to strengthen research skills by teaching students how to categorize papers and extract their innovations [4].
- It includes practical components so that theoretical knowledge is applied in real-world scenarios [5].

Course Outline
- The course covers the origins of VLA, foundational algorithms, and the differences between modular and integrated VLA systems [6][15][19][20].
- It also includes practical coding exercises and projects to reinforce the concepts [22][24][26].

Instructor Background - The course is led by experienced instructors with strong backgrounds in multi-modal perception, autonomous driving, and large model frameworks [27].

Learning Outcomes - Upon completion, students are expected to have a thorough understanding of current advances in VLA and its core algorithms, and to be able to apply that knowledge in practice [28][29].
Proposed Cash Dividend of 1.03 Billion Yuan! WuXi AppTec Implements Its First Interim Dividend
Xin Lang Cai Jing· 2025-09-22 03:07
Core Viewpoint - WuXi AppTec (603259.SH/2359.HK) announced its first interim dividend plan, distributing a total cash dividend of 1.03 billion yuan, reflecting strong financial performance in the first half of the year [1].

Financial Performance
- For the first half of the year, WuXi AppTec reported revenue of 20.799 billion yuan, a year-on-year increase of 20.6% [1].
- Net profit attributable to shareholders reached 8.287 billion yuan, up 95.5% year-on-year [1].
- In Q2, the company reported revenue of 11.145 billion yuan, the first time it has surpassed 10 billion yuan in a single quarter [1].

Dividend Distribution
- Total cash dividends distributed to investors this year, including annual, special, and interim dividends, amounted to 4.88 billion yuan [1].
- Total cash dividends and share buybacks reached 6.88 billion yuan, accounting for over 70% of the company's net profit for 2024 [1].

Order Backlog and Revenue Sources
- As of June 2025, the company had an order backlog of 56.69 billion yuan, a year-on-year increase of 37.2% [2].
- Revenue from U.S. clients was 14.03 billion yuan, up 38.4% year-on-year, while revenue from European clients was 2.33 billion yuan, up 9.2% [2].

Business Model and Growth Drivers
- The growth is attributed to the company's focus on an "integrated, end-to-end" CRDMO business model, which improves operational efficiency and expands capabilities [4].
- The sale of partial equity in the joint venture WuXi XDC Cayman Inc. is expected to yield investment income of approximately 3.21 billion yuan [4].

Future Projections
- The company expects revenue growth for its ongoing business to return to double digits, with the guided growth rate raised from 10%-15% to 13%-17% [4].
- Overall revenue projections for the year have been revised from 41.5-43 billion yuan to 42.5-43.5 billion yuan [4].

Accounts Receivable Trends - Accounts receivable increased from 3.665 billion yuan in 2020 to 7.918 billion yuan in Q1 2025, with accounts receivable as a proportion of revenue rising from 15.18% in 2022 to 19.59% in 2023 [5].
Opening Several Autonomous Driving Technical Discussion Groups (World Models / End-to-End / VLA)
自动驾驶之心· 2025-09-20 16:03
Group 1
- The article announces the launch of technical exchange groups focused on autonomous driving technologies [1].
- The groups aim to facilitate discussion of topics such as world models, end-to-end systems, and VLA [1].
- The launch coincides with the back-to-school season and the autumn recruitment period [1].