Spatiotemporal Cognition
NIO's Ren Shaoqing: World models solve spatiotemporal cognition, which VLA cannot do.
自动驾驶之心 · 2025-10-09 23:32
Core Viewpoint
- The article discusses the importance of world models in intelligent driving, arguing that truly understanding the environment requires a high-bandwidth cognitive system that goes beyond language models [2][3][5].

Summary by Sections

World Model vs. Language Model
- The world model addresses spatiotemporal cognition, while the language model addresses conceptual cognition. Language is a low-bandwidth, sparse representation, which makes language models ill-suited to modeling the four-dimensional space-time of the real world [2][3].
- The world model aims to build its capabilities directly at the video level rather than converting information into language first [3][5].

VLA and WA
- VLA (Vision-Language-Action) is essentially an extension of language models: it adds new modalities but remains rooted in language. The world model, by contrast, is not language with extra modalities attached but a comprehensive cognitive system in its own right [3][5].
- The ultimate goal of autonomous driving is open-set interaction, in which users can express commands freely instead of being limited to a fixed set of instructions [3][4].

Importance of Language
- Language remains crucial for three main reasons:
  1. Incorporation of physical laws such as gravity and inertia into the model [6].
  2. Ability to understand and predict object movements in three-dimensional space over time [6].
  3. The vast amount of internet data absorbed by language models aids in training autonomous driving systems [7].

Industry Trends
- The autonomous driving industry is experiencing intense competition, and many professionals are considering moving to other fields. The ongoing debate between VLA and WA reflects a larger industry transformation [9].
- The article suggests that those who remain in the industry will need to be versatile, with broad technical backgrounds, because the market is expected to change significantly [9].

Community and Learning Resources
- A community platform has been established for learning and sharing knowledge about autonomous driving, including video tutorials, technical discussions, and job opportunities [11][12][24].
- The community aims to bring together people from a range of academic and industrial backgrounds to foster collaboration and knowledge sharing [25].
Ren Shaoqing's Non-Consensus Views on Intelligent Driving: World Models, Long-Horizon Agents, and "Obsessive" Engineering
晚点LatePost · 2025-10-09 10:14
Core Viewpoint
- The article emphasizes the challenging yet necessary path that NIO is taking in intelligent driving, focusing on world models and reinforcement learning to achieve advanced autonomous driving capabilities [2][4][6].

Group 1: Company Background and Leadership
- Ren Shaoqing, a prominent figure at NIO, has a strong academic background and has made significant contributions to deep learning, including co-authoring Faster R-CNN and ResNet [3][4].
- He co-founded the autonomous driving company Momenta before joining NIO, where he took on the challenge of building the second-generation platform from scratch [4][6].

Group 2: Technological Approach
- NIO's approach to intelligent driving combines high computing power, multiple sensors, and a new architecture based on world models and reinforcement learning [5][6].
- The company aims to move beyond traditional end-to-end models, which are limited in their ability to handle long-term decision-making, by focusing on world models that integrate spatial and temporal understanding [8][11].

Group 3: World Model Concept
- The world model is defined as a system that builds high-bandwidth cognitive capabilities on video and images, addressing the limitations of language models in understanding complex real-world scenarios [11][14].
- The article describes NIO as the first company in China to propose the world model concept, which includes understanding physical laws and predicting movements in three-dimensional space over time [12][24].

Group 4: Reinforcement Learning Importance
- The article argues that the intelligent driving industry has yet to fully recognize the significance of reinforcement learning, which is crucial for developing long-term planning capabilities in autonomous systems [5][24].
- NIO holds that traditional imitation learning is insufficient for complex driving scenarios that require extended memory and decision-making [30][31].

Group 5: Data Systems and Training
- NIO has developed a three-tier data system to ensure the quality and relevance of training data, emphasizing real-world data over expert data for training models [34][36].
- The company combines game data with real-world driving data to improve the model's understanding of temporal dynamics and decision-making [25][26].

Group 6: Future Directions and Innovations
- NIO plans to implement open-set instruction interaction, allowing users to communicate with the vehicle in a more natural and flexible manner instead of through a limited command set [16][18].
- The company plans to release new versions of its systems that improve user interaction and safety features [19][20].