World Models
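Cai Xinying: Building a Changsha Innovation Hub Between the Data Wave and Floating Real-Image Displays | Profiles of Deputies and Committee Members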
Xin Lang Cai Jing· 2026-01-01 23:53
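Source: Changsha Evening News, 2026-01-02 07:19. Cai Xinying, member of the Changsha Municipal CPPCC Committee and chairman of Hunan Yunchang Network Technology Co., Ltd. By Jiang Zhibin, all-media reporter, Changsha Evening News. Amid Changsha's ceaseless pulse of innovation, Cai Xinying has always been a distinctive "two-sided" observer and builder. As both a municipal CPPCC member and the head of a network technology company, he works deep in the industrial practice of the digital economy, sensing the subtlest shifts in technology, while also using the broad platform of political consultation to offer proposals for building Changsha into a global R&D center city. Cai's proposals have consistently combined a strong "sense of the future" with "practicality." When the AI wave was only beginning to swell, his gaze had already moved past the noise and fixed on the foundation that AI grows on: data as a factor of production. After solid field research deep inside the industry, he took the lead in arguing systematically: "Changsha should not only join the race in AI applications; it should also seize the supply side of AI's 'upstream fuel.' Our rich application scenarios, high-caliber talent pool, and nascent data-labeling industry are exactly the unique advantages needed to build a high-quality 'data fuel' base." This insight did not stay on paper: its core idea closely matches the industrial plans on data factors and embodied intelligence that the province and city later adopted, and in practice it has helped the related industrial clusters sprout and grow. Cai's vision is not confined to a single technology track. At the crossroads of technology and culture, he works to uncover the brilliance of integrated innovation. This year, he anchors his thinking ...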
Practitioners Speak | Hu Luhui: Injecting World Models into the Body to Define the New Productivity of Embodied Intelligence
机器人大讲堂· 2026-01-01 04:06
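On December 18-19, the 6th China Robot Industry Annual Conference was held in Hangzhou. Bringing together more than 2,000 industry experts, entrepreneurs, and practitioners, this annual event has become a high-level forum for decoding the technological and commercial future of robotics. 机器人大讲堂 is publishing a series of in-depth reports that collect the core insights of leading experts and well-known companies at the conference and explore how Chinese robotics can break through in the era of embodied intelligence. 智澄AI is a frontier technology company building general-purpose embodied-intelligence robots, focused on artificial general intelligence and robotics. Its team brings together people from global companies such as Meta, Microsoft, Amazon, and Huawei and from top universities such as Tsinghua and CMU. Hu Luhui began with a concise review of AI's accelerating evolution. "Real artificial intelligence has not actually been around for long," he said: from AlexNet in 2012 to AlphaGo, and then to the Transformer-based ChatGPT, the pace of iteration has shrunk from years to quarters or even months. In his view, the core of the AI 1.0 era was vertical applications, whereas we are now in the AI 2.0 era, whose essential difference is generality and generalization. "What our company is doing this time is working out how to move artificial general intelligence from digital space into the physical world." In this issue: [Hu Luhui] World models activate embodied intelligence. Hu Luhui, founder & CEO of 智澄AI. In 2025, as the embodied-intelligence race moves from technology ...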
Why Is NIO Betting on World Models?
自动驾驶之心· 2025-12-31 06:27
Core Insights
- The article discusses the recent promotion of NIO's NWM 2.0, highlighting its positive reception and the potential of world models in intelligent driving [1]
- It emphasizes that the true limit of intelligent driving lies in world models, which utilize video as a core component to understand spatiotemporal and physical laws, enabling machines to comprehend environments like humans do [1]

Group 1: World Model Concept
- World models address spatiotemporal cognition, while language models focus on conceptual cognition, with the former being more effective in modeling the real world's four-dimensional space-time [1]
- The article mentions that many AI giants are developing general world models, including projects like Fei-Fei Li's Marble, Yann LeCun's V-JEPA 2, and DeepMind's Genie 3 [1]

Group 2: Challenges in Understanding World Models
- The definition of world models remains vague, leading to confusion among newcomers in the field, who often spend significant time navigating challenges without clear guidance [1]
- The article notes that understanding world models and completing tasks like data generation and closed-loop simulation can be particularly difficult for beginners [1]

Group 3: Course Overview
- A course is being offered to help individuals understand the world model domain in autonomous driving, featuring insights from industry algorithm experts [2][6]
- The course will cover various aspects of world models, including their historical development, application cases, and different schools of thought within the field [6][10]

Group 4: Course Structure
- The course consists of six chapters, starting with an introduction to world models and their connection to end-to-end autonomous driving [6]
- Subsequent chapters will delve into background knowledge, discussions on general world models, video generation-based models, OCC generation models, and industry applications [6][8][9][10]

Group 5: Expected Outcomes
- The course aims to equip participants with the skills to reach a level comparable to a world model autonomous driving algorithm engineer within a year [14]
- Participants will gain a deeper understanding of key technologies such as BEV perception, multimodal large models, and generative models, enabling them to apply their knowledge in practical projects [14]
The Great Compute Shift in China's Intelligent Driving Industry
36 Ke· 2025-12-30 10:36
Core Insights
- In 2025, the Chinese smart driving industry is experiencing an unprecedented shift in computing power, driven by the evolution of software algorithms and the emergence of competing technical paradigms [1][2]
- The differentiation in high-level intelligent driving commercial applications is evident, with a K-shaped market split between affordable and high-end models, leading to fragmentation in the industry [2]
- The demand for computing power is increasingly recognized as a core element in the development of smart driving technologies, both at the vehicle and cloud levels [2]

Group 1: Technological Evolution
- The transition to an end-to-end framework in smart driving is marked by significant advancements, as seen in Tesla's FSD Beta V12 software, which utilizes a computing power standard of 144 TOPS [3][4]
- Tesla's shift from HW3 to HW4 signifies a major milestone in its autonomous driving evolution, with the latter becoming the preferred platform for future software updates [5][6]
- The upcoming FSD V14 version is expected to have ten times the parameters of its predecessor, indicating a substantial leap in the vehicle's ability to process complex environmental information [6]

Group 2: Market Dynamics
- Chinese smart driving players, including Xpeng, Li Auto, and NIO, are adopting end-to-end strategies but are initially relying on existing computing platforms, primarily NVIDIA's Orin-X [7][12]
- By 2025, a clear division among smart driving companies has emerged, categorized into three main factions based on their computing power strategies: self-developed chips, NVIDIA-based solutions, and Huawei's offerings [12][13]
- The self-developed chip faction includes NIO's NX9031 and Xpeng's Turing AI chip, while the NVIDIA faction is represented by the latest Thor platform, which is gaining traction in various models [13][14]

Group 3: Cloud Computing and Future Prospects
- The industry is witnessing a race for cloud computing power, which is essential for the evolution of smart driving algorithms and the transition from L2 to L4 capabilities [19][20]
- The reliance on cloud computing is becoming increasingly critical, as it supports data processing, model training, and simulation necessary for addressing complex driving scenarios [23][24]
- The ongoing competition for cloud resources is expected to intensify, with companies recognizing that enhanced cloud capabilities are vital for future advancements in autonomous driving technology [20][21]
ZTE's Cui Li: AI Applications Reach the Industry's Deep Waters, and the Value Loop Nears Completion
21 Shi Ji Jing Ji Bao Dao· 2025-12-30 10:25
Core Insights
- The rapid development of AI large models is becoming a key factor in the new round of technological competition, with a belief that the number of foundational large models will converge to a single-digit figure, while numerous specialized models and applications will emerge across various industries [2]
- Physical AI is highlighted as a significant area of focus, accelerating advancements in fields like embodied intelligence and autonomous driving, which are expected to profoundly change societal operations [2][3]
- The transition from generative models to world models and vision-language-action (VLA) models represents a paradigm shift in AI, moving from mere prediction to simulation and physical alignment [3][4]

Industry Trends
- The emergence of Sora has sparked discussions about world models, indicating a shift in AI capabilities from being mere predictors to becoming simulators [3]
- The divergence in world model approaches has led to the classification of models into "generative" and "representational" camps, with each having distinct applications and strengths [4][5]
- The integration of VLA and world models is seen as a trend, with VLA focusing on sequence modeling for robot control and world models emphasizing internal environmental modeling for efficient learning [5]

Challenges and Solutions
- Three major challenges remain for world models: understanding causality, building effective simulators, and addressing data scarcity issues [6]
- The competition for high-quality synthetic data is crucial for the next phase of AI development, particularly in data-driven AI applications like autonomous driving [6]
- The timeline for the realization of world models is projected to span from 2024-2025 for visual simulation to 2028-2030 for general embodied intelligence [6]

Technological Evolution
- The network architecture is evolving from "cloud-native" to "AI-native," necessitating a focus on performance and collaboration between computing and networking [7]
- ZTE has been progressively advancing its hardware and software integration from 2G to 5G, now incorporating large models into its development paradigm [8]
- The integration of AI into core business processes is expected to transform industries, with a shift from content generation to autonomous action [9]

Implementation and Applications
- ZTE's "Co-Sight Intelligent Agent Factory" aims to enhance reasoning capabilities and ensure decision-making reliability through advanced verification mechanisms [11][12]
- The successful application of AI requires a combination of robust infrastructure, effective methodologies, and deep industry engagement [17]
- Industries such as education, healthcare, software development, and smart manufacturing are identified as likely candidates for early AI value realization due to their structured data environments and feedback mechanisms [14][13]

Future Directions
- The hybrid approach of "cloud-edge collaboration" is recommended for integrating general foundational models with industry-specific enhancements [15]
- The need for specialized models in non-natural language data scenarios is emphasized, particularly in high-stakes environments like finance [16]
- The overarching narrative of AI is shifting towards practical applications in various sectors, moving away from mere technological showcases to tangible value creation [18]
Now Open! Understand the Autonomous Driving World Model Tech Stack in Three Months
自动驾驶之心· 2025-12-30 09:20
Core Insights
- The article discusses the vision of world models in understanding and transforming the physical world, emphasizing the role of continuous technological breakthroughs in generative AI for autonomous driving [2]
- It highlights the ongoing exploration of world models in the autonomous driving sector, particularly in video generation and OCC generation [2][3]
- The article addresses the challenges faced by newcomers in grasping the concept of world models and the complexities involved in data generation and closed-loop simulation [4][5]

Summary by Sections

Introduction to World Models
- The first chapter provides an overview of world models and their connection to end-to-end autonomous driving, detailing the historical development and current applications [12]
- It categorizes different types of world models, including purely simulated models and those that integrate planning and sensory input generation [12]

Background Knowledge
- The second chapter covers foundational knowledge related to world models, including scene representation and technologies like Transformer and BEV perception [13]
- This chapter is crucial for understanding the technical vocabulary frequently encountered in job interviews related to world models [13]

General World Model Discussion
- The third chapter focuses on general world models and recent advancements in autonomous driving, discussing notable models such as Marble, Genie 3, and VLA+ algorithms [14]

Video Generation-Based World Models
- The fourth chapter delves into video generation algorithms, highlighting significant works like GAIA-1 & GAIA-2 and recent advancements in the field [15]

OCC-Based World Models
- The fifth chapter centers on OCC generation algorithms, discussing three major papers and a practical project that extends to vehicle trajectory planning [16]

World Model Job Topics
- The sixth chapter shares practical insights from industry experience, addressing the application of world models in the industry, common pain points, and interview preparation [17]

Course Overview
- The course aims to provide a comprehensive understanding of end-to-end autonomous driving, with a focus on world models, and is designed for individuals looking to enter the autonomous driving industry [17][20]
- It includes detailed discussions on key technologies and methodologies, ensuring participants can apply their knowledge in real-world projects [20]

Course Schedule
- The course is set to begin on January 1, with a duration of approximately two and a half months, featuring offline video lectures and online Q&A sessions [21][22]
Why Have World Models Had Such a Big Impact on the Industry?
自动驾驶之心· 2025-12-29 09:17
Core Insights
- The article emphasizes the vision of world models in understanding and transforming the physical world, focusing on the continuous technological breakthroughs that lead to generative AI in autonomous driving [2]

Group 1: World Model Exploration
- Various companies are building their cloud and vehicle-based world models using open-source algorithms for long-tail data generation and closed-loop simulation/evaluation [4]
- The exploration of world models in autonomous driving includes video generation, OCC generation, and LiDAR point cloud generation, with notable works from Wayve, OccWorld, and others [3][4]

Group 2: Challenges in Understanding World Models
- The definition of world models remains ambiguous, leading to confusion among newcomers in the field [5]
- Many beginners struggle to grasp the concepts of data generation and closed-loop simulation, often feeling lost despite extensive efforts [6]

Group 3: Course Offering
- The article introduces a course on world models in autonomous driving, developed in collaboration with industry algorithm experts, aimed at helping learners understand the field from theory to practice [6][8]
- The course covers various chapters, including an introduction to world models, background knowledge, discussions on general world models, and practical applications in video and OCC generation [11][12][13][14]

Group 4: Course Structure and Content
- The course is structured into six chapters, each focusing on different aspects of world models, including their historical development, technical stacks, and industry applications [11][12][13][14][15]
- The course aims to equip participants with the necessary skills to understand and implement world models in autonomous driving, preparing them for job interviews and practical applications [16][19]
Media Industry Commentary: Leading Players Keep Entering World Models; Watch the Application Potential in Film, TV, and Gaming
China Post Securities· 2025-12-29 08:44
Industry Investment Rating
- The industry investment rating is "Outperform the Market" and is maintained [1]

Core Insights
- The report highlights the continuous entry of leading companies into the world model space, with a focus on the film and gaming sectors [3]
- The world model is identified as a significant direction in AGI research, with major companies actively investing in this area [4]
- The capabilities of world models are expected to evolve, providing ongoing empowerment to the film and gaming industries [5]

Summary by Relevant Sections

Industry Overview
- The closing index level is 802.63, with a 52-week high of 897.3 and a low of 590.32 [1]

Investment Highlights
- Major companies like Google, Runway, and ByteDance are developing world models that simulate real-world environments and generate content based on multimodal inputs [4]
- Google's latest model, Genie 3, can generate dynamic worlds based on text prompts, while Runway has released GWM-1, which includes variants for environment exploration, character dialogue, and robotics [4]
- ByteDance has established a team focused on multimodal interaction and world models, recently launching the 3D generation model Seed3D 1.0 [4]

Future Potential
- In the film sector, world models aim to enhance video generation by creating physically accurate virtual environments, which could lead to advancements in long video production and complex storytelling [5]
- In gaming, the three-dimensional world generation and interactivity of world models align well with game development processes, with companies like Tencent and xAI exploring these capabilities [5]

Investment Recommendations
- Companies to watch include Kunlun Wanwei for world model development, Huace Film & TV, Light Media, and Hengdian Film for AI in film production, and Perfect World and Giant Network for large-scale 3D game development [6]
What Is the Essence of World Models and Digital Twins? How Do They Empower Autonomous Driving?
自动驾驶之心· 2025-12-29 01:07
Core Viewpoint
- The article discusses the essence of world models and digital twins in the context of autonomous driving, emphasizing their role in training perception models in virtual environments and applying them to real-world scenarios [5][6].

Group 1: World Models
- World models are defined as the ultimate goal of modeling the physical world, focusing on "spatiotemporal cognition" and requiring vast amounts of video data for training [7].
- The development of world models is shifting from simple visual dynamics simulation to creating immersive interactive environments that reflect real-world complexities [8].
- The core consensus among researchers is that the primary purpose of world models is to understand dynamic environments and predict future scenarios [7][9].

Group 2: Applications in Autonomous Driving
- In autonomous driving, world models must provide real-time perception of road conditions and accurately predict their evolution, focusing on immediate environmental awareness and complex trend forecasting [11].
- Key features of effective world models include physical consistency, multiscale spatiotemporal modeling, causal reasoning capabilities, and the ability to generate interactive environments [11].
- Various companies are implementing world models, such as NIO's NWM world model for simulation training, Xiaomi's ORION framework for integrating simulation tools, and Wayve's GAIA-1 for generative world modeling [17].

Group 3: Digital Twins
- Digital twins are defined as virtual representations of physical systems that allow for low-cost, high-efficiency research on key technologies and solutions in autonomous driving [19].
- The role of digital twins extends beyond mere observation; they participate in iterative processes to enhance real-world applications [19].
- Digital twins facilitate the modeling of physical world elements in virtual spaces, enabling further work on perception models and system iterations [20][21].

Group 4: Related Technologies
- Technologies such as 3D occupancy grids and point clouds are utilized to predict spatial occupancy and enhance scene understanding in autonomous driving [22].
- The integration of multimodal inputs, including visual and LiDAR data, is crucial for improving depth estimation and overall perception accuracy [92].
- The article highlights the importance of self-supervised learning techniques in enhancing the efficiency of 3D scene reconstruction and semantic labeling in autonomous driving applications [90][91].
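To make the occupancy-grid idea mentioned in this summary more concrete, here is a minimal sketch of how a point cloud might be turned into a 3D occupancy grid. It is an illustration only, not the method of any article or company listed here: the function name `voxelize`, the grid origin, the 0.5 m voxel size, and the sample points are all assumptions chosen for the example, and production systems typically predict occupancy with learned models rather than by direct binning.

```python
from math import floor

def voxelize(points, origin, voxel_size, grid_shape):
    """Return the set of (i, j, k) voxel indices that contain at least one point.

    All parameters are hypothetical illustration values: `origin` is the
    grid's minimum corner, `voxel_size` the edge length of a cubic voxel,
    and `grid_shape` the number of voxels along x, y, and z.
    """
    nx, ny, nz = grid_shape
    occupied = set()
    for x, y, z in points:
        # Map each coordinate to an integer voxel index.
        i = floor((x - origin[0]) / voxel_size)
        j = floor((y - origin[1]) / voxel_size)
        k = floor((z - origin[2]) / voxel_size)
        # Ignore points that fall outside the grid bounds.
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            occupied.add((i, j, k))
    return occupied

# Made-up sample points (metres); the last one lies outside the grid.
points = [(0.2, 0.1, 0.0), (0.3, 0.4, 0.1), (1.6, 0.2, 0.0), (9.0, 9.0, 9.0)]
grid = voxelize(points, origin=(0.0, 0.0, 0.0), voxel_size=0.5, grid_shape=(4, 4, 2))
print(sorted(grid))  # [(0, 0, 0), (3, 0, 0)]
```

A sparse set of occupied indices is used here instead of a dense array because most voxels in a driving scene are empty; a dense tensor would be the more natural choice when the grid feeds a neural network.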
After Slogging Away for Nearly Half a Year, a Summary of What I've Learned About World Models
自动驾驶之心· 2025-12-27 02:07
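Author | cloud erow Editor | 自动驾驶之心 Original link: https://zhuanlan.zhihu.com/p/1943329007706805619 After slogging away for nearly half a year, here is a summary of what I have learned over this period. What is a world model? It is worth noting that a world model is not one specific model or paradigm. In practice, several different research directions all call themselves world models, each more or less on its own terms, so readers need to distinguish carefully between them when reading papers. The popularity of the term owes much to Jurgen's 2018 work on world models, which defines a world model as "a mental model of the world", that is, the world's mapping inside the brain. More specifically: The image of the world around us, which we carry in our head, is just a model. Nobody in his head imagines all the worl ...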