World Models
Why have world models had such a big impact on the industry?
自动驾驶之心· 2025-12-29 09:17
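The vision of world models is to understand and change the physical world. At its core, the field aims to lead a generative-AI paradigm for autonomous driving through continuous technical breakthroughs and to reshape the underlying capabilities of autonomous driving. In June 2025 Yann LeCun released V-JEPA 2, in August 2025 DeepMind released Genie 3, and in November 2025 Fei-Fei Li released Marble. In autonomous driving, exploration of world models has never stopped either. The most common direction is video generation, which is also the area most explored by academia and industry, with examples such as Wayve's GAIA-1/2/3 and UniScene, a CVPR'25 work from Shanghai Jiao Tong University. Next is OCC (occupancy) generation, with classics such as OccWorld and OccLLaMA, as well as II-World, the latest SOTA work from Xi'an Jiaotong University. Another area is LiDAR point cloud generation, or joint generation of vision and point clouds, for example LiDARGen and LiDARCrafter. Many companies build their own cloud-side or vehicle-side world models on top of these open-source algorithms, for long-tail data generation or for closed-loop simulation and evaluation. Some companies are also trying to use world models to directly strengthen on-vehicle driving capability. But the definition of a world model is still fuzzy: does generation = world model? Generation + reconstruction = world model. ...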
Media industry commentary: leading vendors keep entering world models; watch the application potential in film/TV and gaming
China Post Securities· 2025-12-29 08:44
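Securities research report: Media | Commentary report. Research Institute. Industry investment rating: Outperform | Maintained
Analyst: Wang Xiaoxuan (王晓萱); SAC registration No.: S1340522080005; Email: wangxiaoxuan@cnpsec.com

| Industry basics | |
| --- | --- |
| Closing level | 802.63 |
| 52-week high | 897.3 |
| 52-week low | 590.32 |

[Chart: industry performance relative to the index (relative value)]

⚫ Event review
On December 17, 2025, according to IT之家 (IT Home), Tencent officially released its Hunyuan World Model 1.5. Hunyuan World Model 1.5 open-sources, for the first time, what is described as the industry's most systematic and comprehensive real-time world model framework, covering the full pipeline of data, training, and streaming inference deployment. It also proposes algorithm modules including reconstituted memory, long-context distillation, and 3D-based reinforcement learning for autoregressive diffusion models.

⚫ Investment highlights
World models are an important direction in current AGI research, and the major vendors are all actively positioning themselves. A world model is a class of generative AI model that can simulate real-world environments and, based on multimodal inputs such as text, images, video, and motion, generate video and predict future states. Currently Googl ...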
What is the essence of world models and digital twins? How do they empower autonomous driving?
自动驾驶之心· 2025-12-29 01:07
Core Viewpoint
- The article discusses the essence of world models and digital twins in the context of autonomous driving, emphasizing their role in training perception models in virtual environments and applying them to real-world scenarios [5][6].

Group 1: World Models
- World models are defined as the ultimate goal of modeling the physical world, focusing on "spatiotemporal cognition" and requiring vast amounts of video data for training [7].
- The development of world models is shifting from simple visual dynamics simulation to creating immersive interactive environments that reflect real-world complexities [8].
- The core consensus among researchers is that the primary purpose of world models is to understand dynamic environments and predict future scenarios [7][9].

Group 2: Applications in Autonomous Driving
- In autonomous driving, world models must provide real-time perception of road conditions and accurately predict their evolution, focusing on immediate environmental awareness and complex trend forecasting [11].
- Key features of effective world models include physical consistency, multiscale spatiotemporal modeling, causal reasoning capabilities, and the ability to generate interactive environments [11].
- Various companies are implementing world models, such as NIO's NWM world model for simulation training, Xiaomi's ORION framework for integrating simulation tools, and Wayve's GAIA-1 for generative world modeling [17].

Group 3: Digital Twins
- Digital twins are defined as virtual representations of physical systems that allow for low-cost, high-efficiency research on key technologies and solutions in autonomous driving [19].
- The role of digital twins extends beyond mere observation; they participate in iterative processes to enhance real-world applications [19].
- Digital twins facilitate the modeling of physical world elements in virtual spaces, enabling further work on perception models and system iterations [20][21].

Group 4: Related Technologies
- Technologies such as 3D occupancy grids and point clouds are utilized to predict spatial occupancy and enhance scene understanding in autonomous driving [22].
- The integration of multimodal inputs, including visual and LiDAR data, is crucial for improving depth estimation and overall perception accuracy [92].
- The article highlights the importance of self-supervised learning techniques in enhancing the efficiency of 3D scene reconstruction and semantic labeling in autonomous driving applications [90][91].
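
To make the occupancy-grid idea in Group 4 more concrete, here is a minimal Python sketch of how a LiDAR-style point cloud might be binned into a sparse 3D occupancy grid. It only illustrates the data structure, not the occupancy prediction models the article refers to. The function name `voxelize`, the 0.5 m voxel size, and the sample points are all assumptions made for illustration.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float = 0.5,
             origin: Point = (0.0, 0.0, 0.0)) -> Set[Voxel]:
    """Return the set of voxel indices that contain at least one point."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    occupied: Set[Voxel] = set()
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct cell.
        occupied.add((
            math.floor((x - origin[0]) / voxel_size),
            math.floor((y - origin[1]) / voxel_size),
            math.floor((z - origin[2]) / voxel_size),
        ))
    return occupied


# Hypothetical sample points in metres (not taken from the article).
cloud = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.1, 0.0), (-0.1, 0.0, 0.0)]
grid = voxelize(cloud)
print(sorted(grid))
```

A sparse set of indices is used here instead of a dense array because most voxels in a driving scene are empty; a real pipeline would typically attach semantic labels or occupancy probabilities to each cell.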
After grinding away for almost half a year, a summary of what I have learned about world models
自动驾驶之心· 2025-12-27 02:07
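This article is for academic sharing only; if it infringes any rights, contact us to have it removed.
Author | cloud erow  Editor | 自动驾驶之心
Original link: https://zhuanlan.zhihu.com/p/1943329007706805619
After grinding away for almost half a year, here is a summary of what I have learned over this period.
What is a world model?
Note that a world model is not one specific model or paradigm. Several quite different research directions all call themselves world models, each more or less telling its own story, so readers need to distinguish them carefully when reading papers.
The popularity of the term owes much to Jürgen Schmidhuber's 2018 "World Models", which defines a world model as "a mental model of the world", that is, the mapping of the world inside the brain. More specifically:
The image of the world around us, which we carry in our head, is just a model. Nobody in his head imagines all the worl ...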
The sprint to L3 assisted driving: which path are automakers betting on?
汽车商业评论· 2025-12-26 23:04
Core Insights
- The article emphasizes the transition from L2 to L3 level autonomous driving, highlighting the importance of commercializing L3 by 2026, which represents a significant shift in responsibility from drivers to vehicle systems [5][37]
- The concept of "intelligent driving equity" is gaining traction, with more affordable models incorporating advanced driver-assistance systems (ADAS) [14][15]
- The evaluation of intelligent driving technologies is evolving, focusing on user experience and safety rather than merely ranking performance [9][24]

Group 1: Industry Trends
- The number of vehicles equipped with highway Navigation on Autopilot (NOA) has increased from 18 in 2024 to 29 in 2025, a growth of over 50%, with entry-level prices dropping below 100,000 yuan [15][16]
- Urban NOA functionality has expanded from 10 to 24 models, marking a 140% increase, with entry-level models now available around 150,000 yuan [15][16]
- The average takeover mileage (MPI) for intelligent driving has improved from 6.4 km to 12.1 km, indicating a nearly 100% increase in system reliability [17][19]

Group 2: Evaluation Methodology
- The evaluation framework for ADAS is based on Maslow's hierarchy of needs, prioritizing system performance, user comfort, and efficiency [24][26]
- The assessment includes both basic and challenging driving scenarios, with 80% of the evaluation focused on common driving conditions and 20% on complex situations [27][28]
- The testing route covered approximately 40 km, incorporating various driving challenges, including construction zones and parking scenarios, to assess the systems comprehensively [27][28]

Group 3: Key Findings and Innovations
- Leading brands such as Li Auto, Weipai, and NIO have demonstrated significant advancements in their ADAS capabilities, achieving an average of nearly 20 km before requiring driver intervention [29][31]
- Li Auto's VLA (Vision-Language-Action) model has introduced innovative features, such as understanding natural language commands for parking, enhancing user interaction with the system [33][40]
- The article highlights the importance of clear communication regarding system capabilities to users, suggesting that understanding what the system can and cannot do is crucial for future iterations [10][39]

Group 4: Future Directions
- The industry is moving towards a hybrid approach that combines end-to-end learning with rule-based systems to enhance understanding and responsiveness in complex driving scenarios [40][42]
- The debate over the reliance on high-definition maps is shifting towards a more balanced approach, emphasizing the importance of situational awareness and adaptability in driving systems [44][45]
- The article notes that the introduction of stricter regulations for ADAS is expected to impact the market, pushing for safer and more reliable systems [37][39]
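
The growth figures in Group 1 are simple relative changes and can be checked with a few lines of Python. The sketch below uses only the numbers quoted in the summary; the helper name `percent_increase` is an assumption for illustration. Worked through this way, going from 10 to 24 urban NOA models is a 140% increase, and the MPI change from 6.4 km to 12.1 km is about 89%.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative growth from `old` to `new`, expressed in percent."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


# Figures quoted in the summary above.
highway_noa = percent_increase(18, 29)   # models with highway NOA, 2024 -> 2025
urban_noa = percent_increase(10, 24)     # models with urban NOA
mpi = percent_increase(6.4, 12.1)        # average km driven per takeover (MPI)

for label, value in [("highway NOA", highway_noa),
                     ("urban NOA", urban_noa),
                     ("MPI", mpi)]:
    print(f"{label}: +{value:.1f}%")
```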
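Zhao Hejuan in conversation with Wang Weijia: there is no systemic AI bubble, and native AI applications will take off within three years | Barron's Picks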
Xin Lang Cai Jing· 2025-12-26 13:54
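Source: 钛媒体 (TMTPost)
On December 20, at TMTPost's 2025 T-EDGE Global Dialogue, "Zhao Hejuan Talk" (Jany Talk), hosted by the founder of TMTPost Group and publisher of Barron's China, held an in-depth conversation with Wang Weijia, a veteran Silicon Valley investor and entrepreneur.
Two years ago, when ChatGPT was sweeping the world, we discussed the future of AI in depth with Wang Weijia. Two years on, with Google Gemini 3 setting off a new round of technical competition, Wall Street starting to question an AI bubble, and Zuckerberg offering sky-high salaries to poach talent, we sat down again to cut through the noise and answer the questions that really matter:
What is the endgame of model competition? Which applications will land first? Where is the boundary between humans and machines? Over the next one to three years, which changes truly deserve attention?
Key points from the conversation:
1. OpenAI will not be knocked out easily; the future is a dynamic pattern of alternating leadership. As long as the companies use the same Transformer architecture and technical path, the gaps will not be insurmountable. The future will be continuous iteration of "you overtake me in six months, I overtake you six months later", and no single company will suddenly pull so far ahead that nobody can catch up.
2. The main challenge for NVIDIA right now is that the major tech companies are all starting to develop their own AI chips. If every company can eventually build chips that are cheaper, more efficient, and easier to use, NVIDIA will face the risk of being replaced. The more concentrated the cloud services market becomes, the worse it is for NVIDIA ...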
We have received many questions from students about choosing a direction in autonomous driving......
自动驾驶之心· 2025-12-26 09:18
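For students in automation and computer science, we recommend deep learning: VLA, end-to-end, and world models are all good directions, with plenty of room from getting started through to a job or even a PhD. Students in mechanical and vehicle engineering can start with traditional PnC and 3DGS, directions that need little compute and are easy to pick up. The rest is improving your methodology: read more papers, talk with more people, and gradually form your own thinking and ideas. For many new researchers, a good idea only comes after hitting many pitfalls. If you are new and do not know how to get started, take a look at the paper coaching we offer.
Paper coaching is now live!
Directions covered: end-to-end, VLA, world models, reinforcement learning, 3D object detection, multi-sensor fusion, 3DGS, BEV perception, Occupancy Network, multi-task learning, semantic segmentation, trajectory prediction, motion planning, diffusion models, Flow matching, point cloud perception, millimeter-wave radar, monocular perception, lane lines/online HD maps, and more.
If you have any need to publish a paper, we support consultations where you bring your own topic or research direction. Contact us on WeChat: paperguidance
Services offered: topic selection; full-process paper guidance; experiment guidance;
Recently we have received inquiries from many students, mostly from computer science, vehicle engineering, automation, and mechanical engineering. Let's first look at some autonomous driving ...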
Distillation, GEO, vibe coding: the "top ten AI jargon terms" of 2025. How many do you understand?
3 6 Ke· 2025-12-26 09:16
Core Insights
- The article discusses the rapid development of AI in 2025, highlighting ten key terms that reflect how AI is reshaping industries and society.

Group 1: AI Concepts
- Vibe Coding redefines programming by allowing developers to express goals in natural language, with AI generating the necessary code [2]
- Reasoning models have emerged as a core focus in AI discussions, enabling complex problem-solving through multi-step reasoning [3]
- World Models aim to enhance AI's understanding of real-world causality and physical laws, moving beyond mere language processing [4]

Group 2: Infrastructure and Investment
- The demand for AI has led to the construction of super data centers, exemplified by OpenAI's $500 billion "Stargate" project, raising concerns about energy consumption and local impacts [5]
- The AI sector is experiencing a capital influx, with companies like OpenAI and Anthropic seeing rising valuations, though many are still in the high-investment phase without stable profit models [6]

Group 3: AI Challenges and Trends
- The term "intelligent agents" is popular in AI marketing, but there is no consensus on what constitutes true intelligent behavior [7]
- Distillation technology allows smaller models to learn from larger ones, achieving high performance at lower costs [8]
- The concept of "AI garbage" reflects public concern over the quality and authenticity of AI-generated content [9]

Group 4: AI in Real-World Applications
- Physical intelligence remains a significant challenge for AI, as robots still require human intervention for complex tasks [10]
- The shift from traditional SEO to Generative Engine Optimization (GEO) indicates a change in how brands and content creators engage with AI-driven information retrieval [11]
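
As a rough illustration of the distillation item in Group 3, the following Python sketch computes a soft-label distillation loss: the student is penalized for diverging from the teacher's temperature-softened output distribution. The article does not say which distillation variant is meant, so the classic soft-target formulation is assumed here, and the function names, the temperature of 2.0, and the example logits are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)   # teacher soft targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.5]
print(f"distillation loss: {distillation_loss(student, teacher):.4f}")
```

In training, this term is usually combined with the ordinary cross-entropy loss on ground-truth labels; the higher temperature exposes the teacher's relative preferences among the wrong classes, which is the extra signal the smaller model learns from.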
What was the AI world talking about in 2025? The year's top ten AI buzzwords announced
3 6 Ke· 2025-12-26 07:33
Core Insights
- The development of AI in 2025 is marked by emerging concepts that are reshaping the industry landscape, as highlighted by MIT Technology Review, which identifies the top ten AI buzzwords of the year [1]

Group 1: Emerging Concepts in AI
- Vibe Coding redefines programming by allowing developers to express goals and logic in natural language, with AI generating the corresponding code [2]
- Reasoning models have gained prominence, enabling AI to tackle complex problems through multi-step reasoning, with major advancements from OpenAI and DeepSeek [3]
- World models aim to enhance AI's understanding of real-world causal relationships and physical laws, moving beyond mere language processing [4]

Group 2: Infrastructure and Economic Implications
- The demand for AI has led to the construction of super data centers, exemplified by OpenAI's $500 billion "Stargate" project, raising concerns about energy consumption and local community impacts [5]
- The AI sector is experiencing a capital influx, with companies like OpenAI and Anthropic seeing rising valuations, although many are still in the high-investment phase without stable profit models [6]

Group 3: Quality and Standards in AI
- The term "intelligent agents" is widely used in AI marketing, but there is no consensus on what constitutes true intelligent behavior, highlighting a lack of industry standards [7]
- Distillation technology allows smaller models to learn from larger ones, achieving high performance at lower costs, indicating that effective algorithms can drive AI advancements [8]

Group 4: Content Quality and User Interaction
- "AI garbage" refers to low-quality AI-generated content, reflecting public concerns about the authenticity and quality of information in the AI era [9]
- Physical intelligence remains a challenge for AI, as robots still require human intervention for complex tasks, indicating a long road ahead for AI to fully understand and adapt to the physical world [10]
- The shift from traditional SEO to Generative Engine Optimization (GEO) signifies a change in how brands and content creators engage with AI, emphasizing the importance of being referenced by AI in responses [11]
AI "world models" have arrived
财联社· 2025-12-26 03:15
Core Viewpoint
- The emergence of AI models capable of generating interactive 3D environments is set to disrupt the global video game industry, potentially reshaping a market valued at tens of billions of dollars [3][4].

Group 1: AI Impact on Gaming
- Leading AI teams, including Google DeepMind and World Labs, believe that "world models" will significantly transform the gaming industry [4].
- World Labs, co-founded by AI pioneer Fei-Fei Li, launched its first commercial product, Marble, which allows users to create coherent, high-fidelity 3D worlds from a single image, video, or text prompt [5].
- The technology is expected to disrupt existing game engines like Unity and Unreal, with experts predicting a fundamental change in software and game development in the coming years [8].

Group 2: Industry Growth and AI Integration
- According to Newzoo, the global gaming industry is projected to generate nearly $190 billion in revenue this year, with generative AI tools already being utilized for creating visual assets in games [9].
- AI has reportedly increased the development speed of games, with Game Gears' CEO stating that their game development pace has quadrupled due to AI [9].
- The integration of AI in gaming is exemplified by Epic Games' collaboration with Disney to introduce an AI-driven character in Fortnite, showcasing the potential for interactive non-player characters [10].

Group 3: Future of Game Development
- Experts predict that players will soon be able to create entirely new game worlds, reducing reliance on expensive software and specialized skills [13].
- The ability to create highly personalized games is becoming simpler, which could lead to a significant transformation in the gaming industry [14].
- While some critics express concerns about AI leading to job displacement and low-quality content, optimists believe AI can lower costs, enhance creativity, and alleviate developer burnout in a high-cost industry where top games often exceed $1 billion in development costs [15].