World Models
With 2026 approaching, have world models actually become more "world"?
机器之心· 2025-12-13 02:30
Core Viewpoint
- Runway's recent launch of GWM Worlds and GWM Robotics pushes video generation towards an interactive "world simulation" paradigm. It has reignited discussion of what a "world model" is and covers: an interface for creation and interaction, a simulator for training and evaluation, or a cognitive framework for reasoning and decision-making [1].

Group 1: Evolution of World Models
- Over the past two years, world models have come to be treated as on par with LLMs in the AGI landscape, moving from a narrow definition rooted in reinforcement learning to a broader one that includes generative modeling [4].
- Initially, a world model was the agent's internal model of its environment: it predicts future states from the current state and action, so the agent can simulate internally before deciding [5].
- From an engineering perspective, a world model combines three capabilities: compressing high-dimensional perception into usable representations, predicting future states over time, and using those predictions for planning and decision-making [6].
- By 2024, the term had expanded to cover general modeling of how the world evolves, following a trend from language generation to image generation and ultimately to 3D and world generation [6].
- The boundaries of the concept have become blurrier, with ongoing debates about the nature of representations, how physical laws are incorporated, and how input relationships are organized [6].

Group 2: Industry Layout and Trends
- Major companies are investing in world models, which raises the question of whether they are strengthening their "data engines" or building new frameworks for "spatiotemporal cognition" [3].
- In February 2024, OpenAI described video generation models such as Sora as "world simulators," emphasizing their ability to learn the three-dimensional structure and physical laws of the real world [6].
- Around the same time, LeCun introduced V-JEPA, which predicts masked video segments in an abstract representation space and gains training efficiency by discarding unpredictable information [6].
- The discussion has shifted from whether to build world models to how to build them: abstract upwards from the pixel level, or operate directly in abstract spaces [7].
- Existing approaches are recognized as capturing physical laws only partially. A coherent world model would need representations of isolated objects and a priori laws of change across space and time [7].

Group 3: Definition and Ambiguity of World Models
- By 2025, world models are positioned alongside LLMs. Companies such as Google DeepMind, Meta, and Nvidia are shifting focus from pure LLMs to world models, aiming for "Physical AI + superintelligence" as LLM progress stagnates [8].
- What separates world models from existing generative AI is their goal: to construct internal representations of an environment, including its physical, temporal, and spatial dimensions, for planning and decision-making [9].
- The term "world model" has become ambiguous. It may refer to latent states inside a system, to game-like simulators for training agents, or to any content pipeline that can generate navigable 3D scenes [9].
- A November 2025 analysis from Entropy Town grouped world models into three technical routes (interface, simulator, and cognitive framework), which underlines how unsettled the term remains [9].
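The three-part engineering definition summarized above (compress perception into a representation, predict future states, plan with the predictions) can be made concrete with a toy sketch. This is only an illustration under stated assumptions, not any vendor's method: the 1-D grid environment, the one-hot observation, and the function names `encode`, `predict`, and `plan` are all hypothetical, and the "learned" model is replaced by a hand-written transition rule.

```python
from itertools import product

# Hypothetical toy setup: an agent on a 1-D grid can step left, stay, or step right.
ACTIONS = (-1, 0, 1)


def encode(observation):
    """Compress a one-hot observation into a compact latent state (the hot index)."""
    return observation.index(1)


def predict(state, action, size):
    """Predict the next latent state for an action, clamped to the grid bounds."""
    return max(0, min(size - 1, state + action))


def plan(state, goal, size, horizon=3):
    """Return the first action of the imagined rollout that ends closest to the goal.

    Every action sequence of length `horizon` is simulated inside the model;
    the real environment is never touched during planning.
    """
    best_first_action, best_cost = 0, float("inf")
    for sequence in product(ACTIONS, repeat=horizon):
        imagined = state
        for action in sequence:
            imagined = predict(imagined, action, size)
        cost = abs(goal - imagined)
        if cost < best_cost:  # strict comparison keeps the first best sequence
            best_first_action, best_cost = sequence[0], cost
    return best_first_action


observation = [0, 0, 1, 0, 0, 0, 0, 0]  # agent sits in cell 2 of an 8-cell grid
state = encode(observation)
first_action = plan(state, goal=6, size=len(observation))
```

Exhaustive search is used only because the action space is tiny. A real system would learn `encode` and `predict` from data and replace the enumeration with sampling or gradient-based planning.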
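GAIR 2025 officially opens: as the AI transformation reaches the deep sea of industry, how do we break through the dark to find the light?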
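雷峰网· 2025-12-12 02:49
"In the tides of models and compute, sparks of intelligence are converging into an industrial wave. Watch how AI reshapes the many landscapes of the industrial ecosystem."

Author: Xu Xiaofei | Editor: Bao Yonggang

On December 12, Shenzhen, like countless cities around the world, lay waiting on the eve of the intelligent industry's breakout, and a gathering of frontier ideas was breaking ground there.

At a key juncture, with large-model technology moving deep into "industrial transformation", the 8th GAIR Global Artificial Intelligence and Robotics Conference officially opened at the Shenzhen Bolin Tianrui Sheraton Hotel.

The conference hosts four themed forums and two closed-door sessions, focusing on innovation in large models, AI compute, world models, data & "one brain, many forms", AI hardware, and related fields.

This is GAIR's eighth year, and another occasion for China's AI experts across industry, academia, research, and investment to exchange ideas and recalibrate direction amid the current technological shift.

An old tale tells of seeking the pearl beneath the black dragon's chin: one must carry a torch into the deep sea to see the treasure.

The same holds for today's transformation of the large-model industry.

The large-model wave has moved from the "technical breakthrough" stage of a few years ago into a stage of "deep value cultivation". It increasingly resembles that treasure under the dragon's chin, out of reach for anyone who stays in shallow water.

GAIR, which began in 2016, is like that torch. Over eight years it has passed on the flame and gathered the thinking of forward-looking scholars and industry pioneers, illuminating both the hard early work of AI practitioners worldwide and the sweeping journey of the intelligent era from first shoots to full growth.

To date, the GAIR conference ...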
SenseTime AI forum explores future paradigms of intelligence as visual AI enters a second growth curve
Zheng Quan Shi Bao Wang· 2025-12-11 14:03
Group 1
- The "2025 SenseTime Technology AI Forum" was successfully held, focusing on key topics such as breakthroughs in multimodal large models, embodied intelligence, and industrial intelligence upgrades [1]
- SenseTime's CEO Xu Li emphasized that the past decade has seen rapid changes in AI cognition, marking a significant technological wave that is reshaping work across industries [1]
- SenseTime aims to leverage Hong Kong's favorable innovation and technology environment to connect national AI strategies with global innovation networks, serving both local and international markets [1]

Group 2
- SenseTime's co-founder and Chief Scientist Lin Dahua highlighted challenges in AI industrialization, including reliability, professional data, spatial understanding, and cost [1]
- SenseTime is innovating through foundational technologies such as native multimodal fusion architecture and high-efficiency reasoning systems to enhance spatial cognition and real-time interaction capabilities [1]
- The forum also discussed how AI can drive deep paradigm shifts in enterprises, with SenseTime's Asia-Pacific business serving nearly 500 clients, 70% of whom maintain long-term partnerships [2]

Group 3
- Wang Xiaogang, co-founder of SenseTime and Chairman of Daxiao Robotics, announced the upcoming launch of Daxiao Robotics on December 18, introducing leading technologies and the first domestic open-source "KAIWU" world model 3.0 [2]
- The forum emphasized the integration of "model-hardware-scene" ecosystems to promote breakthroughs in embodied intelligence across various applications, including industrial manufacturing and home companionship [2]
- SenseTime's Visual AI 2.0, empowered by large language models, transforms real-time video analysis into actionable solutions, marking a new growth phase for visual AI [2]
Roles reversed: Meta copies Alibaba Qwen's homework, without asking for authorization
36Kr· 2025-12-11 11:51
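Alibaba's Qwen is reportedly an open-source model that anyone can download on their own, with no authorization required. According to official figures, Qwen models have so far passed 700 million cumulative downloads worldwide.

Meta, once the open-source leader, brings in Alibaba's Qwen

This year, after Llama 4 performed poorly at launch, the gap between Meta and competitors such as OpenAI and Google gradually widened. The Avocado project emerged against that backdrop.

As Meta's next-generation flagship large model, "Avocado" is seen as Meta's "lifeline" in the AI arms race. It targets GPT-5-level performance and is planned for release in the first quarter of 2026.

At the time, overseas users made a meme of it: the pit of the avocado is a whale (DeepSeek).

When the avocado was finally cut open, though, the pit turned out to be Qwen, not DeepSeek.

On December 10, it was reported that Meta had brought Alibaba's Tongyi Qianwen (Qwen) model into the development of its next-generation model "Avocado", using it to fine-tune and optimize the new model.

科技每日推送 learned exclusively from Alibaba Cloud that Meta did not ask Alibaba for authorization beforehand, and that Alibaba itself only found out last night.

Just two years ago, Meta's Llama held absolute dominance in the global open-source community, and many Chinese large models were suspected of being Llama wrappers. Few would have guessed that a few years on, with Chinese open-source models on the rise, Meta would be the one actually doing the wrapping.

Meta has also turned its back on its original stance and abandoned the open-source route.

The Avocado large ...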
AD and assisted driving in 2025: regulators hit the brakes, technology races ahead, and the "地大华魔" four contend for supremacy
36Kr· 2025-12-11 09:55
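Image source: Great Wall Motor

Standing at the tail end of 2025 and looking back on the year, we can see that the new carmakers who once won fans with "zero takeover" myths have had to turn the science fiction on their slides into documentary. The supply chain that once sprinted through a compute arms race has also begun learning to work deeply within limits.

The gloss is fading and the substance is showing. After the crackdown on false advertising in intelligent driving, where does the industry stand now?

After the official demystification, assisted-driving technology advanced faster

To understand this slowdown, we need to rewind to the spring of 2025.

On April 16, the First Department of Equipment Industry at the Ministry of Industry and Information Technology (MIIT) held a meeting on advancing access management for intelligent connected vehicles, and a single ban choked off the industry's years-long habit of boasting. The meeting required that companies "must not engage in exaggerated or false advertising and must strictly fulfil their obligation to inform", and that "combined driving assistance" be used as the official term.

2025 is nearly over. The auto industry had many keywords this year, and one of them was "autonomous driving hits the brakes".

This spring, a single MIIT document made "autonomous driving" a banned term and punctured the "L2.999" word games in carmakers' marketing. China's intelligent driving industry was forced to sober up from a three-year technology party. Safety finally outweighed speed, and responsibility replaced gimmickry.

The change was immediate. First came a sharp shift in how carmakers word their marketing. 电车通 checked the official websites of mainstream domestic carmakers at random and found that the term "autonomous driving" appears far less often and has essentially vanished, replaced by "assisted driving", "intelligent driving assistance" ...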
The paper window for autonomous-driving world models will not stay open much longer......
自动驾驶之心· 2025-12-11 00:05
Core Insights
- The article highlights the recent surge in research papers related to world models in autonomous driving, indicating a trend towards localized breakthroughs and verifiable improvements in the field [1]
- It emphasizes the importance of refining submissions to top conferences, suggesting that the final 10% of polishing can significantly impact the overall quality and acceptance of the paper [2]
- The platform "Autonomous Driving Heart" is presented as a leading AI technology media outlet in China, with a strong focus on autonomous driving and related interdisciplinary fields [3]

Summary by Sections

Research Trends
- Numerous recent works in autonomous driving, such as MindDrive and SparseWorld-TC, reflect a focus on world models, which are expected to dominate upcoming conferences [1]
- The article suggests that the main themes for the end of this year and the first half of next year will likely revolve around world models, indicating a strategic direction for researchers [1]

Guidance and Support
- The platform offers personalized guidance for students, helping them navigate the complexities of research and paper submission processes [7][13]
- It claims a high success rate, with a 96% acceptance rate for students who have received guidance over the past three years [5]

Faculty and Resources
- The platform boasts over 300 dedicated instructors from top global universities, ensuring high-quality mentorship for students [5]
- The instructors have extensive experience in publishing at top-tier conferences and journals, providing students with valuable insights and support [5]

Services Offered
- The article outlines various services, including personalized paper guidance, real-time interaction with mentors, and comprehensive support throughout the research process [13]
- It also mentions the potential for students to receive recommendations from prestigious institutions and direct job placements in leading tech companies [19]
Chinese AI charts a differentiated, pragmatic path
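Zhong Guo Qing Nian Bao· 2025-12-10 07:28
"The US is betting on AGI with its capital-market advantage, while China looks for opportunities in cost-effectiveness and industrial applications."

"If large models cannot achieve AGI (artificial general intelligence), then the massive compute spending of US large-model companies most likely will not pay off in the short term. This may be the biggest bubble right now." That judgment, from Wei Fanjie, general manager of the Shanghai Future Industry Fund and chairman of 上海未来启点社区, goes to the heart of the current controversy in AI.

After more than three years of "frenzied" global AI investment, talk of a "bubble" has recently been rising across the US. On November 9, Goldman Sachs published a research report saying that the AI sector shows five warning signs similar to those seen before the dot-com bubble burst, and that the bubble may be even larger this time. Justin Yifu Lin, dean of the Institute of New Structural Economics at Peking University, said recently at the 10th Fudan Chief Economist Forum that during the "15th Five-Year Plan" period the US is likely to see an AI bubble burst, and that, like the 2008 US housing bubble, it could bring a financial crisis to the US and even an economic crisis to the whole world.

On November 29, in the FIT Building at Tsinghua University, the 上海未来启点社区 sub-forum of the 2025 China Artificial Intelligence Conference and National AI School Deans (Department Heads) Annual Meeting held a seminar titled "Realm of Awakening: The Next-Generation Foundational Equations of AI". More than 30 professionals from academia, industry, and investment outlined, from several angles, a picture of Chinese AI "squeezing out bubbles, building core strength, and delivering real results": not betting on nebulous concepts, but focusing on foundational innovation, industrial fit, and better cost-effectiveness ...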
Make sense of where China's AI is heading in 2025! Companies × products × people × solutions: everything most worth watching is here
量子位· 2025-12-10 04:26
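Organizing Committee, reporting from the MEET2026 conference. 量子位 | WeChat official account QbitAI

If the AI story of 2025 is viewed as a river of time, its loudest thunderclap came right at the start of the year:

DeepSeek-R1 burst onto the scene in January and set the main theme for the year. Just into December, DeepSeek open-sourced the V3.2 series. One release at the start and one at the end bracket the whole year's technical narrative.

The main storyline for AI models in 2025 unfolded as a two-track race between open source and performance. Open-source models and closed-source flagships chased each other, pushing the frontier beyond parameter count to inference efficiency, training paradigms, and cost structure. Meanwhile, world models went from a concept in papers to real products and a battleground for companies, with Fei-Fei Li and Yann LeCun each betting on their own route and pointing the "road to AGI" towards the contest over world models.

Embodied robots and the models they carry iterated explosively, and other AI devices (AI toys, AI phones, AI PCs, smart cockpits, and more) rolled out across the board, becoming the best testing ground for bringing AI capabilities into the real world.

Awards were presented in all five categories: 2025 AI Leading Enterprise of the Year, 2025 AI Promising Startup of the Year, 2025 AI Outstanding Product of the Year, 2025 AI Outstanding Solution of the Year, and 2025 AI Person in Focus of the Year.

The selection began in September this year. Over the past three months, hundreds of companies, institutions, and individuals applied to take part. ...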
An Xiangjing: embodied mobility at the driverless end point is a new track full of imagination
Xin Lang Cai Jing· 2025-12-10 02:37
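On December 8-9, during the 2025 Horizon Technology Ecosystem Conference, An Xiangjing, CEO of 行深智能, visited Sina Auto's executive interview studio. He said that the future is no longer about delivering individual items one at a time, but about managing a platform for spatial transfer, turning logistics into spatial transfer. Such a platform can serve different logistics companies and the broad urban distribution system covering express delivery, fresh produce, tobacco, pre-made meals, and more. Going further, it can serve sanitation, security, and even gas-leak inspection patrols, among other applications. All end-point mobility applications, or embodied mobility applications, can be covered and empowered by driverless capability. That, he said, is a space and a track with a great deal of room for imagination.

The following is the interview transcript.

Sina Auto: Thank you for joining us in the Sina Auto live studio, Mr. An. Please say a quick hello to everyone.

An Xiangjing: Hello everyone! I am An Xiangjing of 行深智能. 行深智能 was founded in 2017, eight years ago now, and we focus on the L4 end-point unmanned logistics track.

Sina Auto: You mentioned the L4 end-point unmanned track. How does that relate to the "last mile" as we understand it, and what are the specific scenarios?

行深智能 CEO An Xiangjing (right)

An Xiangjing: Right. Logistics is roughly divided into trunk logistics, branch-line logistics, and end-point logistics. End-point logistics basically covers urban distribution as well as the last mile you just mentioned, and even all the scenarios in the last 50 meters. So the concept of end-point logistics is probably somewhat broader than the last mile. Generally, the last-mile concept is ...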
University of Macau: the first world-model-driven visual grounding framework!
自动驾驶之心· 2025-12-10 00:04
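Paper authors | Haicheng Liao et al. Editor | 自动驾驶之心

In interactive autonomous-driving scenarios, the most awkward moment is this one:

A passenger points at the complex intersection ahead and says, "Follow that SUV." The autonomous driving system looks at three similar-looking vehicles in front of it and wonders: "Which one? The one on the left? Or the one changing lanes?"

Most existing visual grounding models for autonomous driving behave like a novice that can only "describe the picture". They stare at the current frame and try to find the answer in the pixels. When the instruction is ambiguous or the target is occluded, they easily mistake one object for another and can even set off a chain of faulty reasoning.

Why do human drivers not get this wrong? Because we "anticipate".

When we hear the instruction, our brain instantly plays out what happens next: the car on the left is about to turn, which does not fit "follow"; only the car in the middle, accelerating straight ahead, is the most likely target.

"Think about the future before acting."

Inspired by this, a research team from the University of Macau has proposed a new framework, ThinkDeeper. It is the first study to introduce a world model into visual grounding for autonomous driving. This work not only ...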