World Models
A Chat About AI Hardware and Software
傅里叶的猫 (Fourier's Cat) · 2026-01-09 15:58
Group 1: AI Hardware Market
- The recent performance of AI hardware has not been strong, but the US stock market's hardware sector showed some resilience [1]
- The memory shortage is exaggerated; a report from Macquarie suggests that the new DRAM capacity in the next two years can only support about 15GW of AI data center construction, which may delay global AI expansion plans [3]
- A different perspective from a memory industry expert indicates that the capacity could support 20GW and 33GW this year and next year, respectively [5]
- The global data center installation capacity is projected to reach 17.4GW by 2025, with an expected increase to 30.2GW this year [5]
- Due to memory constraints, the growth of AI data centers (AIDC) will not be as rapid as anticipated, contributing to the recent decline in hardware market sentiment [7]

Group 2: AI Software and Applications
- The AI software and application market is exceeding many expectations, with a positive outlook for AI applications this year [8]
- The government is intensifying policy support for AI, with initiatives in sectors such as healthcare, education, and manufacturing, aiming for quantifiable goals by 2026 [9]
- Major tech companies are competing for AI traffic entry points and ecosystem development, with strategies covering both consumer (C-end) and business (B-end) markets [10][11]
- For the C-end, companies are enhancing user engagement and monetization capabilities, while for the B-end, they are driving cloud revenue through developer ecosystems [12]
- The competition has extended to physical scenarios, with companies like Waymo and Tesla accelerating their robotaxi efforts [13]
- Key technological advancements in AI models are expected to focus on world models, native multimodality, and self-evolving agents, with significant breakthroughs anticipated by 2026 [14][15]
- The core competitiveness of AI application companies lies in their ability to integrate technology quickly and effectively into specific scenarios, achieving commercial viability [15]
CES Watch | They Run and Jump, Do Real Work, and Close Deals: Chinese Humanoid Robots Take Center Stage
Bei Ke Cai Jing· 2026-01-09 09:35
Core Insights
- CES 2026 showcased a significant presence of Chinese companies in the robotics sector, with 149 out of 598 exhibitors being Chinese, accounting for nearly one-quarter of the total [1]
- In the humanoid robot segment, 21 out of 38 exhibitors were from China, representing over half of the participants [1]
- The event highlighted advancements in various applications of humanoid robots, including industrial, commercial, and home companionship [2][8]

Group 1: Company Highlights
- Companies like Yushu Technology and Zhiyuan Robotics demonstrated their capabilities in motion control and application scenarios, showcasing robots capable of dance and combat [4]
- The Beijing Humanoid Robot Innovation Center made its debut, emphasizing the importance of showcasing the practical capabilities of humanoid robots to enhance international influence [8]
- Lingqiao Intelligent presented its high-performance dexterous hand, with costs significantly reduced to make it more accessible to research institutions and startups [10]

Group 2: Technological Advancements
- The event featured robots that integrated advanced AI capabilities, such as language interaction and autonomous sorting, demonstrating their practical applications in real-world scenarios [6][7]
- Breakthroughs in tactile technology were showcased, with companies like Pasini Sensory Technology presenting advanced multi-dimensional tactile sensors [11]
- The development of world models as data engines in embodied intelligence was highlighted, indicating a shift towards more sophisticated evaluation and reinforcement learning environments [11]

Group 3: Market Expansion and Sales
- CES 2026 served as a critical platform for Chinese robotics companies to expand their global market presence, with several companies reporting immediate sales during the event [8]
- Companies like Songyan Power are shifting their focus from product display to commercial implementation, targeting key regions for market expansion [8]
- The event underscored the importance of a well-defined and efficient industrial ecosystem, indicating a maturation of the robotics industry [9][12]
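The exhibitor shares quoted in the Core Insights can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the summary (149 of 598 robotics exhibitors, 21 of 38 humanoid exhibitors); the helper name `share` is an illustrative choice, not something from the article.

```python
def share(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total

# Counts as reported in the summary above.
robotics_share = share(149, 598)   # Chinese exhibitors in the robotics sector
humanoid_share = share(21, 38)     # Chinese exhibitors in the humanoid segment

print(f"Robotics: {robotics_share:.1f}%")   # Robotics: 24.9%
print(f"Humanoid: {humanoid_share:.1f}%")   # Humanoid: 55.3%
```

Both figures agree with the wording in the summary: just under one quarter of robotics exhibitors, and a little over half of humanoid exhibitors.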
Musk Disses NVIDIA's Autonomous Driving: Wait Another Five or Six Years
Sou Hu Cai Jing· 2026-01-09 08:00
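One is the "Iron Man" who is "capable of anything and steers global technology trends"; the other is the leather-jacketed leader who holds the core computing power of artificial intelligence. Both have close ties to the White House. Silicon Valley's two successive tech idols, Elon Musk and Jensen Huang, finally have real grounds for a head-to-head contest.

At CES 2026, NVIDIA released its Alpamayo autonomous driving platform and presented it to the world as an "AI reasoning" moment. In the view of C次元, however, the move is really about recruiting automakers as customers; in essence, NVIDIA is setting itself up as yet another "Android of the auto industry."

People had assumed that Tesla builds complete vehicles and NVIDIA sells chips, so the two would never end up competing.

But ever since Musk set his mind on "selling the FSD (Full Self-Driving) system to other automakers," a clash between him and Huang was bound to come "sooner or later." Besides, on the question of who is the true godfather of tech, Musk has already lost face at the White House and surely does not want to lose another round.

"Well, that is exactly what Tesla is doing," Musk wrote on X about NVIDIA's Alpamayo. The comment came across as somewhat dismissive and was by no means a compliment: "They will find that getting to 99% is easy, but solving the long tail of the distribution is super hard."

NVIDIA "Teaches Others to Fish": VLA and Chain of Thought

To understand Musk's disdain, one must first understand what NVIDIA has actually put forward. Alpamayo is not a complete autonomous driving system that can be installed in a car and driven as is; it is a development paradigm and a set of infrastructure.

Its core innovation lies in bringing, for the first time, vision-language ...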
Frontline | Geely Unveils Its Full-Domain AI 2.0 Architecture and World Action Model: "One Iteration Every 1-2 Weeks"
36Kr · 2026-01-09 07:34
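Using very large models for intelligent assisted driving has become a consensus among automakers. After Li Auto and XPeng staked out VLA, Geely has now launched its own world model.

On January 5, on the eve of CES, Geely announced that its full-domain AI technology system had been upgraded to version 2.0. The signature result is WAM, the World Action Model.

Companies represented by Huawei, by contrast, claim that the world model is the ultimate route. They argue that an assisted driving system going from perception of the physical environment to action output does not need to pass through a language model, a step that introduces latency and information loss.

Li Chuanhai told 36Kr that, compared with models such as VLA and WM, Geely's WAM is an enhanced "world model." "Most importantly, Geely's WAM also trains Volvo's safety big data into the model. It fuses parameters from every vehicle domain along with perception-level elements from the internet ecosystem, building a unified 'general-purpose vehicle brain' that integrates all vehicle domains."

Gan Jiayue, CEO of Geely Auto Group, also explained that Geely's WAM differs from the WMs (world models) on the market: "Most WMs in the industry today simulate based on scenarios we have actually seen. But there are many extreme scenarios we have never seen and can only imagine, so Geely uses a very large volume of data to simulate more extreme and very rare scenarios."

"WAM iterates quickly; it can even complete an iteration within one to two weeks. You can see that Geely's assisted ...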
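Former Huawei "Genius Youth" Speaks Out for the First Time: Chinese-Made Intelligence May Reach Mass Production, and Multi-Robot Collaboration Is the Key to the Future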
Sou Hu Cai Jing· 2026-01-09 06:41
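Hello, everyone. This commentary from Xiaoyuan focuses on an exclusive interview with Li Yuanqing, a former Huawei "Genius Youth." The main aim is to discuss the possibility of "the first scaled embodied-intelligence product made in China," and why the "multi-robot heterogeneous" approach he champions is the way forward.

As a top talent in embodied intelligence, Li Yuanqing left Huawei to join Lexiang Technology (乐享科技), and his remarks since then are an important reference for the industry. Below we go through his core views systematically, followed by some of Xiaoyuan's own takeaways.

The Complementary Value of World Models and Data Factories

Technical Breakthroughs on Multiple Fronts

In 2025 the embodied-intelligence sector boomed: tech giants increased their bets, and startup funding rounds became routine. In Li Yuanqing's view, this boom did not come out of nowhere; it is reasoned backward from the certainty of where the field is heading. Capital markets follow long-term logic and understand that good hardware products and a mature industry take time.

The primary and secondary markets are also clearly linked. Listed companies moving into robotics can empower traditional manufacturing and improve return on investment, while also building a second growth curve and reinvigorating their teams. Add the support of domestic policy, and the sector naturally stays hot. More important still is the marked improvement in technical maturity.

Humanoid robots in 2024 fell over at a light push and gave up in any slightly complex environment; they were purely lab demos. 2025 was different: robots from Unitree (宇树) and other companies have become increasingly rugged. They stay upright when kicked or knocked, recover when a foot slips, and can even dance and perform martial arts.

It is worth mentioning that large model ...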
When We Unpack How 3DGS Is Applied in Industry......
自动驾驶之心 (Heart of Autonomous Driving) · 2026-01-09 06:32
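I have recently been reviewing how different companies use 3DGS (3D Gaussian Splatting), and today I will go through Li Auto's work in this area. Li Auto defines a world model as reconstruction plus generation: it uses reconstruction (3DGS) to model autonomous driving scenes (dynamic elements, static elements, and vehicle assets), then uses generative methods for closed-loop simulation or scene generation.

The core technologies here are therefore 3DGS and generation.

On the reconstruction side, the main works are:
- StreetGaussian, accepted at ECCV 2024, which set off the wave of autonomous driving scene reconstruction;
- 3DRealCar, a large-scale dataset for vehicle asset reconstruction;
- Balanced3DGS, a 3DGS training acceleration algorithm with a nearly 8x speedup;
- Hierarchy UGP, accepted at ICCV 2025, for autonomous driving scene reconstruction.

StyledStreets, a multi-style autonomous driving scene generation algorithm with spatio-temporal consistency.

Why does scene reconstruction, or closed-loop simulation, matter so much? Vehicle-side testing has mostly relied on real-car tests, where many corner cases cannot be reproduced, and traditional simulation environments have a large domain gap. The high-fidelity reconstruction and editability of 3DGS make these problems solvable.

Following the development path of 3DGS, we see a clear trajectory: static reconstruction → dynamic ...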
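Making World Model Inference 70x More Efficient: Shanghai AI Lab Uses "Constant Compute" to Break the Long-Term Memory and Interaction Bottleneck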
量子位 (QbitAI) · 2026-01-09 04:09
Core Insights
- The article discusses the transition of generative AI from static images to dynamic videos, emphasizing the importance of building a "world model" that understands physical laws, possesses long-term memory, and supports real-time interaction as a pathway to achieving Artificial General Intelligence (AGI) [3].

Group 1: Yume Project Overview
- The Yume project, developed by Shanghai AI Lab in collaboration with several top institutions, has released Yume1.0 and Yume1.5, which are the first fully open-source world models aimed at real-world applications [3][4].
- Yume1.5 introduces a core architectural innovation called Time-Space Channel Modeling (TSCM), which addresses the memory bottleneck in long video generation [4][11].

Group 2: Technical Innovations
- TSCM employs a unified context compression and linear attention mechanism to solve the memory challenges associated with long video generation [5].
- The framework integrates long-term memory, real-time reasoning, and "text + keyboard" interaction control into a single system, demonstrating a feasible path for engineering world models [2].

Group 3: Data Utilization
- Yume utilizes the Sekai dataset, which includes high-quality first-person (POV) video data covering 750 cities and totaling 5000 hours [8].
- Yume1.5 also incorporates a high-quality T2V synthesis dataset and a specialized event dataset for generating events like "sudden ghost appearances" [10].

Group 4: TSCM Mechanism
- TSCM's compression mechanism includes two parallel streams: time-space compression and channel compression, effectively reducing the number of tokens processed [16].
- Time-space compression retains visual details by downsampling historical frames, while channel compression reduces the channel dimension to enhance processing efficiency [19][23].
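The two compression streams described under Group 4 can be illustrated with a small NumPy sketch. This is not Yume's implementation: the tensor shape, the strides, the use of plain average pooling for time-space compression, the random projection standing in for a learned channel reduction, and the function names `spacetime_compress` and `channel_compress` are all assumptions made for illustration.

```python
import numpy as np

def spacetime_compress(frames: np.ndarray, t_stride: int, s_stride: int) -> np.ndarray:
    """Average-pool history frames over time and space (assumed stand-in).

    frames has shape (T, H, W, C); channels are left untouched.
    """
    T, H, W, C = frames.shape
    T2, H2, W2 = T // t_stride, H // s_stride, W // s_stride
    # Crop so every axis divides evenly, then pool over each stride block.
    x = frames[:T2 * t_stride, :H2 * s_stride, :W2 * s_stride]
    x = x.reshape(T2, t_stride, H2, s_stride, W2, s_stride, C)
    return x.mean(axis=(1, 3, 5))

def channel_compress(frames: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Reduce the channel dimension C -> C' with a (C, C') projection matrix."""
    return frames @ proj

rng = np.random.default_rng(0)
history = rng.standard_normal((16, 32, 32, 64))      # hypothetical history latents
proj = rng.standard_normal((64, 16)) / np.sqrt(64)   # random stand-in for learned weights

# The two streams run in parallel on the same history.
pooled = spacetime_compress(history, t_stride=2, s_stride=4)   # fewer tokens
narrow = channel_compress(history, proj)                       # thinner tokens

tokens_before = history.shape[0] * history.shape[1] * history.shape[2]
tokens_after = pooled.shape[0] * pooled.shape[1] * pooled.shape[2]
print(pooled.shape, narrow.shape, tokens_before // tokens_after)
```

With these assumed strides the time-space stream cuts the token count by a factor of 32 while keeping all 64 channels, and the channel stream keeps every token but shrinks each to 16 channels. How the real model balances and merges the two streams is not specified in the summary.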
Group 5: Performance Evaluation
- Yume1.5 achieved an instruction-following (IF) score of 0.836, demonstrating the effectiveness of its control methods, and reduced generation time from 572 seconds in Yume1.0 to just 8 seconds [29].
- An ablation study showed that removing TSCM and using simple spatial compression led to a decrease in instruction-following ability from 0.836 to 0.767, highlighting TSCM's significance [30][32].

Group 6: Future Prospects
- The open-sourcing of Yume and its datasets is expected to accelerate research in world models, with the potential for the distinction between "real" and "generated" content to become increasingly blurred in the near future [38].
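As a quick sanity check on the Group 5 figures, the sketch below derives the speedup and the size of the ablation drop from the reported numbers only. The variable names are illustrative.

```python
# Generation times reported in the summary, in seconds.
yume10_seconds = 572
yume15_seconds = 8
speedup = yume10_seconds / yume15_seconds

# Instruction-following (IF) scores reported for the ablation study.
if_full = 0.836      # with TSCM
if_ablated = 0.767   # TSCM replaced by simple spatial compression
abs_drop = if_full - if_ablated
rel_drop = abs_drop / if_full

print(f"Speedup: {speedup:.1f}x")                                      # Speedup: 71.5x
print(f"IF drop: {abs_drop:.3f} absolute, {rel_drop:.1%} relative")    # IF drop: 0.069 absolute, 8.3% relative
```

The 71.5x ratio matches the roughly 70x figure in the headline, and removing TSCM costs about 8% of the instruction-following score in relative terms.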
Zhiyuan Research Institute Releases Its Top Ten AI Technology Trends for 2026: The "Technology Bubble" Is a False Proposition
Xin Jing Bao· 2026-01-09 03:52
Core Insights
- The Beijing Zhiyuan Artificial Intelligence Research Institute has released its predictions for the top ten AI technology trends for 2026, focusing on foundational models, AI applications, and key industries [1]

Group 1: Foundational Models
- The institute believes that world models will become a consensus direction for AGI, as high-quality text data is nearly exhausted. AI must learn not only language but also the rules governing the physical world, necessitating the processing of multimodal information such as images, sounds, time, and space [3]
- In the realm of embodied intelligence, the number of companies has exceeded 230, but many exhibit homogeneity in their business models, potentially leading to industry "clearing." The introduction of world models may serve as a crucial technological anchor for the next stage of embodied intelligence [3]

Group 2: Consumer Applications
- The competition in consumer AI applications is becoming clearer, with a focus on "super applications" characterized by "All in One" functionality, moving beyond single-tool attributes to create a closed loop from information acquisition to task planning and problem-solving [3]
- Despite the presence of major players in the general market, there are still opportunities for breakthroughs in high-barrier vertical fields such as health and education, where vertical applications demonstrate differentiated competitiveness [3]

Group 3: Reasoning Capabilities
- The institute asserts that the notion of a "technology bubble" is a false proposition, as reasoning optimization has not yet reached its ceiling. Progress in this area will remain a key factor supporting the large-scale application of AI in 2026 [4]
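Zhiyuan's "Top Ten AI Technology Trends for 2026": The "Technology Bubble" Is a False Proposition, and Embodied Intelligence Faces an Industry "Clearing"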
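On January 8, 2026, the Beijing Zhiyuan Artificial Intelligence Research Institute (hereafter "Zhiyuan Research Institute") released its annual report, "Top Ten AI Technology Trends for 2026" (hereafter "Trends"). At the event, Wang Zhongyuan, president of the institute, said the competitive focus for AI foundation models has shifted from "how many parameters" to "whether the model understands how the world works," and that AI is moving from "predicting the next word" to "predicting the next state of the world."

According to the institute's analysis, this shift is driven by three clear threads. The first is a "dimensional upgrade" of the cognitive paradigm: AI is beginning to learn physical laws, which gives complex tasks such as autonomous driving simulation and robot training a new "cognitive" foundation and has become strategic high ground that leading model vendors in China and abroad are racing to occupy.

The second is the "embodiment" and "socialization" of intelligence. Intelligence is moving from software to physical form and from single agents to collaboration. Humanoid robots from leading tech companies are entering real production settings, marking the point at which "embodied intelligence" leaves the lab. Meanwhile, the standardization of mainstream agent communication protocols lets multi-agent systems (MAS) tackle complex task flows in research and industry as "teams."

The third is "dual-track application" for realizing value. On the consumer side, an "All in One" super-app entry point is taking shape, with tech giants in China and abroad building integrated AI portals on their own ecosystems. On the enterprise side, after the "trough of disillusionment" that followed early proofs of concept, AI is drawing on better data governance and industry-standard interfaces to produce, in vertical domains, truly measurable ...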
Zhiyuan's Top Ten Trend Predictions for 2026: AI "Opens Its Eyes" in the Physical World
Sou Hu Cai Jing· 2026-01-08 16:08
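Reported by AIPress.com.cn

When large models are no longer content to predict the next Chinese character and instead try to predict the next state of the world, artificial intelligence truly begins to understand causality and touch reality. That is the future, and it is also the change AI is about to undergo in 2026.

Drawing on the ten AI trend predictions from the Zhiyuan Research Institute, this article maps out the changes AI will see in 2026, sketching a future that moves from the virtual to the physical and from single agents to collective intelligence.

Figure: Zhiyuan Research Institute's top ten AI technology trends for 2026

Trend 1: World Models Establish a New Cognitive Paradigm

The industry's understanding of intelligence is undergoing a quiet but profound shift: the consensus is moving from language-only models to multimodal world models that understand physical laws.

The establishment of the Next-State Prediction (NSP) paradigm marks the point at which AI is no longer content to predict the next word in text and begins trying to predict the next state of the world.

As Zhiyuan's 悟界 (Wujie) has shown, once a machine grasps spatio-temporal continuity and causality, it crosses the boundary of perception and reaches genuine cognition and planning.

Trend 2: Industry "Clearing" and Deployment in Embodied Intelligence

Trend 5: Vertical Breakouts Under the New "BAT" Landscape

The "All in One" entry point for consumer super-apps has become hotly contested ground. Overseas, OpenAI and Google lead the way; in China, giants such as ByteDance, Alibaba, and Ant Group are also building out positions on their ecosystems.

We can see that Ant's ...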