Multi-Agent Systems
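Zhiyuan Research Institute (BAAI) Releases Its Top Ten AI Technology Trends for 2026: AI Will Move from the Digital World into the Physical World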
Sou Hu Cai Jing· 2026-01-09 05:48
Core Insights
- The report by the Beijing Academy of Artificial Intelligence (BAAI, Zhiyuan Research Institute) describes a significant shift in AI development, from parameter scaling in language learning to a deeper understanding and modeling of the physical world, which it presents as a paradigm shift in industry technology [1][3]

Group 1: Key Trends in AI Development
- The transition from "predicting the next word" to "predicting the next state of the world" signals the emergence of the Next-State Prediction (NSP) paradigm, which is expected to drive AI from digital perception to physical cognition and planning [4][5]
- The report identifies 2026 as a critical turning point for AI, marking the transition from digital to physical applications and from technical demonstrations to scalable value [3][4]

Group 2: Cognitive and Physical Integration
- AI is moving toward a higher cognitive paradigm centered on world models and NSP, which will provide a new cognitive foundation for complex tasks such as autonomous driving and robotics [4][5]
- "Embodied intelligence" is evolving from laboratory demonstrations to real-world industrial applications, with humanoid robots expected to enter actual production scenarios by 2026 [5][6]

Group 3: Multi-Agent Systems and Collaboration
- Standardized communication protocols for multi-agent systems (MAS) are crucial for solving complex problems, enabling agents to collaborate effectively in fields such as research and industry [6][7]
- The role of AI in research is shifting from a supportive tool to an autonomous "AI scientist," which will accelerate the development of new materials and pharmaceuticals [7][8]

Group 4: Market Dynamics and Applications
- Competition for consumer AI applications is intensifying, with major tech companies developing integrated AI portals, exemplified by Ant Group's multimodal AI assistant and health applications [8][9]
- The enterprise AI sector is entering a "trough of disillusionment" because of data and cost challenges, but a recovery is anticipated in the second half of 2026 as data governance and toolchains mature [9][10]

Group 5: Data and Performance Optimization
- Reliance on synthetic data is increasing as high-quality real data becomes scarce, particularly in fields like autonomous driving and robotics, where synthetic data generated by world models will be key [10][11]
- Inference efficiency remains a critical focus, with ongoing innovations in algorithms and hardware expected to lower costs and improve performance, making it easier to deploy high-performance models in resource-constrained environments [11][12]

Group 6: Open Source and Security
- A compatible software stack for heterogeneous chips is essential to break the monopoly on computing power and mitigate supply risks, with platforms like Zhiyuan FlagOS leading this initiative [12][13]
- AI security risks are evolving from "hallucinations" to more subtle "systemic deceptions," prompting the need for comprehensive safety frameworks and research initiatives to address these emerging threats [13][14]
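From "Predicting the Next Word" to "Predicting the State of the World": Zhiyuan (BAAI) Releases Its Top Ten AI Technology Trends for 2026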
Sou Hu Cai Jing· 2026-01-09 00:02
Core Insights
- The report's core view is that AI is moving from merely predicting language to understanding and modeling the physical world, a significant paradigm shift in technology [1][4][5].

Group 1: Key Trends in AI Technology
- Trend 1: Industry consensus is shifting from language models to multimodal world models that understand physical laws, with Next-State Prediction (NSP) emerging as a new paradigm [7].
- Trend 2: Embodied intelligence is moving from laboratory demonstrations to real-world industrial applications, with humanoid robots expected to move into actual service scenarios by 2026 [8].
- Trend 3: Multi-agent systems are becoming crucial for solving complex problems, with the standardization of communication protocols like MCP and A2A facilitating collaboration among agents [9].

Group 2: Applications and Market Dynamics
- Trend 4: AI is evolving from a supportive tool to an autonomous researcher, with the integration of scientific foundation models and automated laboratories accelerating research in new materials and pharmaceuticals [10].
- Trend 5: Competition for consumer AI super applications is intensifying, with major players like OpenAI and Google leading the way in creating integrated intelligent assistants [11].
- Trend 6: After a phase of concept validation, enterprise AI applications are entering a "trough of disillusionment," but a recovery is expected in the second half of 2026 as data governance improves [12].

Group 3: Data and Performance Enhancements
- Trend 7: Reliance on synthetic data is increasing; it is crucial for model training, especially in fields like autonomous driving and robotics [13].
- Trend 8: Inference optimization remains a key focus, with ongoing innovations in algorithms and hardware reducing costs and improving efficiency [15].
- Trend 9: A heterogeneous software stack is essential to break the monopoly on computing power and mitigate supply risks [16].

Group 4: Security and Ethical Considerations
- Trend 10: AI security risks are evolving from "hallucinations" to more subtle "systemic deceptions," necessitating a comprehensive approach to safety and alignment in AI systems [17].

Conclusion
- The report outlines ten key AI technology trends that provide a clear anchor for future technological exploration and industry planning, emphasizing collaboration across academia and industry to drive AI toward a new phase of value realization [18].
Zhiyuan (BAAI) Top Ten Trend Predictions for 2026: AI "Opens Its Eyes" in the Physical World
Sou Hu Cai Jing· 2026-01-08 16:08
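Reported by AIPress.com.cn. Only when large models are no longer content to predict the next character, and instead try to predict the next state of the world, does artificial intelligence truly begin to understand causality and touch reality. This is the future, and it is also the change that AI is about to undergo in 2026.

Drawing on the ten AI trend predictions put forward by the Zhiyuan Research Institute (BAAI), this article outlines the changes in store for AI in 2026 and sketches a future that moves from the virtual to the physical and from single agents to collective intelligence.

Figure: Zhiyuan Research Institute's top ten AI technology trends for 2026

Trend 1: World models establish a new cognitive paradigm
The industry's understanding of intelligence is undergoing a quiet but deep shift: consensus is moving from language models alone to multimodal world models that can understand physical laws. The establishment of the Next-State Prediction (NSP) paradigm means AI is no longer content to predict the next word in a text; it is beginning to try to predict the next state of the world. As Zhiyuan's Wujie (悟界) models have shown, once a machine grasps spatiotemporal continuity and causality, it crosses the boundary of perception and reaches genuine cognition and planning.

Trend 2: The industry "shakeout" and real-world deployment of embodied intelligence

Trend 5: Vertical breakouts under the new "BAT" landscape
The "All in One" entry point for consumer super apps has become fiercely contested ground. Overseas, OpenAI and Google lead the way; in China, giants such as ByteDance, Alibaba, and Ant Group are also building out actively on the strength of their ecosystems. As we can see, what Ant Group has launched ...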
Ping An Fund Unveils Views from Its 2026 Strategy Meeting, Focusing on the Twin Themes of Technology and Cyclicals
Zhong Zheng Wang· 2026-01-08 13:28
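China Securities Journal, cs.com.cn (reporter Zhang Yun): On January 7, Ping An Fund held its 2026 investment strategy meeting and named technological innovation and the rebalancing of supply and demand in cyclical goods as its two main investment themes. In technology, it is focusing on hardware innovation opportunities driven by rapid growth in global AI capital expenditure, and on opportunities in the domestic semiconductor supply chain driven by self-sufficiency combined with domestic AI industry demand. In cyclicals, it is watching sectors with well-constrained supply and moderately recovering demand, such as chemicals and industrial metals.

Shen Aiqian, Ping An Fund's director of equity investment, said at the meeting that for 2026 he expects continued policy support, a moderate economic recovery, ample liquidity, and gradually improving internal and external conditions, factors that should help the market rally continue steadily. Compared with 2025, gains in 2026 are likely to come more from earnings and industry catalysts. He favors earnings-driven opportunities along two lines: broad technology and broad cyclicals. He also said that Ping An Fund has comprehensively upgraded its investment research system in recent years, building research competitiveness through a platform-based approach of "one philosophy (doing what is right for the long term), two drivers (talent and platform reinforcing each other), three 'multis' (multiple teams, multiple styles, multiple strategies), and four 'reals' (real mechanisms, real teams, real talent, speaking the truth)."

Zhai Sen, manager of the Ping An Technology Innovation Hybrid Fund, said AI infrastructure build-out is far from the stage where a bubble needs to be discussed. Historically, capital expenditure at the peak of a productivity revolution has generally reached 3% to 4% of GDP, while the expected scale of AI capital expenditure in 2026 ...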
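Zhiyuan (BAAI) Releases Its Top Ten AI Technology Trends for 2026: A Triple Transformation in Cognition, Form, and Infrastructure Drives AI into a Period of Value Realization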
Zhong Guo Jing Ji Wang· 2026-01-08 10:00
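China Economic Net, Beijing, January 8 (reporter Peng Jinmei): On January 8, the Beijing Academy of Artificial Intelligence (BAAI, hereafter "Zhiyuan Research Institute") released its annual report, "Top Ten AI Technology Trends for 2026." The report says the core of AI's evolution is undergoing a key shift, from language learning that pursues parameter scale to a deep understanding and modeling of the underlying order of the physical world, and that the industry's technical paradigm is being reshaped.

Figure: Zhiyuan Research Institute's top ten AI technology trends for 2026

Trend 1: World models become the consensus direction for AGI, and Next-State Prediction may become the new paradigm
Industry consensus is shifting from language models to multimodal world models that can understand physical laws. In moving from "predicting the next word" to "predicting the next state of the world," the NSP paradigm marks the point at which AI begins to grasp spatiotemporal continuity and causality.

Trend 2: Embodied intelligence faces an industry "shakeout," and applications move into a broad range of industrial scenarios
Embodied intelligence is leaving laboratory demonstrations behind and entering a stage of industry selection and deployment. As large models are combined with motion control and synthetic data, humanoid robots will move beyond demos in 2026 and into real industrial and service scenarios. Companies with closed-loop evolution capabilities will win this round of commercial competition.

Trend 3: Multi-agent systems determine the ceiling for applications, and the "TCP/IP" of the agent era begins to take shape
Solving complex problems depends on multi-agent collaboration. As communication protocols such as MCP and A2A become standardized, agents gain a common "language." Multi-agent systems will break through the ceiling of single-agent intelligence, in ...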
Zhiyuan Research Institute (BAAI) Releases Its Top Ten AI Technology Trends for 2026
Jing Ji Guan Cha Wang· 2026-01-08 09:08
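Economic Observer Net, January 8, 2026: The Beijing Academy of Artificial Intelligence (BAAI, Zhiyuan Research Institute) released its annual report, "Top Ten AI Technology Trends for 2026." The report says the core of AI's evolution is undergoing a key shift, from language learning that pursues parameter scale to a deep understanding and modeling of the underlying order of the physical world, and that the industry's technical paradigm is being reshaped.

Trend 1: World models become the consensus direction for AGI, and Next-State Prediction may become the new paradigm
Industry consensus is shifting from language models to multimodal world models that can understand physical laws. In moving from "predicting the next word" to "predicting the next state of the world," the NSP paradigm marks ...

Trend 3: Multi-agent systems determine the ceiling for applications, and the "TCP/IP" of the agent era begins to take shape
Solving complex problems depends on multi-agent collaboration. As communication protocols such as MCP and A2A become standardized, agents gain a common "language." Multi-agent systems will break through the ceiling of single-agent intelligence and become key infrastructure in complex workflows such as scientific research and industry.

Trend 4: The AI Scientist becomes the North Star of AI4S, and domestic scientific foundation models are quietly taking shape
AI's role in research is being upgraded from an auxiliary tool to an "AI scientist" that conducts research autonomously. Combining scientific foundation models with automated laboratories will greatly accelerate the development of new materials and drugs. The report stresses that China needs to pool its resources and speed up the construction of its own system of scientific foundation models.

Trend 5: The new "BAT" of the AI era is becoming clear, and vertical tracks still offer highly profitable plays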
Top Silicon Valley VC a16z's Big Ideas for 2026: A Comprehensive Reshaping from AI to the Real World | RockFlow Analysis
RockFlow Universe· 2025-12-18 10:39
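Key points

① AI is evolving from a "digital assistant" into an "autonomous execution cluster." 2026 will see AI leap from a "conversational tool" to a "multi-agent system." a16z predicts that the screen era is about to end, and that agent-native infrastructure will redefine cloud speed and unlock a historic leap in enterprise operating leverage.

② Technology is spilling beyond the screen, and "bits" are beginning to take over "atoms" across the board. An "electro-industrial stack" formed by the fusion of electrification, materials science, and AI will become the underlying logic on which the physical world runs. Through software-defined manufacturing and AI automation, the United States could enter a golden age of factory revival.

③ SaaS is undergoing a paradigm shift from "passive record-keeping" to "active reasoning," and personalized services will leap from "optimized for everyone" to "customized for each person." Cryptocurrency will become the internet's basic settlement layer, with stablecoins and RWA restructuring the foundations of finance, while preventive healthcare will open up a new blue ocean of long-term monetization.

In the US stock market, the ability to foresee trends often determines the quality of alpha. As Silicon Valley venture capital's ...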
Is Agent Fine-Tuning Back? NVIDIA Open-Sources a New 8B Model That Carries GPT-5: It Scores 37 on HLE and Brings Costs Down Too
量子位· 2025-12-07 04:35
Core Insights
- The article introduces a new paradigm in AI model orchestration: a smaller 8B model acts as a conductor that coordinates various tools and larger models, achieving better performance at lower cost [1][13].

Group 1: Model Performance
- Orchestrator-8B scored 37.1% on Humanity's Last Exam, ahead of GPT-5 at 35.1%, while also cutting computational cost by a factor of 2.5 [1][9].
- On the FRAMES benchmark, Orchestrator-8B scored 76.3 against GPT-5's 74.0, and on τ²-Bench it scored 80.2 against GPT-5's 77.7 [9][10].
- The average cost for Orchestrator-8B was only 9.2 cents, with a latency of 8.2 minutes, both significantly lower than GPT-5 [9][10].

Group 2: ToolOrchestra Framework
- ToolOrchestra exposes the various tools through a unified JSON interface, so the 8B conductor can think, call a tool, and read the feedback over multiple rounds until it converges [4].
- The framework uses GRPO reinforcement learning to maximize three rewards: correctness, efficiency, and user preference [4][5].

Group 3: User Preferences and Biases
- The article highlights two biases in large models: a self-enhancing bias, where models prefer to call similar models, and blind reliance on the strongest models, which drives up cost [4][5].
- User preferences are taken into account, allowing the conductor to balance local against cloud search, and speed against cost [5][15].

Group 4: Application Scenarios
- Orchestrator-8B can be applied in scenarios such as internal Q&A and report analysis, where it defaults to local indexing and code execution for 80% of tasks [16].
- In research and development, it can set time and cost limits while respecting source preferences [16].
- The framework allows end-to-end orchestration of functions and tools, moving away from rigid programming structures [16].

Group 5: Future Directions
- The authors have made all code, models, and datasets publicly available for academic and industrial follow-up [14].
- The approach emphasizes a shift from relying solely on the strongest models to a more efficient use of diverse tools and models, improving both cost-effectiveness and performance [15].
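The three-reward GRPO training described under the ToolOrchestra framework can be illustrated with a minimal sketch. It assumes a weighted sum of correctness, efficiency, and preference terms, and the standard group-relative normalization that GRPO is known for. The `Rollout` fields, the weights, and the sample numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Rollout:
    """One orchestration attempt on a task (all fields are illustrative)."""
    correct: bool       # did the final answer match the reference?
    cost_usd: float     # total spend on tool and model calls
    latency_min: float  # wall-clock time in minutes
    preference: float   # 0..1, how well tool choices matched user preferences


def reward(r, w_cost=0.5, w_latency=0.05, w_pref=0.3):
    """Combine correctness, efficiency, and preference into one scalar.

    The weights are hypothetical placeholders, not values from the paper.
    """
    efficiency_penalty = w_cost * r.cost_usd + w_latency * r.latency_min
    return float(r.correct) - efficiency_penalty + w_pref * r.preference


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(x - mu) / (sigma + eps) for x in rewards]


# Three sampled rollouts for the same task (made-up numbers).
group = [
    Rollout(correct=True, cost_usd=0.09, latency_min=8.0, preference=0.9),
    Rollout(correct=True, cost_usd=0.30, latency_min=20.0, preference=0.2),
    Rollout(correct=False, cost_usd=0.05, latency_min=5.0, preference=0.5),
]
advantages = group_relative_advantages([reward(r) for r in group])
```

Under these assumed weights, the cheap, correct, preference-aligned rollout receives the largest advantage, which is the behavior the summary attributes to the conductor: being right is not enough if a rollout is slow, expensive, or ignores the user's tool preferences.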
Google Gets a Head Start on L3 AI: Gemini Works Continuously for 40 Minutes, and Agents Automatically Generate and Review a Hundred Ideas
量子位· 2025-11-19 01:37
Core Insights
- Google is advancing toward L3 AI with its Gemini system, which can autonomously execute tasks for extended periods, a significant step in AI development [27][30][32].

Group 1: Gemini's Capabilities
- Gemini can work continuously for 40 minutes on a single task, showing that it can handle complex processes [2][19].
- The system generates over 100 creative ideas from user input, which multiple agents then evaluate and rank, providing structured feedback [3][15].
- Users only need to make the final decision; the agents handle exploration and iteration, which significantly reduces the time spent refining outputs [4][11].

Group 2: Multi-Agent System
- The multi-agent competition system combines long-horizon thinking with adversarial generation, using the extra time to improve output quality [10][12].
- The system runs a full generate, compete, and select process, and presents users with a well-rounded set of ideas [15][20].
- Gemini for Enterprise includes applications for creative generation and collaborative research, demonstrating its versatility in different contexts [18][26].

Group 3: Future of AI
- L3 AI is characterized by the ability to run tasks autonomously over extended periods, and Gemini's capabilities align closely with this definition [30][32].
- Speculation suggests that future agents may be able to operate for even longer, potentially up to 3 hours by next year [33].
- As collaborative research features evolve, Gemini may reach L4 AI status, further extending its capabilities [37].
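The generate, compete, and select process described above can be sketched as a round-robin tournament. The summary does not say how Gemini's agents actually compare ideas, so the pairwise format, the `rank_by_tournament` function, and the `toy_judge` stand-in are all assumptions made for illustration; in a real system the judge would be a call to a reviewer agent.

```python
from itertools import combinations


def rank_by_tournament(ideas, judge):
    """Rank unique ideas by pairwise wins in a round-robin tournament.

    `judge(a, b)` must return whichever of the two ideas it prefers.
    """
    wins = {idea: 0 for idea in ideas}
    for a, b in combinations(ideas, 2):
        wins[judge(a, b)] += 1
    # sorted() is stable, so ideas with equal wins keep their input order.
    return sorted(ideas, key=lambda idea: wins[idea], reverse=True)


def toy_judge(a, b):
    """Hypothetical placeholder judge: prefers the longer description."""
    return a if len(a) >= len(b) else b


ideas = ["ab", "abcd", "a"]
ranking = rank_by_tournament(ideas, toy_judge)
```

A round-robin needs n(n-1)/2 comparisons, so roughly 100 ideas would mean about 4,950 judge calls. That gives a rough sense of why a run of this kind can occupy tens of minutes, although the source does not break down where the 40 minutes go.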
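Replacing "Dialogue" with "Telepathy": Tsinghua University, with Infinigence AI (无问芯穹), CUHK, and Other Institutions, Proposes Cache-to-Cache, a New Paradigm for Model Communication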
机器之心· 2025-10-29 07:23
Core Insights
- The article discusses the rapid advancements in large language models (LLMs) and the introduction of a new communication paradigm called Cache to Cache (C2C), which enhances multi-agent systems by allowing direct communication through KV-Cache instead of traditional Text to Text (T2T) methods [2][5][10].

Limitations of Existing Text Communication
- T2T communication faces significant limitations, including information loss due to dimensionality reduction, semantic ambiguity inherent in natural language, and substantial delays caused by token-by-token output generation [7][8][6].

Advantages of KV-Cache
- KV-Cache inherently contains multi-dimensional semantic information from the dialogue process, improving accuracy and efficiency. Experiments show that optimized KV-Cache can significantly enhance model accuracy and facilitate effective communication between different models [11][12][29].

C2C Mechanism
- The C2C framework utilizes a fusion mechanism that integrates KV-Cache from different models, ensuring compatibility and effective information transfer. This involves a residual fusion structure to maintain the original semantics of the receiver model [16][17][19].

Performance and Efficiency
- C2C demonstrates substantial performance improvements over T2T, with accuracy increases of 3% to 5% and speed enhancements of up to two times. The framework allows for efficient parallel processing, avoiding the inefficiencies of one-dimensional text output [29][31][28].

Experimental Results
- The article presents various experimental results showing that C2C consistently outperforms T2T across multiple benchmarks, with significant accuracy gains and reduced inference times [28][31][29].

Future Prospects
- The C2C paradigm has broad applications, including enhancing collaboration in multi-agent systems, integrating multimodal models, and improving privacy-aware cloud-edge collaboration. It is positioned as a key enabling technology for the next generation of multi-agent systems [36][38][39].
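The residual fusion structure mentioned under "C2C Mechanism" can be illustrated with a minimal sketch. It assumes the simplest possible form: the sharer model's cache vectors are linearly projected into the receiver's dimension and added to the receiver's own vectors, scaled by a gate. The actual C2C fuser is a learned network whose details are not given in this summary, so `fuse_kv`, the fixed `projection` matrix, the scalar `gate`, and the one-to-one token alignment are all hypothetical stand-ins.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse_kv(receiver_kv, sharer_kv, projection, gate):
    """Residual fusion per token position: receiver + gate * projection(sharer).

    receiver_kv: list of vectors in the receiver's hidden dimension
    sharer_kv:   list of vectors in the sharer's hidden dimension
    projection:  matrix mapping sharer dimension to receiver dimension
    gate:        scalar in [0, 1]; 0 leaves the receiver cache untouched
    """
    fused = []
    for r_vec, s_vec in zip(receiver_kv, sharer_kv):
        projected = matvec(projection, s_vec)
        fused.append([r + gate * p for r, p in zip(r_vec, projected)])
    return fused


receiver = [[1.0, 2.0]]          # one token, receiver dimension 2
sharer = [[1.0, 0.0, 1.0]]       # one token, sharer dimension 3
projection = [[1.0, 0.0, 0.0],   # maps 3 dimensions down to 2
              [0.0, 1.0, 1.0]]
fused = fuse_kv(receiver, sharer, projection, gate=0.5)
```

The residual form is what preserves the receiver's original semantics: with the gate at zero the receiver's cache passes through unchanged, and the sharer's information only ever enters as an additive correction on top of it.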