Native Multimodal Large Model
2.4-Trillion-Parameter Native Multimodal Large Model: Official Release of Wenxin 5.0 Goes Live
Sou Hu Cai Jing· 2026-01-22 18:45
Core Insights
- Baidu has officially launched the native multimodal large model Wenxin 5.0, featuring 2.4 trillion parameters and supporting various forms of input and output, including text, images, audio, and video [1]
- Wenxin 5.0 has surpassed models like Gemini-2.5-Pro and GPT-5-High in over 40 authoritative benchmark evaluations, establishing itself in the top tier globally [1]

Technical Advancements
- Unlike most industry models that use "post-fusion" multimodal approaches, Wenxin 5.0 employs a unified autoregressive architecture for native multimodal modeling, allowing for joint training of multiple data types [2]
- The model utilizes a large-scale mixture-of-experts structure with an activation parameter rate below 3%, enhancing inference efficiency while maintaining strong capabilities [2]
- Significant breakthroughs in multimodal understanding, coding, and creative writing have been achieved, exemplified by the model's ability to generate executable front-end code and emulate classical literary styles [2]

Industry Impact
- The Wenxin Mentor program has expanded to include 835 experts from various sectors, contributing to the model's improvement in logical rigor, professional depth, and creative quality [3]
- The launch of Wenxin 5.0 signifies the maturation and practicality of native multimodal technology, enhancing China's competitive edge in the global AI industry [3]
- As of January 15, Wenxin 5.0 ranked first on the domestic text leaderboard and eighth globally on LMArena, outperforming several mainstream models [3]
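The "activation parameter rate below 3%" claim refers to top-k expert routing: each token only passes through a few of the many expert sub-networks, so only a small fraction of the total weights are touched per step. The toy sketch below illustrates the generic technique; it is not Wenxin's actual routing code, and every name, dimension, and expert count here is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_moe_layer(x, expert_weights, router, k=2):
    """Route a token vector x through only its top-k experts.

    x:              (d,) token hidden state
    expert_weights: (n_experts, d, d) one weight matrix per expert
    router:         (n_experts, d) toy gating matrix
    Only k of n_experts weight matrices are used per token, which is
    what keeps the activated-parameter fraction small.
    """
    logits = router @ x
    top = np.argsort(logits)[-k:]                 # indices of the top-k experts
    gates = np.exp(logits[top])
    gates = gates / gates.sum()                   # softmax over selected experts
    out = sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))
    activated_fraction = k / len(expert_weights)
    return out, activated_fraction

d, n_experts = 16, 64
x = rng.standard_normal(d)
experts = rng.standard_normal((n_experts, d, d))
router = rng.standard_normal((n_experts, d))
y, frac = topk_moe_layer(x, experts, router, k=2)
print(f"activated expert fraction: {frac:.1%}")   # 2/64 experts → 3.1%
```

In a real MoE model the router is learned and shared parameters (embeddings, attention) are always active, so the true activated-parameter ratio depends on more than the expert count; this sketch only shows why sparse routing cuts per-token compute.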
2.4-Trillion-Parameter Native Multimodal Large Model: Official Release of Baidu Wenxin 5.0 Goes Live
Zheng Quan Shi Bao Wang· 2026-01-22 14:06
Core Insights
- Baidu has officially launched the native multimodal large model Wenxin 5.0, featuring 2.4 trillion parameters and supporting various input and output formats including text, images, audio, and video [1][2]
- The model employs a unified autoregressive architecture for native multimodal modeling, allowing for joint training of multiple data types, which enhances the integration and optimization of multimodal features [1]
- Wenxin 5.0 has achieved significant breakthroughs in multimodal understanding, coding, and creative writing capabilities, showcasing its advanced intelligence and tool utilization [1][2]

Technical Features
- The model utilizes a large-scale mixture-of-experts structure with ultra-sparse activation parameters, maintaining strong performance while improving inference efficiency [1]
- It incorporates end-to-end multi-round reinforcement learning training based on long-term task trajectory data, significantly enhancing the model's agent and tool-calling capabilities [1]

Market Position
- Wenxin 5.0's launch signifies the maturation and practicality of native multimodal technology, reflecting the independent innovation capabilities of Chinese model manufacturers in the global AI industry [2]
- As of January 15, Wenxin 5.0 ranked first in China and eighth globally on the LMArena text leaderboard, surpassing several mainstream models including GPT-5.1-High and Gemini-2.5-Pro [3]
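"Multi-round reinforcement learning on long-term task trajectories" presupposes an agent loop that records each tool call and its result as one trajectory. The minimal sketch below shows the generic shape of such a loop; it is not Baidu's agent framework, and the tool registry, `model_step` contract, and scripted stand-in model are all invented for illustration.

```python
# Toy tool registry; real agent stacks map model-emitted tool names
# to callables in a similar way. All names here are hypothetical.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(model_step, task, max_rounds=5):
    """Multi-round loop: per round the model either calls a tool or answers.

    model_step(task, history) -> dict, e.g.
      {"tool": "calculator", "args": "6*7"}  or  {"answer": "..."}
    RL over whole trajectories (as the article describes) would assign
    reward to the full `history` that loops like this one produce.
    """
    history = []
    for _ in range(max_rounds):
        action = model_step(task, history)
        if "answer" in action:
            return action["answer"], history
        result = TOOLS[action["tool"]](action["args"])
        history.append({"call": action, "result": result})
    return None, history

# Scripted stand-in for the model: call the tool once, then answer.
def fake_model(task, history):
    if not history:
        return {"tool": "calculator", "args": "6*7"}
    return {"answer": history[-1]["result"]}

answer, trace = run_agent(fake_model, "What is 6*7?")
print(answer)  # 42
```

The point of the sketch is the trajectory: `trace` captures every intermediate call/result pair, which is exactly the long-horizon data an end-to-end RL objective would score.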
Baidu Releases Wenxin 5.0: Over 2.4 Trillion Parameters, Available for Users to Try Directly
Feng Huang Wang· 2025-11-13 04:37
Built on the PaddlePaddle deep learning framework, Wenxin 5.0 adopts an ultra-sparse mixture-of-experts architecture for massive omni-modal training, with a total parameter count exceeding 2.4 trillion and an activation parameter ratio below 3%, effectively improving inference efficiency while preserving the model's strong capabilities.

Wenxin 5.0 Preview is now live in the Wenxin App for users to try directly; developers and enterprise users can also call the Wenxin 5.0 API service through Baidu's Qianfan large model platform.

Wenxin 5.0's core capabilities have been comprehensively upgraded, with standout performance in multimodal understanding, instruction following, creative writing, factuality, and agent planning and tool use, demonstrating strong comprehension, logic, memory, and persuasiveness. In comprehensive evaluations across more than 40 authoritative benchmarks, its language and multimodal understanding is on par with models such as Gemini-2.5-Pro and GPT-5-High, and its image and video generation matches specialized vertical models, reaching a globally leading level and validating the capability and potential of native multimodal large models.

Baidu CTO Wang Haifeng introduced Wenxin 5.0 as a new-generation native multimodal large model. Unlike most multimodal models in the industry, which adopt post-hoc fusion, Wenxin 5.0's technical approach is native multimodal modeling with a unified autoregressive architecture, integrating understanding and generation.

Feng Huang Wang Tech, November 13: At the 2025 Baidu World Conference, Baidu officially released Wenxin 5.0. The model reportedly has 2.4 trillion parameters and adopts a native ...
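Taken together, the two published figures bound the compute per token: "over 2.4 trillion total parameters" with "activation ratio below 3%" implies at most roughly 72 billion parameters active per forward pass. A one-line check of that arithmetic (the 3% figure is the article's stated ceiling, not an exact rate):

```python
total_params = 2.4e12        # "over 2.4 trillion" total parameters (article)
activation_ceiling = 0.03    # article: activation parameter ratio below 3%

# Upper bound on parameters touched per token under those two figures.
active_upper_bound = total_params * activation_ceiling
print(f"<= {active_upper_bound / 1e9:.0f}B parameters active per token")  # <= 72B
```

This is why a sparsely activated 2.4T-parameter model can have inference costs closer to a dense model tens of times smaller.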