Native Multimodal Large Model
2.4-Trillion-Parameter Native Multimodal Large Model: Official Release of Wenxin 5.0 Goes Live
Sou Hu Cai Jing· 2026-01-22 18:45
Core Insights
- Baidu has officially launched the native multimodal large model Wenxin 5.0, featuring 2.4 trillion parameters and supporting various forms of input and output, including text, images, audio, and video [1]
- Wenxin 5.0 has surpassed models like Gemini-2.5-Pro and GPT-5-High in over 40 authoritative benchmark evaluations, establishing itself in the top tier globally [1]

Technical Advancements
- Unlike most industry models that use "post-fusion" multimodal approaches, Wenxin 5.0 employs a unified autoregressive architecture for native multimodal modeling, allowing for joint training of multiple data types [2]
- The model utilizes a large-scale mixture-of-experts structure with an activation parameter rate below 3%, enhancing inference efficiency while maintaining strong capabilities [2]
- Significant breakthroughs in multimodal understanding, coding, and creative writing have been achieved, exemplified by the model's ability to generate executable front-end code and emulate classical literary styles [2]

Industry Impact
- The Wenxin Mentor program has expanded to include 835 experts from various sectors, contributing to the model's improvement in logical rigor, professional depth, and creative quality [3]
- The launch of Wenxin 5.0 signifies the maturation and practicality of native multimodal technology, enhancing China's competitive edge in the global AI industry [3]
- As of January 15, Wenxin 5.0 ranked first on the domestic text leaderboard and eighth globally on LMArena, outperforming several mainstream models [3]
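The "activation parameter rate below 3%" claim refers to top-k expert routing: each token only passes through a few of the many expert sub-networks, so only a small fraction of the total weights are touched per step. The toy sketch below illustrates the generic technique; it is not Wenxin's actual routing code, and every name, dimension, and expert count here is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_moe_layer(x, expert_weights, router, k=2):
    """Route a token vector x through only its top-k experts.

    x:              (d,) token hidden state
    expert_weights: (n_experts, d, d) one weight matrix per expert
    router:         (n_experts, d) toy gating matrix
    Only k of n_experts weight matrices are used per token, which is
    what keeps the activated-parameter fraction small.
    """
    logits = router @ x
    top = np.argsort(logits)[-k:]                 # indices of the top-k experts
    gates = np.exp(logits[top])
    gates = gates / gates.sum()                   # softmax over selected experts
    out = sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))
    activated_fraction = k / len(expert_weights)
    return out, activated_fraction

d, n_experts = 16, 64
x = rng.standard_normal(d)
experts = rng.standard_normal((n_experts, d, d))
router = rng.standard_normal((n_experts, d))
y, frac = topk_moe_layer(x, experts, router, k=2)
print(f"activated expert fraction: {frac:.1%}")   # 2/64 experts → 3.1%
```

In a real MoE model the router is learned and shared parameters (embeddings, attention) are always active, so the true activated-parameter ratio depends on more than the expert count; this sketch only shows why sparse routing cuts per-token compute.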
2.4-Trillion-Parameter Native Multimodal Large Model: Official Release of Baidu Wenxin 5.0 Goes Live
Zheng Quan Shi Bao Wang· 2026-01-22 14:06
Core Insights
- Baidu has officially launched the native multimodal large model Wenxin 5.0, featuring 2.4 trillion parameters and supporting various input and output formats including text, images, audio, and video [1][2]
- The model employs a unified autoregressive architecture for native multimodal modeling, allowing for joint training of multiple data types, which enhances the integration and optimization of multimodal features [1]
- Wenxin 5.0 has achieved significant breakthroughs in multimodal understanding, coding, and creative writing capabilities, showcasing its advanced intelligence and tool utilization [1][2]

Technical Features
- The model utilizes a large-scale mixture-of-experts structure with ultra-sparse activation parameters, maintaining strong performance while improving inference efficiency [1]
- It incorporates end-to-end multi-round reinforcement learning training based on long-term task trajectory data, significantly enhancing the model's agent and tool-calling capabilities [1]

Market Position
- Wenxin 5.0's launch signifies the maturation and practicality of native multimodal technology, reflecting the independent innovation capabilities of Chinese model manufacturers in the global AI industry [2]
- As of January 15, Wenxin 5.0 ranked first in China and eighth globally on the LMArena text leaderboard, surpassing several mainstream models including GPT-5.1-High and Gemini-2.5-Pro [3]
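"Multi-round reinforcement learning on long-term task trajectories" presupposes an agent loop that records each tool call and its result as one trajectory. The minimal sketch below shows the generic shape of such a loop; it is not Baidu's agent framework, and the tool registry, `model_step` contract, and scripted stand-in model are all invented for illustration.

```python
# Toy tool registry; real agent stacks map model-emitted tool names
# to callables in a similar way. All names here are hypothetical.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(model_step, task, max_rounds=5):
    """Multi-round loop: per round the model either calls a tool or answers.

    model_step(task, history) -> dict, e.g.
      {"tool": "calculator", "args": "6*7"}  or  {"answer": "..."}
    RL over whole trajectories (as the article describes) would assign
    reward to the full `history` that loops like this one produce.
    """
    history = []
    for _ in range(max_rounds):
        action = model_step(task, history)
        if "answer" in action:
            return action["answer"], history
        result = TOOLS[action["tool"]](action["args"])
        history.append({"call": action, "result": result})
    return None, history

# Scripted stand-in for the model: call the tool once, then answer.
def fake_model(task, history):
    if not history:
        return {"tool": "calculator", "args": "6*7"}
    return {"answer": history[-1]["result"]}

answer, trace = run_agent(fake_model, "What is 6*7?")
print(answer)  # 42
```

The point of the sketch is the trajectory: `trace` captures every intermediate call/result pair, which is exactly the long-horizon data an end-to-end RL objective would score.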
Baidu Releases Wenxin 5.0: Over 2.4 Trillion Parameters, Available for Users to Try Directly
Feng Huang Wang· 2025-11-13 04:37
Built on the PaddlePaddle deep learning framework, Wenxin 5.0 adopts an ultra-sparse mixture-of-experts architecture for massive omni-modal training, with a total parameter count exceeding 2.4 trillion and an activation parameter ratio below 3%, effectively improving inference efficiency while preserving the model's strong capabilities.

Wenxin 5.0 Preview is now live in the Wenxin App for users to try directly; developers and enterprise users can also call the Wenxin 5.0 API service through Baidu's Qianfan large model platform.

Wenxin 5.0's core capabilities have been comprehensively upgraded, with standout performance in multimodal understanding, instruction following, creative writing, factuality, and agent planning and tool use, demonstrating strong comprehension, logic, memory, and persuasiveness. In comprehensive evaluations across more than 40 authoritative benchmarks, its language and multimodal understanding is on par with models such as Gemini-2.5-Pro and GPT-5-High, and its image and video generation matches specialized vertical models, reaching a globally leading level and validating the capability and potential of native multimodal large models.

Baidu CTO Wang Haifeng introduced Wenxin 5.0 as a new-generation native multimodal large model. Unlike most multimodal models in the industry, which adopt post-hoc fusion, Wenxin 5.0's technical approach is native multimodal modeling with a unified autoregressive architecture, integrating understanding and generation.

Feng Huang Wang Tech, November 13: At the 2025 Baidu World Conference, Baidu officially released Wenxin 5.0. The model reportedly has 2.4 trillion parameters and adopts a native ...
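Taken together, the two published figures bound the compute per token: "over 2.4 trillion total parameters" with "activation ratio below 3%" implies at most roughly 72 billion parameters active per forward pass. A one-line check of that arithmetic (the 3% figure is the article's stated ceiling, not an exact rate):

```python
total_params = 2.4e12        # "over 2.4 trillion" total parameters (article)
activation_ceiling = 0.03    # article: activation parameter ratio below 3%

# Upper bound on parameters touched per token under those two figures.
active_upper_bound = total_params * activation_ceiling
print(f"<= {active_upper_bound / 1e9:.0f}B parameters active per token")  # <= 72B
```

This is why a sparsely activated 2.4T-parameter model can have inference costs closer to a dense model tens of times smaller.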