Open-Source Large Models - filings, earnings calls, financial reports, news - Reportify

Open-Source Large Models

Starting from the Open-Sourcing of ERNIE: On the New Ecosystem for Large-Model Development

AI科技大本营· 2025-06-30 09:52

Core Viewpoint
- Baidu has officially announced the open-source release of the ERNIE 4.5 model series, a significant step in the development of domestic large models that strengthens its position in the AI ecosystem [1]

Group 1: Model Details
- The ERNIE 4.5 series includes MoE models with 47 billion and 3 billion active parameters, as well as a dense model with 0.3 billion parameters, with pre-training weights and inference code fully open-sourced [1]
- The new multi-modal heterogeneous model structure proposed by the ERNIE team allows cross-modal parameter sharing, enhancing multi-modal understanding while keeping dedicated parameter spaces for individual modalities [1]

Group 2: Industry Impact
- Baidu's open-source initiative positions it as a key player in the global AI development community, aiming to make the "Wenxin" (ERNIE) model a representative of domestic large models that developers can use effectively [1]
- The open-source release is seen as a response to the evolving AI landscape, in which companies are exploring ways to move AI from laboratory settings into practical, everyday applications [5]

Group 3: Expert Insights
- A panel discussion featuring industry experts will examine the implications of Baidu's open-source strategy, the future of large models, and the competitive landscape of AI technology [2][3][4]
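The idea of cross-modal parameter sharing alongside dedicated per-modality parameters can be illustrated with a toy routing sketch. This is not Baidu's implementation: the expert names, pool sizes, and the `route` function are all hypothetical, and the only point is that every token passes through shared experts while the router selects the rest from the token's own modality pool.

```python
import math
import random

# Hypothetical expert pools; names and sizes are illustrative only.
SHARED_EXPERTS = ["shared_0", "shared_1"]
MODALITY_EXPERTS = {
    "text": ["text_0", "text_1", "text_2"],
    "vision": ["vision_0", "vision_1", "vision_2"],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token_modality, router_scores, top_k=2):
    """Return the experts that process one token.

    Shared experts always run (cross-modal parameter sharing); the router
    then picks the top_k experts from the token's own modality pool only
    (dedicated parameter space per modality).
    """
    pool = MODALITY_EXPERTS[token_modality]
    weights = softmax([router_scores[name] for name in pool])
    ranked = sorted(zip(pool, weights), key=lambda pair: pair[1], reverse=True)
    chosen = [name for name, _ in ranked[:top_k]]
    return SHARED_EXPERTS + chosen


# Made-up router scores, seeded so the run is repeatable.
rng = random.Random(0)
scores = {
    name: rng.uniform(-1.0, 1.0)
    for pool in MODALITY_EXPERTS.values()
    for name in pool
}

for modality in ("text", "vision"):
    print(modality, route(modality, scores))
```

In this sketch a text token never reaches a vision expert and vice versa, yet both kinds of token update the same shared experts, which is the trade-off the summary describes.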

Open-source large models

Artificial Intelligence

ERNIE 4.5 series

Huawei's First Open-Source Large Model Is Here! Pro MoE with 72 Billion Parameters, Trained on 4,000 Ascend Chips

Hua Er Jie Jian Wen· 2025-06-30 07:27

Core Insights
- Huawei has announced the open-sourcing of its Pangu models, including the 7-billion-parameter dense model and the 72-billion-parameter mixture-of-experts (MoE) model, a significant step in the domestic open-source large-model competition [1][3][20]

Model Performance
- The Pangu Pro MoE model achieves a single-card inference throughput of 1,148 tokens/s on the Ascend 800I A2, rising to 1,528 tokens/s with speculative acceleration, and outperforms dense models of similar size [3][11]
- The Pangu Pro MoE model is built on the MoGE architecture, with 72 billion total parameters and 16 billion active parameters, and is optimized specifically for Ascend hardware [4][11]

Training and Evaluation
- Huawei used 4,000 Ascend NPUs to pre-train on a high-quality corpus of 13 trillion tokens, split into general, reasoning, and annealing phases that progressively enhance model capabilities [11]
- The Pangu Pro MoE model performs strongly across benchmarks, including a score of 91.2 on DROP, close to the best current models [12][14]

Competitive Landscape
- The open-sourcing of the Pangu models coincides with a wave of domestic AI model releases, with leading companies such as MiniMax and Alibaba also upgrading their open-source models, leading to price reductions of 60%-80% for large models [3][20]
- The Pangu Pro MoE model ranks fifth in the SuperCLUE Chinese large-model benchmark, ahead of several existing models, which indicates a competitive market position [17][18]

Technological Integration
- Huawei's ecosystem, which integrates chips (Ascend NPU), frameworks (MindSpore), and models (Pangu), represents a significant technological achievement and offers a viable high-performance alternative to NVIDIA's dominance in the industry [20]
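The reported figures can be put in proportion with a few lines of arithmetic. The sketch below takes the totals as 72 billion total and 16 billion active parameters (matching the "72B" naming of the released weights) and uses the throughput numbers exactly as reported; the variable names are illustrative, and nothing here is measured independently.

```python
# Figures as reported in the summary; variable names are illustrative.
total_params_b = 72    # total parameters, in billions
active_params_b = 16   # parameters activated per token, in billions
base_tps = 1148        # tokens/s per card on the Ascend 800I A2
spec_tps = 1528        # tokens/s per card with speculative acceleration

# Share of the model's parameters that participate in each token.
active_fraction = active_params_b / total_params_b

# Relative throughput gain attributed to speculative acceleration.
speedup = spec_tps / base_tps

print(f"active fraction: {active_fraction:.1%}")    # active fraction: 22.2%
print(f"speculative speedup: {speedup:.2f}x")       # speculative speedup: 1.33x
```

On these numbers, roughly a fifth of the parameters are active for any one token, and speculative acceleration accounts for about a one-third throughput gain.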

Open-source large models

Artificial Intelligence

Pangu large models

Just Now: Huawei's Announcement!

中国基金报· 2025-06-30 04:05

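[Lead] Huawei open-sources its Pangu large models for the first time, including 7-billion- and 72-billion-parameter models. By China Fund News reporter Zhang Yanbei.

On June 30, Huawei announced that it is open-sourcing the Pangu 7-billion-parameter dense model, the Pangu Pro MoE 72-billion-parameter mixture-of-experts model, and its Ascend-based model inference technology. Huawei said the move is another key step in carrying out its Ascend ecosystem strategy, intended to promote research and innovation in large-model technology and to accelerate the application of AI and value creation across industries.

According to Huawei's official website, this is the first time Huawei has open-sourced the core capabilities of the Pangu large models. The release mainly includes: the Pangu Pro MoE 72B model weights and basic inference code, now live on the open-source platform; Ascend-based inference code for ultra-large-scale MoE models, now live on the open-source platform; and the Pangu 7B model weights and inference code, which will go live on the open-source platform soon. Huawei said: "We sincerely invite developers, enterprise partners, and researchers worldwide to download and use them, share feedback, and help improve them." (Source: open-source developer platform GitCode)

Pangu is a family of ultra-large-scale pre-trained AI models from Huawei, covering natural language processing, computer vision, scientific computing, and other fields. Its name alludes to the myth of Pangu separating heaven and earth, symbolizing Huawei's groundbreaking exploration in basic AI research and industry applications. Since their release, the Pangu models have been deployed in a number of industries, including ...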

Open-source large models

Pangu large models

Why Is Huawei Open-Sourcing the Pangu Large Models?

Tai Mei Ti APP· 2025-06-30 03:23

Core Insights
- Huawei officially announced the open-sourcing of the Pangu 7-billion-parameter dense model and the Pangu Pro MoE 72-billion-parameter mixture-of-experts model, a significant step in its Ascend ecosystem strategy aimed at advancing AI technology and its applications across industries [2][3].

Group 1: Open-Sourcing Details
- The Pangu Pro MoE 72B model weights and basic inference code are now available on the open-source platform, with the Pangu 7B model weights and inference code expected to be released soon [2].
- This is the first time Huawei has open-sourced the Pangu large models, and it emphasizes the idea of trading openness for ecosystem growth to foster technological development [2][3].

Group 2: Strategic Implications
- Huawei's decision to open-source only two widely used models reflects a cautious approach, focusing on models that are moderately sized and balanced in performance, suitable for applications such as intelligent customer service and knowledge bases [2][3].
- The Pangu Pro MoE model, with its sparse activation and dynamic routing, is better suited to more complex tasks, indicating a deliberate choice in model selection [2].

Group 3: Ecosystem Development
- Open-sourcing the Ascend-based model inference technology is crucial for improving the adaptability of domestic AI infrastructure, which developers need in order to use the Pangu models effectively [3][4].
- Huawei aims to create a closed loop from models to hardware to application scenarios, strengthening its full-stack AI capabilities and securing a competitive edge in the market [4].

Group 4: Market Positioning
- Huawei also announced the launch of a new generation of Ascend AI cloud services based on the CloudMatrix 384 super-node architecture, further solidifying its position in the AI computing market [3][4].
- The integration of Pangu models with Ascend chips is designed to embed Huawei's hardware deeply into the AI industry chain, similar to how NVIDIA's CUDA ecosystem supports large models [4].

Open-source large models

Software and IT Services

Pangu large models

Baidu Officially Open-Sources the ERNIE 4.5 Model Series

第一财经· 2025-06-30 03:12

Core Viewpoint
- Baidu has officially open-sourced the Wenxin (ERNIE) 4.5 large model series, which includes mixture-of-experts (MoE) models with 47 billion and 3 billion active parameters, as well as a dense model with 0.3 billion parameters, for a total of 10 models [1]

Summary by Sections
- The Wenxin 4.5 series is now available for download and deployment on platforms such as the PaddlePaddle Star River Community and Hugging Face [1]
- The open-sourced models come with complete pre-training weights and inference code, making them more accessible to developers and researchers [1]
- API services for the open-sourced models are available on Baidu AI Cloud's Qianfan large model platform [1]

Open-source large models

ERNIE 4.5 model series

Tencent Makes a Big Move!

中国基金报· 2025-06-27 15:00

Core Viewpoint
- Tencent's Hunyuan-A13B is the first open-source MoE model at the 13B active-parameter level, offering significant performance improvements and cost advantages for developers in the AI industry [4][6].

Group 1: Model Features and Performance
- Hunyuan-A13B has 80 billion total parameters, of which 13 billion are active, and outperforms other leading open-source models in inference speed and cost-effectiveness [4][5].
- The model supports flexible thinking modes, allowing either quick, efficient outputs or deeper, more comprehensive reasoning [5].
- It is friendly to individual developers, requiring only a single mid-range GPU for deployment, and integrates with mainstream open-source inference frameworks [5][10].

Group 2: Industry Trends and Open Source Movement
- The open-source trend in AI is accelerating, with major tech companies such as OpenAI, Google, and Alibaba releasing over 10 open-source models since March 2023 [8][9].
- The performance of open-source models continues to improve, with platforms like Hugging Face frequently updating their model rankings [8].
- Companies are increasingly adopting open-source AI technologies, with over 50% of enterprises reportedly using these solutions for data, models, and tools [9][10].

Group 3: Future Developments
- Tencent plans to release more models of varying sizes and features, contributing to the growth of the open-source ecosystem [6][10].
- Future releases will include a range of hybrid reasoning models from 0.5B to 32B parameters, as well as multi-modal foundation models for images, video, and 3D [10].

TENCENT(HK:00700)

Open-source large models

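Zhang Jun, Tencent's public relations director: Tencent's Hunyuan large model will continue to be open-sourced, and models in several sizes will join the open-source family next.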

news flash· 2025-05-21 02:25

TENCENT(HK:00700)

Open-source large models

Tencent Hunyuan large model

After DeepSeek and Fei-Fei Li, Has NVIDIA Also Set Its Sights on Alibaba's Qwen?

Xin Lang Ke Ji· 2025-05-13 07:01

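Who in the global open-source large-model ecosystem is the one everybody keeps coming back to? Alibaba, without question. Just last week, following DeepSeek and "AI godmother" Fei-Fei Li, NVIDIA also took a liking to Alibaba. Besides swiftly announcing integration and adaptation on the very day the latest "hybrid reasoning model" Qwen3 was open-sourced, NVIDIA on May 9 also open-sourced its new code reasoning model Open Code Reasoning (hereafter OCR) in three sizes, 7B, 14B, and 32B, all of which use Tongyi Qianwen (Qwen) as the base model. NVIDIA's OCR-Qwen-32B-Instruct model, which surpassed OpenAI's o3-mini and o1 models on the LiveCodeBench evaluation, was fine-tuned from Qwen2.5-32B. With Tongyi Qianwen already at version 3.0 and model performance reaching new highs, NVIDIA still built a model that rivals the world's best on the previous-generation Qwen, which raises the question: how much hidden potential does Qwen still hold for others to unlock?

After DeepSeek and Fei-Fei Li, NVIDIA also picks "Tongyi Qianwen"

The code and datasets for NVIDIA's open-source OCR model series have been shared publicly on Hugging Face, the world's largest open-source AI community, where developers can browse and study them for free. Among them, NVIDIA's OCR-Qwen-32B-Instruct on LiveCodeBench ...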

Open-source large models

NVIDIA OCR model series

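Interview with Tsinghua's Sun Maosong: China's "Strong Voice" Pushes Large-Model Open-Sourcing, and Global Large-Model Culture Is Shifting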

Huan Qiu Wang Zi Xun· 2025-04-30 08:51

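China News Service, Beijing, April 30 (reporter Xia Bin). Sun Maosong, executive vice dean of Tsinghua University's Institute for Artificial Intelligence and a foreign member of Academia Europaea, said in a recent exclusive interview with China News Service in Beijing that the open-source wave set off by Chinese technology companies in the large-model field has sent a Chinese "strong voice" to the world. While their technology has won international recognition, it has also been quietly shifting global large-model culture. (Source: China News Service)

According to the latest news, in the early hours of April 29 the new-generation Tongyi Qianwen model Qwen3 was open-sourced, covering eight Qwen3 models of different sizes. Alibaba's Tongyi has reportedly open-sourced more than 200 models, with over 300 million downloads worldwide and more than 100,000 derivative models, surpassing the US-based Llama to become the world's number-one open-source model family. Chinese open-source models, represented by DeepSeek and Qwen, have fully open-sourced the parameter weights, inference logic, and toolchains of advanced models, and are opening a new chapter in the commercial use of AI.

"Although DeepSeek is on the whole a 'from 1 to 2' innovation, in reinforcement learning from AI feedback it has gone the furthest among open-source large models, turning human feedback into AI feedback," Sun said of DeepSeek. Sun particularly stressed the value of small models. From an application standpoint, small models lower costs and broaden adoption; from a research standpoint, they help universities and research institutes cope with the research challenges posed by resource constraints. Both are very much needed. In his view, the more large models ...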

Open-source large models

Artificial Intelligence

Tongyi Qianwen model Qwen3

(Economic Watch) What Is the Impact of China's Wave of Open-Source Large Models?

Zhong Guo Xin Wen Wang· 2025-03-25 16:39

Core Insights
- The open-sourcing of large models in China is rapidly gaining momentum, with major companies such as Alibaba Cloud and DeepSeek leading the way by releasing multiple models within a short period [1][2][3]

Group 1: Market Dynamics
- Demand for personal AI deployment is rising, which is accelerating the development of edge intelligence [2]
- AI deployment needs are growing significantly across industries, and open-source models provide the flexibility and customization required for differentiated business scenarios and data privacy [2][3]
- As of March 25, global downloads of Alibaba's Qwen series of open-source models had exceeded 200 million, indicating widespread adoption in sectors such as healthcare, education, finance, and transportation [2]

Group 2: Industry Ecosystem
- The AI industry is entering a phase of accelerated ecosystem development, with clearer upstream and downstream collaboration: leading companies focus on model capabilities, while smaller firms build niche applications on open-source models [2][3]
- The number of derivative models based on Alibaba's open-source models has surpassed 100,000, making it the largest open-source model family globally [3]

Group 3: Future Outlook
- Experts suggest that open-source models will become a powerful engine for AI development in China, and recommend that government, enterprises, and other actors embrace open source more proactively [4][5]
- The open-source approach not only fosters competition among tech companies but also accelerates AI adoption and innovation by reducing costs and opening the door to product innovation [4][5]

Open-source large models

DeepSeek-V3 model

Step-Video-TI2V
