Huawei Pangu Large Models

Guotai Haitong | Industry: Huawei's Pangu Models and the Ascend AI Computing Platform Jointly Build an Integrated Software-Hardware AI Technology Stack
Guotai Haitong Securities Research · 2025-08-07 14:15
Huawei is exploring a path to full-stack AI competitiveness through software-hardware co-design spanning from large-model architecture down to infrastructure. Its AI strategy has gradually shifted from chasing and benchmarking against industry SOTA models toward tailoring model architectures to better exploit the potential of its in-house Ascend hardware. This bidirectional co-evolution aims to solve the systemic problems of deploying AI models at scale and to build a full-stack technology system composed of co-designed hardware-software architecture, operators, and the software stack. At its core, the evolution of the Pangu models is about solving efficiency problems in large-scale distributed systems. As large language models shift wholesale from dense architectures to sparse Mixture-of-Experts (MoE) architectures, the industry broadly faces the systemic bottleneck of expert load imbalance, which constrains the practical performance of MoE models in both training and inference. Huawei has made this systemic problem the central focus of its hardware-software architecture innovation, signaling that its attention has expanded beyond purely hardware or purely algorithmic questions to solving AI systems-engineering problems more efficiently on its own hardware. At the model level, Huawei is pursuing two innovation paths in parallel. On one hand, Pangu Pro MoE breaks through at the architectural level, proposing the Mixture of Grouped Experts (MoGE) architecture, which aims to solve load imbalance through structural design. On the other hand, Pangu Ultra MoE takes a systems-level approach, using simulation-first design to fit the model architecture to Ascend hardware, together with co-optimization spanning training and inference ...
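The load-balancing idea behind MoGE can be sketched in a few lines: instead of a global top-k over all experts, the experts are partitioned into groups (one group per device) and the router picks a fixed number of experts inside each group, so every device receives the same amount of work per token. The sketch below is illustrative only; the function name, shapes, and routing details are our own assumptions, not Huawei's implementation.

```python
import numpy as np

def moge_route(scores, n_groups, k_per_group):
    """Grouped top-k routing sketch: select k experts inside each group,
    so every group (i.e. every device) activates the same number of
    experts per token, structurally preventing load imbalance."""
    group_size = scores.shape[-1] // n_groups
    chosen = []
    for g in range(n_groups):
        block = scores[g * group_size:(g + 1) * group_size]
        top = np.argsort(block)[-k_per_group:]            # local top-k
        chosen.extend(int(g * group_size + i) for i in top)  # global ids
    return sorted(chosen)
```

Because each group contributes exactly `k_per_group` active experts, no single device can be oversubscribed, which is the structural fix for the expert load imbalance described above.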
The Large-Model "Shell" Controversy: Where Is the Line Between Original Research and Building on Others' Work?
Sou Hu Cai Jing · 2025-07-17 01:39
Core Viewpoint
- The debate over "original research" versus "shell models" in the AI field has intensified, particularly focusing on the similarities between Huawei's Pangu model and Alibaba Cloud's Qwen model [1][2]

Group 1: Development and Trends in AI Models
- The rise of large models can be traced back to the Transformer architecture released by Google Brain in 2017, with three main types dominating the field: Decoder-only (like GPT), Encoder-Decoder (like T5), and Encoder-only (like BERT) [2]
- The launch of ChatGPT in November 2022, based on GPT-3.5, attracted millions of users, marking the entry of large language models (LLMs) into public awareness and prompting many companies to enter the market [2]
- The open-source era that began in 2023 has led to an increase in teams using open-source frameworks for model training, facilitating technological exchange and iteration [1][4]

Group 2: Shell Model Controversies
- Early shell-model behavior often involved simple API wrapping without any secondary development, but regulatory scrutiny has increased, leading to penalties for such practices [3]
- Despite regulatory actions, shell models continue to emerge, with some models criticized for "GPT-like" responses, raising questions about their originality [3][4]
- The concept of "data distillation," where a strong "teacher model" generates high-quality data for training a "student model," has gained attention, especially after ByteDance was reported to have used OpenAI's API for data generation [4]

Group 3: Open Source and Compliance Issues
- The open-source movement has led to debates about whether using open-source model architectures for secondary development constitutes shell modeling, with varied opinions on compliance and ethical boundaries [4][8]
- A notable incident involved the Yi-34B model, which sparked discussions about compliance with the LLaMA open-source license, highlighting the complexity of distinguishing shell models from original research [5][7]
- The lower development barriers of the open-source era have produced both positive advances and negative shell behaviors, prompting ongoing discussion of the moral and legal implications of such practices [8][9]

Group 4: Industry Perspectives
- Major companies may lack foundational training logic and experience in model development, leading them to leverage open-source technologies for quicker advancements [9]
- The AI industry recognizes that while using open-source technology is acceptable, it is crucial to provide clear documentation and avoid misrepresenting such efforts as original research [9]
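The "data distillation" practice described above is simple to state in code: a stronger teacher model answers a pool of prompts, and the resulting (prompt, response) pairs become supervised training data for a smaller student. A minimal sketch, with `teacher` as a stand-in callable rather than any real model API:

```python
def distill_dataset(prompts, teacher):
    """Build a supervised fine-tuning set by having a 'teacher' answer
    each prompt. `teacher` is any callable prompt -> response; in
    practice it would wrap an API call to a stronger model."""
    return [{"prompt": p, "response": teacher(p)} for p in prompts]
```

The controversy is never about this mechanism itself, which is widely used, but about whether a given teacher's terms of service permit training competing models on its outputs.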
Pangu vs. Tongyi Qianwen: Who Copied Whom?
阿尔法工场研究院 · 2025-07-08 12:22
Core Viewpoint
- The controversy surrounding Huawei's Pangu 3.5 and Alibaba's Tongyi Qianwen 1.5-7B models centers on the high correlation score of 0.927 derived from the "LLM-Fingerprint" technique, suggesting potential similarity or derivation between the two models [1][14][16]

Group 1: Technical Analysis
- The "LLM-Fingerprint" technique analyzes model responses to specific trigger words, generating a unique identity for each large model [12][11]
- A report indicated that the correlation score of 0.927 between Huawei's Pangu 3.5 and Alibaba's Tongyi Qianwen 1.5-7B is significantly higher than the scores between other mainstream models, which are generally below 0.1 [14][15]
- Huawei's defense against the allegations was deemed unscientific by external observers, who pointed out that similarly high correlations can also be found among different versions of the Tongyi Qianwen models themselves [19][20]

Group 2: Open Source Culture and Ethics
- The debate highlights the tension between "reuse" and "plagiarism" within the AI open-source ecosystem, raising questions about the ethical implications of model development [22][21]
- The high cost of developing large models, estimated at $12 million for effective training, makes it common practice to build on existing open-source models [25][26]
- The distinction between "reuse" and "plagiarism" remains ambiguous, particularly regarding model parameters and adherence to open-source licenses [28][29]

Group 3: Competitive Landscape
- The incident reflects the intense competition between Huawei and Alibaba in the Chinese AI market, with Alibaba currently serving 90,000 enterprises through its Tongyi series models [37][42]
- Huawei's Pangu model is crucial to its strategy of establishing a comprehensive AI ecosystem, while Alibaba has leveraged its cloud infrastructure and open-source ecosystem to gain a competitive edge [32][36]
- The silence from Alibaba's Tongyi Qianwen team amid the controversy suggests a strategic decision to avoid escalating the situation into a public dispute [40][47]

Group 4: Industry Implications
- The controversy serves as a "stress test" for the current AI open-source ecosystem, exposing its vulnerabilities and the lag in governance [52]
- The industry is urged to establish clearer rules on model citation and derivation standards, akin to plagiarism-detection systems in academia [53]
- There is a call for greater transparency in model development processes, including the promotion of "Model Cards" and data transparency [54]
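The exact "LLM-Fingerprint" method is not reproduced here, but the reported 0.927 figure is described as a correlation between the two models' attention-parameter distributions. A simplified, hypothetical version of that kind of comparison might correlate a summary statistic of corresponding attention weight matrices; all names below are our own illustration, not the published method:

```python
import numpy as np

def fingerprint_similarity(params_a, params_b):
    """Toy parameter-distribution fingerprint: take the standard
    deviation of each corresponding attention weight matrix and
    Pearson-correlate the two resulting vectors. A stand-in for the
    reported 'LLM-Fingerprint' comparison, not the real method."""
    sig_a = np.array([w.std() for w in params_a])
    sig_b = np.array([w.std() for w in params_b])
    return float(np.corrcoef(sig_a, sig_b)[0, 1])
```

Identical parameter sets score 1.0 by construction; the report's claim is that unrelated model pairs typically score below 0.1 on its metric, making 0.927 the anomaly.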
[Industrial Internet Weekly] Huawei's Pangu model accused of plagiarism; the AI talent war intensifies as DeepSeek recruits heavily overseas; Microsoft reportedly folds "AI usage" into employee reviews, tying it directly to performance; ...
Tai Mei Ti APP · 2025-07-08 03:37
Group 1
- Huawei's Pangu team announced the open-source release of the Pangu 7B dense model and the 72B mixture-of-experts model, but faced allegations of plagiarism from Alibaba's Tongyi Qwen-2.5 14B model, with a high similarity score of 0.927 in attention-parameter distribution [2][3]
- Huawei's Noah's Ark Lab responded that the Pangu Pro MoE model was developed and trained on its Ascend hardware platform and not based on other vendors' models [2]
- An article published on GitHub by a self-identified member of Huawei's Pangu team claimed that the team fabricated technological breakthroughs and used competitor models for training [3]

Group 2
- Tencent responded to user complaints about the new "AI search" feature in WeChat, clarifying that it integrates public information without using private user data [4][5]
- Baidu announced its largest search-business overhaul in a decade, allowing search queries of over 1,000 characters and integrating AI writing and image-generation capabilities [6]

Group 3
- The 2025 Global Digital Economy Conference revealed a list of the top 100 talents in the AI field, with significant representation of Chinese individuals [7]
- DeepSeek is reportedly ramping up overseas recruitment, aiming to attract talent for positions focused on artificial general intelligence (AGI) [9]

Group 4
- ByteDance has produced over 1,000 robots in two and a half years, with a long-term goal of achieving embodied intelligence [10]
- Zhipu AI released and open-sourced the GLM-4.1V-Thinking series, a multimodal model with 9 billion parameters, demonstrating superior performance in various benchmark tests [10]

Group 5
- Yonyou Network Technology submitted an H-share listing application to the Hong Kong Stock Exchange, marking a significant step in its internationalization strategy [14]
- Wisdom Eye was included in KPMG's inaugural "China Health Technology Top 50" list for its innovative applications of AI in healthcare [14]

Group 6
- Baidu officially open-sourced the Wenxin large model 4.5 series, which includes various models with different parameter configurations [15]
- DingTalk launched over 100 free templates for the e-commerce industry, integrating AI functionality for various business needs [16]

Group 7
- Siemens and other EDA companies confirmed the lifting of U.S. export restrictions on chip-design software to China, allowing renewed access to their technologies [17][18]
- Trump announced new tariffs set to take effect on August 1, with rates potentially reaching up to 70% [19]

Group 8
- Microsoft is set to lay off nearly 9,000 employees as part of a restructuring plan aimed at streamlining processes and reducing management layers [20]
- Elon Musk's xAI completed a $10 billion funding round to further develop its AI solutions and data centers [20]

Group 9
- Google announced the global availability of its latest AI video-generation model, Veo 3, which significantly enhances video-production capabilities [21]
- CoreWeave became the first AI cloud-service provider to deploy NVIDIA's GB300 NVL72 system, boasting high AI performance [22]

Group 10
- Cursor apologized for a pricing-communication issue regarding its Pro plan and offered refunds to affected users [23]
- Cursor's developer Anysphere hired two former executives from Anthropic to strengthen its leadership team [25]

Group 11
- Microsoft is incorporating AI usage into employee performance evaluations, reflecting its commitment to integrating AI tools into daily operations [26]
- Apple is considering using AI technology from Anthropic or OpenAI for its Siri assistant, indicating a potential shift in its AI strategy [27]

Group 12
- Meta established a new department called the "Meta Superintelligence Lab," recruiting several prominent figures from the AI industry [28]
- Multiple European companies urged the EU to pause the implementation of the upcoming AI Act, citing concerns over its impact on innovation [29]

Group 13
- Figma submitted its IPO application, aiming to list on the NYSE, following a previous failed acquisition attempt by Adobe [31]
- Remark completed a $16 million Series A funding round to expand its online retail-guidance services [32]

Group 14
- Zhiyu Technology went public in Hong Kong, raising approximately 320 million HKD for research and international market expansion [37]
- Domestic GPU company Sunrise raised nearly 1 billion RMB in funding to support its high-performance GPU development [38]
Huawei responds to Pangu plagiarism allegations; DeepSeek recruits overseas; Musk announces the founding of the "America Party" and plans to contest next year's elections | AI Weekly
AI前线 · 2025-07-06 04:03
Core Viewpoint
- The article surveys developments across the AI industry, including the controversy surrounding Huawei's Pangu model, recruitment efforts by DeepSeek, and significant personnel changes at major tech companies such as ByteDance and Microsoft.

Group 1: Huawei and AI Models
- Huawei's Pangu team responded to allegations of plagiarism regarding its open-source models, stating that its MoE model is its own development and not based on other companies' models [1][2]
- The Pangu models span various parameter specifications, such as the Pangu E series for mobile applications and the Pangu S series for super-large models, aimed at advancing AI applications across different sectors [5]

Group 2: Recruitment and Personnel Changes
- DeepSeek has recently begun recruiting overseas talent, indicating a strategic move to attract skilled professionals in the AI field [6][7]
- ByteDance's AI product lead, Wang Xuan, has left the company to pursue a new venture in AI hardware, with backing from a prominent investment firm [8]
- The core product lead of the AI programming project "Xinyan Yima" has secured new funding, doubling the company's valuation to several hundred million USD [9]

Group 3: Microsoft and AI Integration
- Microsoft announced a second round of layoffs affecting approximately 9,000 positions, with a focus on cost control and streamlined operations [11][12]
- The company is integrating AI usage into employee performance evaluations, emphasizing the importance of AI tools in daily work [12][13]

Group 4: Other Industry Developments
- Apple is considering using AI technology from Anthropic or OpenAI for Siri, potentially sidelining its internal models [13]
- The U.S. has lifted export restrictions on EDA software to China, allowing major chip-software companies to resume supply [16]
- AMD's CEO has received a significant salary increase and stock options, reflecting the company's strong market position [17]
- ByteDance has reportedly produced over 1,000 robots, focusing on logistics applications and aiming for advances in embodied intelligence [18][19]
Why DeepSeek Is Cheap to Serve at Scale but Expensive to Run Locally
AI前线 · 2025-07-04 06:10
Core Insights
- The article examines the trade-off between throughput and latency in AI inference serving, focusing on models like DeepSeek-V3 that are fast and cheap at scale but slow and expensive to run locally [1][12]
- It highlights the importance of batch processing for GPU efficiency: larger batch sizes yield higher throughput but add latency while requests wait for the batch to fill [2][12]

Batch Processing and GPU Efficiency
- Batch processing allows many tokens to be processed simultaneously, exploiting the GPU's ability to perform large matrix multiplications efficiently [3][4]
- GPU efficiency is maximized when a large matrix multiplication runs as a single command, reducing overhead and memory-access time compared to many smaller operations [4][12]
- Inference servers use a "collect window" to queue user requests, balancing the need for low latency (5-10 milliseconds) against the higher throughput of larger batches [5][12]

Mixture-of-Experts Models and Pipeline Efficiency
- Mixture-of-experts models like DeepSeek-V3 require larger batch sizes to maintain GPU efficiency, because their many independent weight blocks yield low throughput if not properly batched [6][12]
- Large models with many layers must avoid "pipeline bubbles" by keeping the batch size above the number of pipeline stages; otherwise inefficiency and latency grow [8][12]
- Keeping the queue full is challenging because tokens must be generated sequentially, which prevents batching multiple requests from the same user [9][10]

Implications for Inference Providers
- Inference providers must choose batch sizes that optimize throughput while managing latency, since larger batches can mean significant delays for users waiting for their tokens [12]
- The performance of models from companies like OpenAI and Anthropic suggests they may use more efficient architectures or advanced inference techniques to achieve faster response times than models like DeepSeek [12]
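The throughput/latency trade-off described above can be made concrete with a toy model of the collect window: a batch costs roughly the same GPU time regardless of size (one large matmul), so widening the window raises throughput while every request waits longer on average. The cost model and numbers below are illustrative assumptions, not measurements of any real server:

```python
def batch_tradeoff(window_ms, arrival_per_ms, step_cost_ms=5.0):
    """Toy collect-window model. Requests arrive at arrival_per_ms and
    queue for up to window_ms; the batch then runs in ~step_cost_ms
    regardless of its size (one big matmul). Bigger windows -> bigger
    batches -> more tokens per unit time, but longer average waits."""
    batch = arrival_per_ms * window_ms          # expected batch size
    throughput = batch / (window_ms + step_cost_ms)  # tokens per ms
    avg_latency = window_ms / 2 + step_cost_ms       # ms per request
    return batch, throughput, avg_latency
```

In this toy model, widening the window from 10 ms to 50 ms at 4 requests/ms raises throughput from about 2.7 to about 3.6 tokens/ms but triples average latency, which is exactly why batch-friendly MoE models are cheap at scale and painful locally, where the batch size is 1.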
A National First: Shenzhen Longgang's Smart-Education AI Platform Leads the Way in Integrating Huawei's Pangu Model
Nan Fang Du Shi Bao · 2025-07-02 08:58
Core Insights
- Huawei has announced the open-source release of its Pangu model with 7 billion parameters and the Pangu Pro MoE model with 72 billion parameters, alongside Ascend-based model inference technology [1]
- The Longgang District Education Bureau in Shenzhen has become the first government department in China to deploy the open-source Pangu model, marking a significant step in its "AI + Education" strategy [1][5]

Strategic Foundation
- Longgang recognizes the need for robust, secure, and innovative AI technology to move education from "digitalization" to "intelligentization," making Huawei's Pangu model a natural choice [3]
- The collaboration with Huawei allows Longgang to participate actively in developing a model tailored to its educational needs, integrating local educational resources and methodologies [3]

Industry Collaboration
- The Longgang Education Bureau is linking local AI technology resources to ensure effective implementation of the Pangu model, with several tech companies involved in developing training and optimization solutions [4]
- Longgang's existing smart-education AI platform supports various intelligent applications, laying a foundation for AI-enabled education [4]

Scenario Empowerment
- The introduction of the Pangu model aims to reshape the educational ecosystem, benefiting both teachers and students [6]
- Teachers will receive support for lesson preparation, grading, and personalized assignments, freeing them to focus on creative teaching [6]
- Students will have access to a "one-on-one intelligent companion" offering 24/7 assistance, personalized learning paths, and comprehensive growth tracking [6]

Future Vision
- Longgang's adoption of the Pangu model is a key move in its "All in AI" city strategy, showcasing a model of technological innovation applied to education [8]
- The Longgang Education Bureau aims to explore further innovative AI applications in areas such as moral education and family-school collaboration, focusing on developing future learners with skills in intelligent collaboration and complex judgment [8]
Huawei Cloud's CloudRobo Debuts: Empowering Embodied Intelligence as a Platform Service, Not a Robot Maker
Sou Hu Cai Jing · 2025-06-23 22:54
Core Insights
- 2025 is being called the "Year of Embodied Intelligence," marked by the launch of Huawei's CloudRobo platform at the HDC 2025 event [1]
- Huawei Cloud focuses on platform development while leaving the physical manufacturing of robots to partners, aiming to evolve every connected entity into an intelligent robot [1]

Group 1: CloudRobo Platform Features
- The CloudRobo platform integrates Huawei's Pangu large model for multimodal processing and cognitive capabilities, creating a complete workflow from data synthesis to cloud-edge collaboration and security supervision [1]
- It encompasses three core models: the embodied multimodal generation model, the embodied planning model, and the embodied execution model, significantly accelerating the development of embodied intelligence [1]

Group 2: Embodied Models
- The embodied multimodal generation model acts as a bridge between the digital and physical worlds, providing diverse training samples for intelligent robots and improving the realism and efficiency of data synthesis [1]
- The embodied planning model, referred to as the "embodied brain," gives robots spatial awareness and complex reasoning, capable of planning tasks with more than ten steps [2]
- The embodied execution model, known as the "small brain," focuses on precise control and generalization, achieving millimeter-level control precision in industrial applications [4]

Group 3: Industrial Applications
- The CloudRobo platform has expanded into various industrial sectors, including precision fiber-optic operations with a success rate exceeding 90% [4]
- In spraying, it helps Efort robotic arms adapt quickly to new tasks; in automotive manufacturing, Leju robots use the platform for efficient logistics and material handling [4]
- In semiconductor manufacturing, Youai Zhihuo logistics robots leverage CloudRobo for real-time production scheduling and flexible task planning [4]

Group 4: R2C Protocol
- Huawei Cloud introduced the R2C (Robot to Cloud) protocol to establish an open, efficient, and secure connection standard between robots and the cloud [4]
- The company encourages industry partners and organizations to help promote the R2C protocol, aiming to connect more robotic entities to the intelligent platform and advance embodied-intelligence technology [4]
Huawei Cloud's CloudRobo Platform: Empowering Embodied Intelligence, Building the Future Without Building Robot Bodies
Sou Hu Cai Jing · 2025-06-23 22:42
Core Insights
- 2025 is anticipated as the "Year of Embodied Intelligence," with significant technological innovations unveiled at the Huawei Developer Conference 2025 [1]
- Huawei Cloud introduced the CloudRobo platform, marking a strategic move into embodied intelligence that focuses on collaboration with partners rather than manufacturing robots [1]

Group 1: CloudRobo Platform Features
- The CloudRobo platform integrates end-to-end capabilities from data synthesis to cloud-edge collaborative deployment, leveraging Huawei's Pangu large model for multimodal and cognitive abilities [1]
- It consists of three core models: the embodied multimodal generation model, the embodied planning model, and the embodied execution model, which together accelerate innovation in embodied intelligence [1][2]

Group 2: Model Functions
- The embodied multimodal generation model creates a highly realistic digital space that generates vast numbers of data samples, improving training efficiency for intelligent robots with minimal data collection [2]
- The embodied planning model, referred to as the "embodied brain," enables complex multi-step task planning through spatial perception and an understanding of environmental interaction [2]
- The embodied execution model functions as the robot's "small brain," providing high-precision motion control for various industrial tasks with millimeter-level accuracy [2][3]

Group 3: Industrial Applications
- The CloudRobo platform has been applied successfully in industrial scenarios such as a fully robotic flexible assembly system for optical products, achieving over 90% success in delicate operations [3]
- In industrial spraying, CloudRobo has enabled Efort's spraying robotic arms to adapt quickly to new tasks, while in automotive manufacturing it has assisted with logistics and material handling [4]
- In semiconductor manufacturing, CloudRobo works with Youai Intelligent to synchronize production scheduling and dynamically update task planning [4]

Group 4: Connectivity Standards
- Huawei Cloud has proposed the R2C (Robot to Cloud) connectivity protocol to address the challenges of diverse robot types, complex sensors, and varied interface protocols, aiming to establish an open, efficient, and secure connection standard [4]
Nine Top Researchers Over Three Evenings: Unpacking the Foundational Research Behind Huawei's Pangu Models
机器之心 · 2025-05-26 10:59
Core Viewpoint
- The rapid development of large language models (LLMs) has made them a cornerstone of general AI systems, but growing model capability has driven a sharp rise in computational and storage demands, making it challenging to achieve both high performance and efficiency [1][2]

Group 1: Technological Advancements
- Huawei's Noah's Ark Lab has developed Pangu Ultra, a general language model with over 100 billion parameters that surpasses models such as Llama 405B and Mistral Large 2 in various evaluations [2]
- The lab also introduced the sparse language model Pangu Ultra MoE, achieving long-term stable training on over 6,000 Ascend NPUs [2]

Group 2: Key Research Presentations
- A series of sharing sessions from May 28 to May 30 will cover breakthroughs in quantization, pruning, MoE architecture optimization, and KV optimization, aimed at developers and researchers interested in large models [3][4]

Group 3: Specific Research Contributions
- **CBQ**: A post-training quantization framework that addresses the high computational and storage costs of LLMs, achieving significant performance improvements in ultra-low-bit quantization [6]
- **SlimLLM**: A structured pruning method that effectively reduces the computational load of LLMs while maintaining accuracy, demonstrating strong performance on LLaMA benchmarks [8]
- **KnowTrace**: An iterative retrieval-augmented generation framework that enhances multi-step reasoning by tracking knowledge triplets, outperforming existing methods on multi-hop question answering [10]

Group 4: Further Innovations
- **Pangu Embedded**: A flexible language model that alternates between fast and deep thinking, designed to optimize inference efficiency while maintaining high accuracy [14]
- **Pangu-Light**: A pruning framework that stabilizes and recovers performance after aggressive structural pruning, achieving significant model compression and inference acceleration [16]
- **ESA**: An efficient selective-attention method that reduces computational overhead during inference by exploiting the sparsity of attention matrices [18]

Group 5: MoE Model Developments
- **Pangu Pro MoE**: A native MoE model with 72 billion parameters, designed to balance load across devices and enhance inference efficiency through various optimization techniques [21]
- **PreMoe**: An expert-routing optimization for MoE models that dynamically loads experts based on task-specific requirements, improving inference efficiency by over 10% while maintaining model capability [24]

Group 6: KV Optimization Techniques
- **KVTuner**: A hardware-friendly algorithm for KV-cache compression that achieves near-lossless quantization without retraining, significantly speeding up inference [26]
- **TrimR**: An efficient reflection-compression algorithm that identifies redundant reflections in LLMs, yielding a 70% improvement in inference efficiency across various models [26]
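As a baseline for what frameworks like CBQ and KVTuner improve on, the simplest form of post-training quantization is symmetric per-tensor int8: scale the weights into [-127, 127], round, and keep a single float scale for dequantization. This is a textbook sketch, not the CBQ or KVTuner algorithm itself, just the starting point such methods refine:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 post-training quantization: map
    weights onto [-127, 127] with one shared scale, then round."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale
```

The reconstruction error is bounded by half a quantization step (scale / 2); ultra-low-bit schemes like CBQ must work much harder because that step grows rapidly as the bit width shrinks.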