Huawei's Large Model Caught in "Plagiarism-gate" as the Debate over the Boundaries of Self-Development Resurfaces
36Kr · 2025-07-10 10:04

Core Viewpoint - Huawei's Pangu model is accused of plagiarizing Alibaba's Qwen-2.5 model. The reported average correlation coefficient between the two models' attention parameter distributions is 0.927, which may indicate incremental training on an existing model rather than original development [1][10][11].

Group 1: Allegations and Responses
- A GitHub user, HonestAGI, claimed that the attention parameters of the Pangu Pro MoE model closely resemble those of Alibaba's Qwen-2.5 model, raising concerns about the originality of Huawei's model [1].
- A whistleblower from Huawei's Pangu team published a blog post titled "The Woes of Pangu," which further escalated public debate over the allegations [3].
- Huawei's Noah's Ark Lab issued a statement asserting that the Pangu Pro MoE model was developed and trained on its Ascend hardware platform and was not based on other vendors' models, and it highlighted key architectural innovations [5][8].

Group 2: Technical Aspects and Innovations
- The Pangu Pro MoE model introduces a novel Mixture of Grouped Experts (MoGE) architecture, which addresses load-balancing challenges in large-scale distributed training and improves training efficiency [5][8].
- According to the statement, parts of the model's code drew on industry open-source practices, with copyright notices clearly attributed in compliance with the relevant open-source licenses [5][8][9].

Group 3: Industry Perspectives and Consensus
- The "model fingerprint" technique that HonestAGI used to measure similarity between models has not gained widespread acceptance in the industry. Experts argue that high similarity can arise from shared knowledge bases rather than direct copying [10][13].
- The ongoing debate highlights how difficult it is to establish copyright infringement or intellectual property theft in large language models, especially as training costs rise and model reuse becomes common [14][22].
- Industry consensus suggests that models built on open-source foundations should not be labeled as entirely self-developed, as the barriers to creating foundational models are significant [21].
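
To make the "model fingerprint" comparison above concrete, here is a minimal sketch. It assumes the fingerprint is one number per layer (the standard deviation of that layer's attention weights) and that similarity is the Pearson correlation between two such per-layer sequences. The article does not spell out HonestAGI's exact procedure, so this is an illustration rather than a reproduction. All function names, the toy weights, and the scale values are hypothetical; no real checkpoint is loaded.

```python
import math
import random
import statistics


def layer_fingerprint(layers):
    """One number per layer: the standard deviation of that layer's weights."""
    return [statistics.pstdev(weights) for weights in layers]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def make_toy_layers(scales, n_weights, rng):
    """Hypothetical stand-in for a checkpoint: Gaussian weights, one list per layer."""
    return [[rng.gauss(0.0, s) for _ in range(n_weights)] for s in scales]


rng = random.Random(0)

# Toy "base" model: weight spread grows with depth (assumed pattern, not real data).
base_scales = [0.02 + 0.002 * i for i in range(12)]
base = make_toy_layers(base_scales, 2000, rng)

# "Derived" model: base weights plus a small perturbation, mimicking continued training.
derived = [[w + rng.gauss(0.0, 0.002) for w in layer] for layer in base]

# "Same recipe" model: fresh random weights drawn with the same per-layer scales.
same_recipe = make_toy_layers(base_scales, 2000, rng)

# "Independent" model: fresh weights with an unrelated per-layer scale pattern.
indep_scales = [0.03, 0.05, 0.02, 0.045, 0.025, 0.04,
                0.02, 0.05, 0.03, 0.022, 0.048, 0.027]
independent = make_toy_layers(indep_scales, 2000, rng)

fp_base = layer_fingerprint(base)
r_derived = pearson(fp_base, layer_fingerprint(derived))
r_same_recipe = pearson(fp_base, layer_fingerprint(same_recipe))
r_independent = pearson(fp_base, layer_fingerprint(independent))

print(f"base vs derived:     {r_derived:.3f}")
print(f"base vs same recipe: {r_same_recipe:.3f}")
print(f"base vs independent: {r_independent:.3f}")
```

In this toy setup, the "same recipe" model shares no weights with the base model yet still correlates strongly with it, because both follow the same per-layer scale schedule. That is one way to read the critics' objection: a high coefficient on such a statistic is consistent with copying, but it can also come from shared design choices, so a figure like 0.927 is not conclusive on its own.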