Concerning Large Models: The Pangu Team Issues a Statement
Xinhuanet Finance · 2025-07-06 06:43

Core Viewpoint
- Huawei's announcement that it is open-sourcing the Pangu Pro MoE model with 72 billion parameters and the Pangu dense model with 7 billion parameters has sparked industry discussion about similarities in parameter structure between the Pangu Pro MoE model and Alibaba's Tongyi Qwen-2.5 14B model [1][2].

Group 1
- Noah's Ark Lab stated that the Pangu Pro MoE model was developed and trained on the Ascend hardware platform and is not the result of incremental training on another vendor's model [1].
- A study published on GitHub reported a correlation of 0.927 between the attention parameter distributions of the Pangu Pro MoE model and the Tongyi Qwen-2.5 14B model, which it interpreted as significant structural similarity [1].
- Noah's Ark Lab clarified that some of the Pangu Pro MoE code draws on open-source practices common in the industry, and said that this code strictly follows the applicable open-source license requirements and carries clear copyright notices [1][2].

Group 2
- Industry analysts suggest that the Pangu Pro MoE model probably did not use the pre-trained weights of the Tongyi Qwen-2.5 model as its initialization, because the absolute-value distributions of the two models' biases differ fundamentally [2].
- The structural consistency between the two models may stem from shared architectural design principles. This is common among large models, since structures that work well are widely adopted [2].
- Noah's Ark Lab emphasized that the Pangu Pro MoE model contains key innovations: it is the first mixture-of-experts (MoE) model designed for the Ascend hardware platform, and it introduces a Mixture of Grouped Experts (MoGE) architecture to improve training efficiency [2].

Group 3
- Noah's Ark Lab thanked developers and partners around the world for their support of the Pangu model and highlighted the importance of constructive feedback from the open-source community [3].
- The lab aims to improve the model's capabilities by working with like-minded partners, accelerating technological breakthroughs and industry adoption [3].
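
To make the 0.927 figure in Group 1 more concrete, the following minimal Python sketch shows one way a correlation between two models' attention parameter distributions could be computed. It assumes, hypothetically, that each model is summarized by one number per layer (here, the standard deviation of that layer's weights) and that the two summaries are compared with a Pearson correlation. The article does not describe the study's actual procedure, and the function names `layer_fingerprint` and `pearson` as well as all sample values are illustrative inventions, not real model data.

```python
import math
from statistics import pstdev


def layer_fingerprint(layers):
    """Summarize a model as one value per layer: the std of that layer's weights.

    `layers` is a list of flat weight lists, a hypothetical stand-in for
    per-layer attention projection matrices.
    """
    return [pstdev(weights) for weights in layers]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy weights, made up for illustration only (not taken from any real model).
model_a = [[0.1, -0.1, 0.2, -0.2], [0.3, -0.3, 0.1, -0.1], [0.5, -0.5, 0.4, -0.4]]
# model_b is model_a with every weight doubled.
model_b = [[0.2, -0.2, 0.4, -0.4], [0.6, -0.6, 0.2, -0.2], [1.0, -1.0, 0.8, -0.8]]

r = pearson(layer_fingerprint(model_a), layer_fingerprint(model_b))
print(f"correlation of per-layer std fingerprints: {r:.3f}")
```

In this toy setup the second model is the first with every weight doubled, so the two fingerprints correlate almost perfectly even though their absolute magnitudes differ. This illustrates why a high correlation speaks to the shape of a per-layer pattern and not to identical values, which is consistent with the analysts' point in Group 2 that differences in absolute-value distributions still matter.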