Core Viewpoint
- Huawei has officially announced the open-sourcing of its Pangu models, including a 7 billion parameter (7B) dense model and a 72 billion parameter (72B) mixture-of-experts (MoE) model, as part of its Ascend ecosystem strategy to promote the application of AI technology across industries [1][2].

Group 1: Model Development and Open-Sourcing
- Huawei has released the Pangu Pro MoE 72B model weights and basic inference code on an open-source platform, with the Pangu 7B model weights and inference code to follow [1].
- The Pangu Pro MoE model, with 72 billion total parameters and 16 billion active parameters, has demonstrated performance comparable to much larger models, ranking first among domestic models with fewer than 100 billion parameters on the SuperCLUE leaderboard [1] (a back-of-envelope sketch of the total/active split follows this summary).
- The company plans to open-source the 72B MoE model first, followed by smaller models potentially aimed at academic institutions [2].

Group 2: Technical Advancements
- Huawei has also introduced the Pangu Ultra MoE, a 718 billion parameter model trained entirely on the Ascend AI computing platform [2].
- Training efficiency has been a highlight: a model FLOPs utilization (MFU) of 41% was achieved in pre-training, and over 50% in specific configurations [3] (see the MFU sketch after this summary).
- The architecture of the Ascend super nodes has been optimized for extreme parallelism, improving both training efficiency and inference performance [3].

Group 3: Ecosystem and Future Plans
- Huawei is committed to improving its ecosystem and ensuring compatibility with mainstream industry ecosystems to support customer development [2].
- At the Huawei Developer Conference, the company also announced upgrades to its Pangu models for natural language processing, computer vision, multimodal applications, prediction, and scientific computing [3].
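
The 72B-total / 16B-active parameter split and the 41% MFU figure are easier to interpret with the usual back-of-envelope arithmetic. The Python sketch below is purely illustrative: the routed expert count, experts activated per token, shared-parameter fraction, cluster size, throughput, and per-chip peak FLOPs are all assumed values, not details published by Huawei.

```python
# A minimal back-of-envelope sketch (not Huawei's code) of how MoE "active
# parameters" and model FLOPs utilization (MFU) are commonly estimated.
# Every numeric value below is an illustrative assumption, not a published
# detail of Pangu Pro MoE or the Ascend cluster.

def moe_active_params(total_params: float,
                      num_routed_experts: int,
                      experts_per_token: int,
                      shared_fraction: float) -> float:
    """Estimate parameters touched per token in a sparsely activated MoE.

    shared_fraction is the assumed fraction of parameters every token uses
    (attention, embeddings, shared experts); the remainder is split evenly
    across routed experts, of which only experts_per_token fire per token.
    """
    shared = total_params * shared_fraction
    routed = total_params * (1.0 - shared_fraction)
    return shared + routed * experts_per_token / num_routed_experts


def mfu(tokens_per_second: float,
        active_params: float,
        num_chips: int,
        peak_flops_per_chip: float) -> float:
    """Training MFU: achieved model FLOPs divided by theoretical peak FLOPs.

    Uses the common ~6 * N FLOPs-per-token approximation for one
    forward + backward pass over N active parameters.
    """
    achieved_flops_per_s = 6.0 * active_params * tokens_per_second
    peak_flops_per_s = num_chips * peak_flops_per_chip
    return achieved_flops_per_s / peak_flops_per_s


if __name__ == "__main__":
    # Hypothetical routing configuration shaped like a 72B-total / ~16B-active MoE.
    active = moe_active_params(total_params=72e9, num_routed_experts=64,
                               experts_per_token=8, shared_fraction=0.10)
    print(f"estimated active parameters: {active / 1e9:.1f}B")  # ~15.3B

    # Hypothetical cluster throughput and per-chip peak, chosen only to show
    # how a utilization figure in the low-40% range falls out of the formula.
    utilization = mfu(tokens_per_second=6.5e5, active_params=16e9,
                      num_chips=512, peak_flops_per_chip=3.0e14)
    print(f"MFU: {utilization:.1%}")  # ~40.6%
```

The takeaway of the sketch is that a sparsely activated 72B model does per-token work closer to that of a ~16B dense model, which is what makes performance and efficiency comparisons against much larger dense models meaningful.
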
Huawei Open-Sources Pangu 7B Dense and 72B Mixture-of-Experts Models
Guan Cha Zhe Wang·2025-06-30 02:38