Ultra-Large-Scale Sparse Models

Huawei: Major Release!
Xinhuanet Finance · 2025-06-20 12:17
Core Viewpoint - Huawei's Pangu models have been deployed in more than 30 industries and 500 scenarios, and the latest release, Pangu model 5.5, is set to strengthen natural language processing and multimodal applications [1][4].

Group 1: Pangu Model Developments
- The Pangu models have been deployed in sectors including government, finance, manufacturing, healthcare, coal mining, steel, railways, autonomous driving, and meteorology [1].
- Huawei introduced the Pangu Ultra MoE model, with 718 billion parameters, a significant step forward in training ultra-large-scale models on the Ascend AI computing platform [1][2].

Group 2: Technical Innovations
- The Pangu team changed both the model architecture and the training methods, achieving stable training of the ultra-large MoE model on the Ascend platform with more than 18TB of data [2].
- Key innovations include the Depth-Scaled Sandwich-Norm (DSSN) architecture and the TinyInit initialization method, which enhance stability and load balancing among experts [2][3].

Group 3: Performance Enhancements
- Recent upgrades to the training system improved pre-training efficiency on the multi-card cluster from 30% to 41% [3].
- The Pangu Pro MoE model, with 72 billion total parameters and 16 billion active parameters, has demonstrated performance comparable to models with more than 100 billion parameters, ranking first among domestic models under 100 billion parameters [3].

Group 4: HarmonyOS Developments
- Huawei unveiled HarmonyOS 6, which aims to improve the user experience with lower latency and stronger AI capabilities, marking a significant step in the evolution of the Harmony ecosystem [4].
- The Harmony ecosystem is entering a new phase of acceleration, with over 30,000 applications and services in development across nearly 20 industries, highlighting a significant demand for talent in this area [5].
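
For readers unfamiliar with the total versus active parameter counts quoted above for Pangu Pro MoE, the following is a minimal sketch of generic top-k expert routing, the usual reason a sparse MoE model runs only part of its weights for each token. It is not Huawei's implementation: the function names, the 8-expert setup, and k=2 are illustrative assumptions, and only the 72 billion and 16 billion figures come from the article.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs, with weights renormalized
    so they sum to 1 over the selected experts. Generic textbook
    routing, not the Pangu router.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def active_fraction(total_params, active_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


# Figures quoted in the article for Pangu Pro MoE.
PANGU_PRO_TOTAL = 72e9
PANGU_PRO_ACTIVE = 16e9

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical router output for one token over 8 experts.
    scores = [random.gauss(0.0, 1.0) for _ in range(8)]
    print(route_top_k(scores, k=2))
    print(f"{active_fraction(PANGU_PRO_TOTAL, PANGU_PRO_ACTIVE):.1%}")
```

With the quoted figures, roughly 22% of the parameters are active per token, so per-token compute is closer to that of a much smaller dense model.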