Full-Stack Domestication
Huawei's First! A Major Release!
证券时报· 2025-06-30 04:12
Core Viewpoint
- Huawei's announcement that it will open-source the Pangu 7-billion-parameter dense model and the 72-billion-parameter mixture-of-experts model (Pangu Pro MoE 72B) is a significant step in promoting the development and application of large-model technology across industries, in line with its Ascend ecosystem strategy [1][7].

Group 1: Model Specifications and Performance
- The newly open-sourced Pangu Pro MoE 72B, with 72 billion total parameters and 16 billion active parameters, delivers performance rivaling models with over 100 billion parameters, according to the latest SuperCLUE rankings [3][4]. (A toy sketch of the sparse routing behind this total-versus-active gap follows this summary.)
- Huawei's Pangu Ultra MoE model, launched on May 30, has a parameter scale of 718 billion, showcasing advances in training performance on the Ascend AI computing platform [4][5].

Group 2: Strategic Implications
- The release of these models demonstrates Huawei's ability to build world-class large models on its Ascend architecture, with a training pipeline that is fully controllable from hardware to software [5].
- Huawei's large-model strategy stands out for its emphasis on practical applications: the models target real-world problems across industries and aim to accelerate the intelligent upgrade of numerous sectors [5][7].

Group 3: Industry Impact
- The Pangu models have been deployed in more than 30 industries and 500 scenarios, delivering significant value in sectors such as government, finance, manufacturing, healthcare, and autonomous driving [5].
- The open-sourcing initiative is expected to attract more developers and vertical industries to build intelligent solutions on the Pangu models, further deepening the integration of AI across fields [7].
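To make the "72 billion total, 16 billion active" distinction concrete, here is a minimal, illustrative sketch of top-k expert routing in a sparse MoE layer: each token is dispatched to only a few expert feed-forward networks, so only a fraction of the layer's parameters does work on any given token. All sizes, the router design, and the top-k value are toy assumptions for illustration, not Pangu's actual configuration.

```python
# Toy sparse-MoE layer: a router picks the top-k experts per token, so most
# expert weights sit idle on any given forward pass. This is how a model can
# hold far more total parameters than it activates per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep only top-k experts
        weights = F.softmax(weights, dim=-1)             # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                   # dispatch tokens to experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot+1] * self.experts[e](x[mask])
        return out

x = torch.randn(16, 64)          # 16 tokens
y = TinyMoELayer()(x)
print(y.shape)                   # torch.Size([16, 64])
```

With top_k=2 of 8 experts, each token touches roughly a quarter of the expert parameters; scaling the same idea up is what lets a 72B-parameter MoE activate only about 16B parameters per token.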
Huawei Makes a Major Announcement!
新华网财经· 2025-06-20 12:17
Core Viewpoint
- Huawei's Pangu models have made significant advances across industries, with deployments in more than 30 industries and 500 scenarios; the latest Pangu model 5.5 is set to enhance natural language processing and multimodal applications [1][4].

Group 1: Pangu Model Developments
- The Pangu models have been successfully implemented in sectors such as government, finance, manufacturing, healthcare, coal mining, steel, railways, autonomous driving, and meteorology, showcasing their transformative impact [1].
- Huawei introduced the Pangu Ultra MoE model with a parameter scale of 718 billion, marking a significant leap in training ultra-large-scale models on the Ascend AI computing platform [1][2].

Group 2: Technical Innovations
- The Pangu team has innovated in model architecture and training methods, achieving stable training of the ultra-large MoE model on the Ascend platform over more than 18TB of data [2].
- Key innovations include the Depth-Scaled Sandwich-Norm (DSSN) architecture and the TinyInit initialization method, which stabilize training, alongside optimizations that improve load balancing among experts [2][3].

Group 3: Performance Enhancements
- Recent upgrades to the training system improved pre-training efficiency, raising model FLOPs utilization (MFU) in multi-card cluster pre-training from 30% to 41% [3]. (A back-of-the-envelope MFU calculation follows this summary.)
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, has demonstrated performance comparable to models with over 100 billion parameters, ranking first among domestic models under 100 billion parameters [3].

Group 4: HarmonyOS Developments
- Huawei unveiled HarmonyOS 6, which aims to improve user experience with lower latency and stronger AI capabilities, marking a significant step in the evolution of the Harmony ecosystem [4].
- The Harmony ecosystem is entering a new phase of acceleration, with over 30,000 applications and services in development across nearly 20 industries, highlighting significant demand for talent in this area [5].
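To ground the 30%-to-41% figure, here is a back-of-the-envelope sketch of how MFU is typically computed: the useful training FLOPs a cluster actually delivers, divided by its theoretical peak. The throughput and peak numbers below are placeholder assumptions for illustration, not Ascend cluster specifications.

```python
# Rough Model FLOPs Utilization (MFU) calculation. Training a transformer
# costs roughly 6N FLOPs per token (N = parameters doing work); for an MoE,
# N is the *active* parameter count, not the total.
active_params   = 16e9     # active parameters per token (from the article)
tokens_per_sec  = 1.2e6    # measured cluster throughput (assumed placeholder)
peak_flops      = 4.0e17   # cluster peak FLOP/s (assumed, e.g. n_chips * per-chip peak)

train_flops_per_token = 6 * active_params   # ~6N rule of thumb for training
mfu = tokens_per_sec * train_flops_per_token / peak_flops
print(f"MFU = {mfu:.1%}")  # -> 28.8% with these assumed numbers
```

Raising MFU from 30% to 41% at fixed hardware peak means roughly a third more training tokens processed per second, which is why system-level upgrades of this kind matter as much as model changes.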
Blockbuster! Huawei Releases a Near-Trillion-Parameter Large Model
Mei Ri Jing Ji Xin Wen· 2025-05-30 11:41
Core Insights
- Huawei has launched a new model, Pangu Ultra MoE, with a parameter scale of 718 billion, marking a significant advance in MoE model training on the Ascend AI computing platform [1][3][6].
- The release of Pangu Ultra MoE and the Pangu Pro MoE series demonstrates Huawei's ability to run a fully controllable training process on domestic computing power and models, validating the innovation capacity of China's AI infrastructure [3][6].

Model Architecture and Training Innovations
- The Pangu team introduced innovative designs in model architecture and training methods to address the challenges of training ultra-large-scale, highly sparse MoE models, achieving stable training on the Ascend platform [1][4].
- Key innovations include the Depth-Scaled Sandwich-Norm (DSSN) architecture and the TinyInit initialization method, which enabled long-term stable training over more than 18TB of data [4].
- The EP loss load-optimization method ensures better load balancing among experts and enhances their specialization [4]. (A stand-in sketch of this style of auxiliary load-balancing loss follows this summary.)

Performance and Efficiency Improvements
- The training methods disclosed by Huawei enable efficient integration of large sparse MoE reinforcement learning (RL) post-training frameworks on the Ascend CloudMatrix 384 supernodes [5].
- Recent upgrades improved the pre-training system's performance, raising model FLOPs utilization (MFU) from 30% to 41% [5].
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, has demonstrated performance comparable to much larger models, ranking first among domestic models under 100 billion parameters on the SuperCLUE leaderboard [5].

Industry Implications
- The successful training and optimization of ultra-large-scale sparse models on domestic AI platforms closes the loop of "full-stack domestication" and "fully controllable processes" from hardware to software and from research to engineering [6].
- This advance provides a strong foundation for the development of China's AI industry and reinforces confidence in domestic AI capabilities [3][6].
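The articles describe the EP loss only at a high level, so as a stand-in the sketch below shows the widely used Switch-Transformer-style auxiliary load-balancing loss, which targets the same failure mode: a router concentrating tokens on a few experts while others starve. This is an analogous public technique, not Huawei's EP loss.

```python
# Auxiliary load-balancing loss (Switch Transformer style): penalizes routers
# that send a disproportionate share of tokens to a few experts. Added to the
# main training loss with a small coefficient.
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits, top1_idx, n_experts):
    """router_logits: (tokens, n_experts); top1_idx: (tokens,) chosen expert."""
    probs = F.softmax(router_logits, dim=-1)
    # f_e: fraction of tokens actually dispatched to each expert
    frac_tokens = F.one_hot(top1_idx, n_experts).float().mean(dim=0)
    # p_e: mean router probability mass assigned to each expert
    mean_probs = probs.mean(dim=0)
    # minimized (value ~1.0) when both distributions are uniform
    return n_experts * torch.sum(frac_tokens * mean_probs)

logits = torch.randn(1024, 8)
loss = load_balancing_loss(logits, logits.argmax(dim=-1), 8)
print(loss)   # ~1.0 when balanced; grows as routing skews toward few experts
```

Keeping expert load balanced matters twice over: it spreads compute evenly across devices, and it forces each expert to see enough tokens to develop the specialization the article mentions.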
Major Breakthrough! Huawei Just Made an Announcement!
券商中国· 2025-05-30 10:43
Core Viewpoint
- Huawei's launch of the Pangu Ultra MoE model, with a parameter scale of 718 billion, marks a major advance for China's AI industry, demonstrating independent and controllable training processes on domestic computing platforms [1][4].

Group 1: Breakthroughs in Domestic Computing and Models
- Training ultra-large-scale, highly sparse MoE models is notoriously difficult, but Huawei's Pangu team innovated in model architecture and training methods to achieve stable training on the Ascend platform [2].
- The team introduced the Depth-Scaled Sandwich-Norm (DSSN) architecture and the TinyInit initialization method, enabling long-term stable training over more than 18TB of data [2]. (A generic sandwich-norm sketch follows this summary.)
- The EP loss optimization method ensures load balancing among experts and enhances their specialization, while Pangu Ultra MoE adopts MLA (multi-head latent attention) and MTP (multi-token prediction) architectures to balance model performance and efficiency [2][3].

Group 2: Training Method Innovations
- Huawei has disclosed key technologies that enable efficient training of large sparse MoE models on the Ascend CloudMatrix 384 supernodes, marking the transition of reinforcement learning (RL) post-training frameworks into the supernode-cluster era [3].
- Recent upgrades to the pre-training system raised model FLOPs utilization (MFU) in large clusters from 30% to 41% [3].
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, delivers performance rivaling much larger models through dynamic activation of expert networks [3].

Group 3: Industry Developments
- DeepSeek's R1 model has completed a minor version upgrade, outperforming Western competitors on several standardized metrics while keeping costs to only a few million dollars [5].
- Tencent's AI model strategy has been fully unveiled, with its Hunyuan model ranking among the top eight globally on the Chatbot Arena platform, showcasing continuous technical progress [6].
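The DSSN and TinyInit formulations are not spelled out in these summaries, so the following is a generic sandwich-norm transformer block for illustration: LayerNorm on both sides of the sublayer, with the residual branch damped by a depth-dependent factor and weights given a deliberately small initialization. The 1/sqrt(2·depth) residual scale and the 0.006 init standard deviation are assumptions for the sketch, not Huawei's published values.

```python
# Generic sandwich-norm block with depth-scaled residuals and a tiny weight
# init, two common stabilizers for very deep models. Illustrative only; the
# exact DSSN/TinyInit recipes are not public in the source articles.
import math
import torch
import torch.nn as nn

class SandwichNormBlock(nn.Module):
    def __init__(self, d_model, n_layers):
        super().__init__()
        self.pre_norm = nn.LayerNorm(d_model)    # norm before the sublayer
        self.post_norm = nn.LayerNorm(d_model)   # norm after it (the "sandwich")
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        # Damp each residual update as depth grows (assumed 1/sqrt(2*depth) form)
        self.res_scale = 1.0 / math.sqrt(2.0 * n_layers)
        for p in self.ffn.parameters():          # tiny-init-style small weights (assumed std)
            if p.dim() > 1:
                nn.init.normal_(p, std=0.006)

    def forward(self, x):
        return x + self.res_scale * self.post_norm(self.ffn(self.pre_norm(x)))

x = torch.randn(4, 16, 128)                      # (batch, seq, d_model)
print(SandwichNormBlock(128, n_layers=48)(x).shape)   # torch.Size([4, 16, 128])
```

The design intuition: normalizing on both sides of each sublayer bounds activation growth, and shrinking each residual update in proportion to depth keeps hundreds of stacked layers from compounding early-training instability.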