Huawei's Chips Are Making NVIDIA's Jensen Huang Uneasy

Core Viewpoint - Huawei's Ascend CloudMatrix 384 super node surpasses NVIDIA's products in certain respects, marking a significant advance in domestic AI chip capabilities [1][11][13].

Group 1: Huawei's Ascend Chip Overview
- Ascend is Huawei's line of dedicated AI processors (NPUs), with the Ascend 910 as its main product [3][6].
- Ascend chips were previously a fallback for buyers who could not obtain high-end NVIDIA and AMD chips, but they have since become leaders in the domestic chip market [3][6].
- Ascend chips have mainly been used for AI inference, with limited use in model training because of performance and ecosystem limitations [4][6].

Group 2: Performance and Capabilities
- In 2024 and 2025, Huawei turned Ascend from a backup option into a primary platform capable of training large models, with results documented in research papers [5][6].
- Ascend has been used to train a 135-billion-parameter dense model on 8,192 chips and a 718-billion-parameter MoE (Mixture of Experts) model on more than 6,000 chips, showing that large-scale models can be trained on domestic chips [6][10].
- MFU (Model FLOPs Utilization), a key efficiency metric, exceeded 50% for the dense model and reached 41% for the MoE model, indicating efficient use of the hardware [9][10].

Group 3: Competitive Comparison with NVIDIA
- In direct comparisons, the Ascend 384 super node performed comparably to NVIDIA's H100 and H800 in real-world workloads and achieved the best utilization rates [11][12].
- Although a single Ascend chip delivers only one-third the performance of an NVIDIA Blackwell chip, the 384 super node's overall system performance exceeds that of NVIDIA's GB200 because it uses more chips [13][21].
- This suggests that Ascend is not merely a replacement but has the potential to lead on certain performance metrics [13].
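
The MFU figures and the chip-count argument above both come down to simple arithmetic, which can be sketched roughly as follows. This is an illustrative sketch rather than the article's or Huawei's methodology: the function names are hypothetical, the 6 FLOPs per parameter per token estimate is a common rule of thumb for transformer training, and the 72-GPU size of a GB200 NVL72 rack is an assumption, since the article does not say which GB200 configuration it compares against.

```python
def model_flops_utilization(params, tokens_per_second, n_chips, peak_flops_per_chip):
    """Return MFU: achieved model FLOP/s over the cluster's theoretical peak FLOP/s."""
    # Rule of thumb (assumption): training costs ~6 FLOPs per parameter per token.
    # For an MoE model, pass the activated parameter count, not the total.
    achieved_flops_per_second = 6 * params * tokens_per_second
    peak_flops_per_second = n_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


def relative_system_peak(chips_a, per_chip_ratio, chips_b):
    """Return system A's aggregate peak relative to system B's.

    per_chip_ratio is one A chip's performance as a fraction of one B chip's.
    Interconnect, memory, and software effects are deliberately ignored.
    """
    return chips_a * per_chip_ratio / chips_b


# 384 Ascend chips at one-third of a Blackwell each (figures from the article),
# against an assumed 72-GPU GB200 NVL72 rack (not stated in the article).
ratio = relative_system_peak(chips_a=384, per_chip_ratio=1 / 3, chips_b=72)
print(f"{ratio:.2f}x")  # 1.78x
```

Under those assumptions the super node's aggregate peak comes out ahead despite the weaker individual chip, which is the point the article makes. Computing MFU requires measured training throughput and the chip's rated peak, neither of which the article provides, so no MFU value is reproduced here.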

Group 4: Technological Innovations
- The CloudMatrix 384 super node consists of 384 Ascend 910 chips and 192 Kunpeng CPUs, interconnected with optical communication technology that improves data transmission efficiency [16][30].
- Huawei's approach is a system-level engineering effort rather than a reliance on single-chip performance, combining innovations in communication, optics, thermal design, and software [21][22].
- The architecture allows high-speed, peer-to-peer communication among chips, significantly improving data transfer rates compared with the traditional copper connections used by competitors [28][30].

Group 5: Market Position and Future Outlook
- Although Ascend still trails NVIDIA in chip technology and software ecosystem, it has gained traction in the Chinese market, especially as companies adapt to domestic chips because of restrictions on NVIDIA products [36][38].
- The domestic semiconductor industry is evolving under pressure, and Huawei's strategy represents a distinct "technology curve" that prioritizes system optimization over individual chip performance [38][39].
- Ascend's advances may mark the beginning of a significant shift in the AI computing landscape, positioning domestic capabilities for a potential resurgence in the global market [40].