CloudMatrix 384 Supernode Technology
Ascend's "Computing Power Breakout": Enabling Chinese Computing Power to Train World-Class Models
Yicai · 2025-06-18 12:16
Core Viewpoint - Huawei is using a "system engineering" approach to work around its chip technology constraints and strengthen its AI computing capabilities, even though its single-chip technology remains one generation behind that of the US [1][4][11].

Group 1: Chip Development and AI Capabilities
- Huawei founder Ren Zhengfei described the company's progress in chip development, emphasizing the use of mathematical optimization and cluster computing to achieve competitive results [1][4].
- The Ascend chip is at the core of Huawei's AI computing strategy, with which the company aims to secure a favorable position in the global computing ecosystem [1][4].
- Huawei's Ascend 72B model ranked first domestically among models with under 100 billion parameters, showing that it can compete with larger models [9][10].

Group 2: System Engineering Approach
- "System engineering" is central to Huawei's strategy: the company coordinates resources and capabilities across departments to compensate for technological limitations [4][6][7].
- Huawei has established more than 86 laboratories, each focused on a specific technology area, which together support the company's research and innovation [7].
- The "computing power battle" (算力会战) initiative brings together a cross-departmental team of more than 10,000 engineers to tackle engineering challenges in AI and chip performance [6][8].

Group 3: Breakthroughs in Computing Power
- Huawei's CloudMatrix 384 supernode technology integrates 384 Ascend computing cards into a single supernode, significantly increasing computing power and efficiency [11][12].
- The supernode technology turns computing power from a luxury into a more accessible resource, addressing global concerns about computing power availability [11][12].
- Optimized communication and resource allocation within the supernode architecture have substantially improved overall system performance [13][14][15].

Group 4: Open Ecosystem and Future Directions
- Huawei is committed to an increasingly open ecosystem for its Ascend platform, aiming to improve compatibility and collaboration within the AI community [16][18].
- The company is addressing the shortage of high-quality foundational operators by supporting open-source models and enabling clients to develop tailored algorithms [18][19].
- Huawei believes that bringing AI technology to a wide range of industries is essential for unlocking transformative potential and gaining competitive advantages in the global market [19][20].
Breaking US AI Computing Restrictions: Huawei Cloud Launches Supernode Technology, Reshaping the Global Computing Landscape
Qi Lu Wan Bao · 2025-05-15 12:29
Core Viewpoint - Huawei's CloudMatrix 384 supernode technology marks a breakthrough in China's computing capabilities, demonstrating that technological blockades cannot stop the country's advances in AI and computing power [1][3][8].

Group 1: Technological Advancements
- Huawei's history of overcoming challenges is marked by significant technological breakthroughs, including the 2023 launch of its self-developed AI chip, which competes with Nvidia's A100 [3].
- The CloudMatrix 384 supernode, built from 384 Huawei AI chips, outperforms Nvidia's H100, with a throughput of 1920 Tokens/s compared to the H100's 1850 Tokens/s [5].
- The CloudMatrix 384 architecture uses a fully peer-to-peer interconnect bus, achieving an inter-card bandwidth of 2.8 Tbps and training efficiency at 90% of single-card performance [5].

Group 2: Market Impact
- In Q1 2025, Huawei's share of China's AI chip market reached 38%. Domestic AI chip production surged over the same period, with imports dropping by 60% and local shipments increasing by 180% [6].
- Adoption of Huawei AI chips in China's government and enterprise sectors has surpassed 50%, and 70% of the equipment in local intelligent computing centers uses Huawei technology [6].
- Southeast Asian countries, including Malaysia and Thailand, have begun signing computing power cooperation agreements with Huawei, and Penang's packaging plant is expected to meet 30% of global AI inference demand by 2026 [6].

Group 3: Energy Efficiency and Cost Reduction
- Liquid cooling in data centers has reduced the Power Usage Effectiveness (PUE) to 1.1, cutting energy consumption by 40%, with a total power consumption of only 172.8 kW for a single cluster [5].
- Training costs associated with Huawei's technology have fallen by 75% compared to three years ago, thanks to the integration of the open-source MindSpore framework across more than 3000 application scenarios [5].
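The figures quoted in Groups 1 and 3 above can be cross-checked with some simple arithmetic. The Python sketch below is illustrative only and rests on assumptions the article does not state: that the 172.8 kW cluster figure is IT load spread evenly across all 384 cards, that PUE is applied as total facility power divided by IT power, and that "90% training efficiency" means each card in the cluster delivers 90% of its standalone performance. All function names are hypothetical.

```python
# Back-of-the-envelope checks on the figures quoted in the summary.
# Every interpretation below is an assumption, not something the source confirms.

CARDS = 384                      # cards per CloudMatrix 384 supernode (from the article)
CLUSTER_POWER_KW = 172.8         # quoted cluster power; assumed here to be IT load
PUE = 1.1                        # quoted Power Usage Effectiveness
SCALING_EFFICIENCY = 0.90        # quoted training efficiency vs. single-card performance
TOKENS_PER_S_CLOUDMATRIX = 1920  # quoted throughput
TOKENS_PER_S_H100 = 1850         # quoted H100 throughput


def per_card_power_w(cluster_kw: float, cards: int) -> float:
    """Average power per card in watts, assuming an even split across cards."""
    return cluster_kw * 1000 / cards


def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power, using PUE = facility power / IT power."""
    return it_power_kw * pue


def effective_cards(cards: int, efficiency: float) -> float:
    """Single-card equivalents delivered by the cluster at a given scaling efficiency."""
    return cards * efficiency


def relative_gain_pct(new: float, baseline: float) -> float:
    """Percentage by which `new` exceeds `baseline`."""
    return (new - baseline) / baseline * 100


if __name__ == "__main__":
    print(f"Power per card: {per_card_power_w(CLUSTER_POWER_KW, CARDS):.1f} W")
    print(f"Facility power: {facility_power_kw(CLUSTER_POWER_KW, PUE):.2f} kW")
    print(f"Effective cards: {effective_cards(CARDS, SCALING_EFFICIENCY):.1f}")
    gain = relative_gain_pct(TOKENS_PER_S_CLOUDMATRIX, TOKENS_PER_S_H100)
    print(f"Throughput gain over H100: {gain:.1f}%")
```

Under these assumptions the numbers work out to roughly 450 W per card, about 190 kW of facility power, about 346 single-card equivalents, and a throughput lead of just under 4%. The article does not say which workload or configuration the throughput comparison refers to, so the last figure is a ratio of the quoted values and nothing more.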
Group 4: Strategic Positioning
- The release of CloudMatrix 384 reflects a shift in the logic of computing competition, from single-point breakthroughs to system-level leadership in AI infrastructure [8].
- Huawei's advances are seen as a response to US sanctions, effectively breaking the "digital Berlin Wall" and establishing a parallel ecosystem based on self-developed technologies [8].