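Huawei Cloud Stirs Up Another Computing Power Storm: CloudMatrix384 Supernode to Be Upgraded, Tokens Service Performance Can Exceed the H20's by Up to 4x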
QbitAI (量子位) · 2025-09-19 04:11

Core Viewpoint
- Huawei Cloud has made significant advances in AI computing power, positioning itself as a key player as AI applications drive growing demand for computational resources [1][4].

Group 1: Technological Advancements
- The CloudMatrix384 supernode, launched in April 2025, has since been upgraded to ease the AI industry's ongoing "computing power anxiety" [3][6].
- Huawei Cloud's AI server roadmap scales CloudMatrix from 384 cards today to a planned 8,192 cards, enabling the formation of massive AI clusters [5][19].
- The newly introduced EMS elastic memory storage service significantly reduces latency in multi-turn dialogues, improving overall performance [5][19].

Group 2: Market Positioning
- Huawei Cloud's "computing power black soil" concept presents its technology stack as fertile ground on which enterprises and developers can build AI innovations [7][28].
- By combining intelligent computing (the Tokens service) with general computing (Kunpeng cloud services), Huawei Cloud aims to meet diverse industry needs [9][11].

Group 3: Tokens Service
- The Tokens service, built on the CloudMatrix384 supernode, bills by actual token consumption, which significantly lowers AI inference costs [14][16].
- Average daily token consumption in China surged from 100 billion to over 30 trillion within a year and a half, a sharp rise in demand for AI computational resources [15].

Group 4: Industry Applications
- Huawei Cloud's infrastructure supports a range of workloads, from high-precision scientific research to AI-driven internet applications, demonstrating its ability to handle complex tasks [21][23].
- The collaboration with national research institutions, such as the Chinese Academy of Sciences, highlights Huawei Cloud's commitment to providing reliable and high-performance computing resources for advanced scientific models [25].
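
To put the scale figures quoted in Groups 1 and 3 in perspective, the short Python sketch below works through the arithmetic. It uses only numbers stated in the summary (100 billion to over 30 trillion daily tokens, 384 to 8,192 cards). Treating "a year and a half" as exactly 18 months and assuming smooth compound growth are illustrative assumptions, not claims from the source, and the function names are hypothetical.

```python
# Rough arithmetic on the figures quoted in the summary.
# Assumptions (not from the source): "a year and a half" is taken as
# exactly 18 months, and growth is modeled as smooth monthly compounding.

DAILY_TOKENS_START = 100 * 10**9   # 100 billion tokens per day
DAILY_TOKENS_END = 30 * 10**12     # "over 30 trillion", used as a lower bound
MONTHS = 18                        # assumed length of "a year and a half"

CARDS_NOW = 384                    # CloudMatrix384
CARDS_PLANNED = 8192               # planned future specification


def growth_factor(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


def implied_monthly_rate(start: float, end: float, months: int) -> float:
    """Return the constant monthly growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / months) - 1


token_growth = growth_factor(DAILY_TOKENS_START, DAILY_TOKENS_END)
monthly_rate = implied_monthly_rate(DAILY_TOKENS_START, DAILY_TOKENS_END, MONTHS)
card_scale = growth_factor(CARDS_NOW, CARDS_PLANNED)

print(f"Daily token consumption grew at least {token_growth:.0f}x")
print(f"Implied compound growth: about {monthly_rate:.0%} per month")
print(f"Planned card count is about {card_scale:.1f}x the current 384")
```

Because the source says "over 30 trillion", the token figures here are lower bounds, and the monthly rate is only as good as the 18-month assumption.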