xDeepServe Distributed Inference Framework
Huawei and Partners Jointly Launch the 4th 828 B2B Enterprise Festival; Tokens Service Helps 100,000 Enterprises Put AI into Practice
Yang Zi Wan Bao Wang· 2025-08-28 08:42
Core Insights
- The 4th 828 B2B Enterprise Festival opened in Guiyang, aiming to accelerate AI application across various industries through technology accessibility and ecosystem collaboration [1][2].
- Huawei is committed to building a national computing power hub in Guizhou, enhancing its cloud services to support enterprise digitalization and intelligence [2][3].
- The festival showcased over 12,000 new products and nearly 600 selected intelligent products and solutions, promoting cost reduction and innovation for enterprises [5].

Group 1: Event Overview
- The festival was co-initiated by Huawei and 17 leading companies, focusing on AI, intelligent computing, and data technologies [1].
- Key figures from the government and Huawei delivered speeches emphasizing the importance of digital transformation and collaboration [1].

Group 2: Technological Advancements
- Huawei Cloud announced the integration of its Tokens service with the CloudMatrix384 super node, achieving high throughput and low latency performance [3].
- The new computing architecture and hardware optimizations are designed to enhance AI application efficiency [3][4].

Group 3: Industry Collaboration
- Various industry leaders shared their AI innovation practices based on Huawei Cloud, providing benchmarks for other enterprises [4].
- The festival included signing agreements for national intelligent enterprise computing power cooperation, promoting resource integration and AI technology implementation [4].

Group 4: Future Initiatives
- The upcoming 828 National Action Month will include initiatives for accelerating AI applications and supporting over 100,000 enterprises with subsidies [5].
Up to 2400 TPS per Chip: Huawei Cloud Tokens Service Fully Connected to the 384 Super Node
Guan Cha Zhe Wang· 2025-08-27 13:10
Core Viewpoint
- Huawei Cloud has announced the full integration of its Tokens service with the CloudMatrix384 super node, reaching a maximum throughput of 2400 TPS per chip at a latency of 50 ms, figures the company describes as surpassing industry standards [1][2].

Group 1: AI Computing Demand and Tokens Service
- Over the past 18 months, demand for AI computing power in China has grown exponentially: daily Token consumption rose from 100 billion at the beginning of 2024 to over 30 trillion by June 2025, a more than 300-fold increase in 1.5 years [2].
- Huawei Cloud launched its MaaS-based Tokens service in March 2025, offering several service specifications to meet the different performance and latency requirements of AI tools [2].
- With the Tokens service running on CloudMatrix384, throughput has risen from 1920 TPS at the beginning of the year to 2400 TPS [2].

Group 2: Full-Stack Innovation and Architecture
- Building large-scale computing power is a full-stack effort spanning hardware, software, operators, storage, inference frameworks, and super nodes, and it draws on Huawei's capabilities across all of these layers [4].
- The CloudMatrix384 super node features a new computing architecture that breaks performance bottlenecks and establishes a robust computing foundation [4].
- CANN, the Ascend hardware-enablement software layer, optimizes operators and communication strategies so that cloud computing power is used efficiently [4].

Group 3: xDeepServe and Performance Enhancement
- xDeepServe, the native service of CloudMatrix384, uses a Transformerless architecture to decompose large models into independent micro-modules that are processed in parallel on different NPUs [5][6].
- Continuous optimization of xDeepServe has raised Tokens service performance from 600 tokens/s on non-super nodes to 2400 tokens/s on super nodes [6].
- FlowServe, a restructured decentralized distributed engine, lets data-parallel (DP) groups operate autonomously within CloudMatrix384, sustaining high concurrency without congestion [6].

Group 4: Model Performance and Industry Applications
- Huawei Cloud's MaaS service supports the major large models and includes model performance optimization; for image generation it achieves twice the output speed of mainstream platforms [8].
- The company has partnered with over 100 organizations to develop AI Agents for a range of industry scenarios, improving efficiency in fields such as analysis, content creation, and smart operations [8][9].
- Intelligent solutions such as the talent digital employee solution show how these technologies are applied to improve service efficiency and customer satisfaction [9].
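
The figures quoted in this summary can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names `fold_increase` and `concurrent_streams` are hypothetical, and the concurrency estimate rests on an assumption the summary does not state, namely that the 50 ms latency is the time per output token of a single request stream.

```python
def fold_increase(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


def concurrent_streams(chip_tps: float, ms_per_token: float) -> float:
    """Estimate how many request streams one chip serves at once.

    Assumption (not stated in the summary): `ms_per_token` is the time
    per output token of a single stream, so each stream receives
    1000 / ms_per_token tokens per second.
    """
    per_stream_tps = 1000 / ms_per_token
    return chip_tps / per_stream_tps


# Daily Token consumption: 100 billion -> 30 trillion.
token_growth = fold_increase(100 * 10**9, 30 * 10**12)

# Tokens service throughput: non-super node -> super node,
# and start of the year -> now.
gain_vs_non_super_node = fold_increase(600, 2400)
gain_since_start_of_year = fold_increase(1920, 2400)

# Streams per chip at 2400 TPS, under the 50 ms per-token assumption.
streams_per_chip = concurrent_streams(2400, 50)

print(f"Token consumption growth: {token_growth:.0f}x")
print(f"Gain over non-super nodes: {gain_vs_non_super_node:.2f}x")
print(f"Gain since start of year: {gain_since_start_of_year:.2f}x")
print(f"Estimated concurrent streams per chip: {streams_per_chip:.0f}")
```

Under that assumption each stream receives 20 tokens per second, so 2400 TPS per chip corresponds to roughly 120 concurrent streams. The 300-fold growth in Token consumption and the fourfold gain over non-super nodes follow directly from the reported numbers.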