Computing Resource Utilization
Huawei Teams Up with Three Top Universities to Release and Open-Source AI Container Technology, Tackling the Compute Utilization Challenge
Guan Cha Zhe Wang· 2025-11-24 02:05
Core Viewpoint
- Huawei's data storage product line, in collaboration with Shanghai Jiao Tong University, Xi'an Jiaotong University, and Xiamen University, has released and open-sourced the AI container technology Flex:ai to address the low utilization of computing resources in the AI industry [1][3]

Group 1: AI Industry Challenges
- The rapid development of the AI industry has created massive demand for computing power, yet global utilization rates remain low, resulting in significant resource waste [1]
- Typical problems include small-model tasks monopolizing entire cards, large-model tasks exceeding the computing power of a single machine, and many general-purpose servers sitting idle because they have no GPU/NPU at all [1]

Group 2: Flex:ai Technology Features
- Flex:ai is built on the Kubernetes container orchestration platform and provides fine-grained management and intelligent scheduling of GPU and NPU resources, matching them to AI workloads and significantly improving computing resource utilization [3]
- The technology combines the research strengths of the three universities with Huawei's engineering, yielding three core technological breakthroughs [3]

Group 3: Specific Technological Solutions
- The XPU pooling framework developed with Shanghai Jiao Tong University tackles resource waste in small-model training by splitting a single GPU or NPU card into multiple virtual computing units, improving overall utilization by 30% [4] (a conceptual sketch of this pooling idea follows this summary)
- The cross-node virtualization technology developed with Xiamen University aggregates idle XPU resources into a "shared computing pool," enabling general-purpose servers to forward AI workloads to remote GPU/NPU cards and thereby uniting general-purpose and AI computing resources [4]
- The Hi Scheduler intelligent scheduler, developed with Xi'an Jiaotong University, optimally schedules heterogeneous computing resources across brands and specifications, keeping AI workloads running stably even under fluctuating loads [5]
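The XPU pooling idea summarized above can be made concrete with a minimal sketch. This is not Huawei's Flex:ai code: it is a hedged Python toy in which every class, function, and task name (PhysicalCard, pack_small_tasks, and so on) is hypothetical. It only shows how carving a card into fractional virtual units lets several small tasks share one GPU/NPU instead of each occupying a whole card.

```python
# Hedged illustration only, not Flex:ai source: a toy model of "XPU pooling"
# in which one physical GPU/NPU is carved into fractional virtual units so
# that several small tasks share a card. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class PhysicalCard:
    """One GPU/NPU card with a fixed compute capacity (arbitrary units)."""
    name: str
    capacity: float = 1.0
    allocated: float = 0.0
    tenants: list = field(default_factory=list)

    def try_allocate(self, task: str, share: float) -> bool:
        """Carve out a fractional virtual unit if enough capacity remains."""
        if self.allocated + share <= self.capacity:
            self.allocated += share
            self.tenants.append((task, share))
            return True
        return False


def pack_small_tasks(cards, tasks):
    """First-fit packing of small tasks onto shared cards.

    Without pooling each task would occupy a whole card; with fractional
    virtual units several tasks co-reside, raising per-card utilization.
    """
    placements = {}
    for task, share in tasks:
        for card in cards:
            if card.try_allocate(task, share):
                placements[task] = card.name
                break
        else:
            placements[task] = "pending"  # no card has enough free capacity
    return placements


if __name__ == "__main__":
    cards = [PhysicalCard("npu-0"), PhysicalCard("npu-1")]
    small_tasks = [("finetune-a", 0.30), ("finetune-b", 0.25),
                   ("eval-c", 0.40), ("eval-d", 0.50)]
    print(pack_small_tasks(cards, small_tasks))
    for card in cards:
        print(card.name, f"utilization={card.allocated / card.capacity:.0%}")
```

With these made-up numbers, three small tasks pack onto npu-0 (95% allocated) and the fourth lands on npu-1, instead of four cards each being tied up at 25%-50% by a single task.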
Huawei Releases New AI Technology
Zheng Quan Shi Bao· 2025-11-21 12:05
Core Insights
- Huawei officially launched the AI container technology Flex:ai, aimed at addressing the low utilization of computing resources in an AI industry facing surging demand for computing power [1]

Group 1: Technology Innovations
- Flex:ai provides XPU pooling and scheduling software built on the Kubernetes platform, enabling fine-grained management and intelligent scheduling of GPU and NPU resources to significantly enhance computing resource utilization [1]
- The XPU pooling framework allows a single GPU or NPU card to be divided into multiple virtual computing units, improving average utilization by 30% in scenarios where one card would otherwise run a single task [2]
- The cross-node virtualization technology aggregates idle XPU resources from different nodes into a shared computing pool, allowing general-purpose servers to forward AI workloads to remote GPU/NPU cards and thereby uniting general-purpose and AI computing resources [2] (see the pool sketch after this summary)

Group 2: Intelligent Scheduling
- The Hi Scheduler, developed in collaboration with Xi'an Jiaotong University, automatically senses cluster load and resource state to optimally schedule heterogeneous computing resources, keeping AI workloads running stably even under fluctuating loads [3]
- Fully open-sourcing Flex:ai gives developers across academia and industry access to its core technological capabilities and promotes standardized solutions for efficient computing resource utilization in the global AI industry [3]
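The cross-node "shared computing pool" described above can likewise be sketched at toy scale. The following Python snippet is a hedged illustration, not Flex:ai's actual protocol: it omits the real network forwarding entirely, and SharedComputePool, RemoteCard, and run_on_pool are names invented for this example. It only shows the idea of CPU-only servers borrowing idle remote GPU/NPU cards from a central registry.

```python
# Hedged sketch, not Flex:ai's actual protocol: a toy "shared compute pool"
# in which servers with no local accelerator forward AI jobs to idle remote
# GPU/NPU cards registered by other nodes. All names are illustrative.
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RemoteCard:
    node: str           # host that physically owns the accelerator
    card_id: str        # e.g. "gpu-0" or "npu-1"
    busy: bool = False


class SharedComputePool:
    """Central registry of idle accelerators contributed by many nodes."""

    def __init__(self) -> None:
        self._cards: List[RemoteCard] = []

    def register(self, card: RemoteCard) -> None:
        self._cards.append(card)

    def acquire_idle_card(self) -> Optional[RemoteCard]:
        """Pick any idle card; a real system would also weigh locality and load."""
        idle = [c for c in self._cards if not c.busy]
        if not idle:
            return None
        card = random.choice(idle)
        card.busy = True
        return card

    def release(self, card: RemoteCard) -> None:
        card.busy = False


def run_on_pool(pool: SharedComputePool, workload: str) -> str:
    """A CPU-only server 'forwards' a workload to a remote accelerator."""
    card = pool.acquire_idle_card()
    if card is None:
        return f"{workload}: queued (no idle accelerator in the pool)"
    try:
        # Placeholder for shipping the job over the network and waiting.
        return f"{workload}: executed on {card.node}/{card.card_id}"
    finally:
        pool.release(card)


if __name__ == "__main__":
    pool = SharedComputePool()
    pool.register(RemoteCard("gpu-node-1", "gpu-0"))
    pool.register(RemoteCard("npu-node-2", "npu-0"))
    for job in ["embedding-batch", "small-finetune", "inference-shard"]:
        print(run_on_pool(pool, job))
```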
Huawei Releases New AI Technology
Zheng Quan Shi Bao· 2025-11-21 12:03
Core Viewpoint
- Huawei officially launched the AI container technology Flex:ai, aimed at addressing the low utilization of computing resources, a critical constraint on the AI industry's development [1][2].

Group 1: Technology Development
- Flex:ai is pooling and scheduling software based on the Kubernetes platform, designed to optimize the management and scheduling of GPU and NPU resources and significantly improve computing resource utilization [2].
- The technology integrates research from three universities and Huawei, achieving breakthroughs in three core areas: resource partitioning, cross-node resource aggregation, and multi-level intelligent scheduling [2][3].

Group 2: Key Innovations
- Resource partitioning allows a single GPU or NPU to be divided into multiple virtual computing units, improving average utilization by 30% in small-model training scenarios [2].
- Cross-node resource aggregation pools idle computing resources from across nodes into a "shared computing pool," enabling general-purpose servers to forward AI workloads to remote GPU/NPU resources [3].
- The Hi Scheduler intelligently matches AI workloads with computing resources, maintaining sound resource allocation even under fluctuating loads [3] (see the scheduling sketch after this summary).

Group 3: Open Source Initiative
- Fully open-sourcing Flex:ai will give developers from academia and industry access to all of its core technological capabilities, fostering global innovation and standardization in heterogeneous computing virtualization [4].
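The workload-to-resource matching attributed to Hi Scheduler can be illustrated with a hedged stand-in. The greedy, headroom-based policy below is my own simplification, not the actual Hi Scheduler algorithm, and all names (Accelerator, Workload, schedule) are hypothetical. It only demonstrates placing mixed AI workloads onto heterogeneous devices of different kinds and capacities.

```python
# Hedged sketch, not the actual Hi Scheduler: a toy load-aware matcher that
# places AI workloads onto heterogeneous accelerators (mixed vendors/specs)
# by free-capacity scoring. The greedy policy and all names are illustrative.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Accelerator:
    name: str        # e.g. "vendorA-gpu-0" or "vendorB-npu-3"
    kind: str        # "gpu" or "npu"
    tflops: float    # nominal peak compute
    load: float      # current utilization in [0, 1]

    def headroom(self) -> float:
        """Compute capacity still free on this device."""
        return self.tflops * (1.0 - self.load)


@dataclass
class Workload:
    name: str
    demand_tflops: float
    preferred_kind: Optional[str] = None    # None means "any accelerator"


def schedule(workloads: List[Workload], devices: List[Accelerator]) -> Dict[str, str]:
    """Greedy placement: largest demand first, onto the device with most headroom.

    A production scheduler would also sense load fluctuations and re-balance;
    this sketch only shows the matching step described in the summary above.
    """
    plan: Dict[str, str] = {}
    for wl in sorted(workloads, key=lambda w: w.demand_tflops, reverse=True):
        candidates = [
            d for d in devices
            if d.headroom() >= wl.demand_tflops
            and wl.preferred_kind in (None, d.kind)
        ]
        if not candidates:
            plan[wl.name] = "pending"   # wait until capacity frees up
            continue
        best = max(candidates, key=lambda d: d.headroom())
        best.load += wl.demand_tflops / best.tflops
        plan[wl.name] = best.name
    return plan


if __name__ == "__main__":
    devices = [
        Accelerator("vendorA-gpu-0", "gpu", tflops=100.0, load=0.2),
        Accelerator("vendorB-npu-0", "npu", tflops=120.0, load=0.5),
    ]
    jobs = [
        Workload("pretrain-shard", 70.0),
        Workload("inference", 20.0, preferred_kind="npu"),
        Workload("finetune", 30.0),
    ]
    print(schedule(jobs, devices))
```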
Huawei to Release "Breakthrough AI Technology" Expected to Significantly Improve Computing Resource Utilization
Xuan Gu Bao· 2025-11-16 23:44
Group 1
- Huawei is set to release a breakthrough AI technology on November 21 that aims to raise computing resource utilization from the industry average of 30%-40% to 70% [1] (a back-of-envelope illustration follows this list)
- The new technology will enable unified management and use of computing power from different sources, including GPUs and NPUs, strengthening support for AI training and inference [1]
- Huawei's AI chip roadmap includes three product series: 950PR/950DT, 960, and 970, with the 950PR expected to launch in Q1 next year with enhanced interconnect bandwidth and computing power [1]

Group 2
- Analysts at Dongfang Securities expect the focus of computing-power competition to shift from individual GPUs to supernodes, which will become the mainstream form of computing power [2]
- Large-scale supernodes, such as those showcased by Nvidia and Huawei, are expected to significantly improve the training efficiency of domestic large models in China [2]
- The domestic computing power market is expected to see strong growth in both large-model training and AI capital expenditure next year [2]

Group 3
- Tuowei Information is a strategic partner of Huawei in the "Cloud + Kunpeng/AI + Industry Large Model + Open Source Harmony" field, developing innovative solutions based on HarmonyOS [3]
- Huafeng Technology, a supplier of high-speed cable modules, has established deep partnerships with leading domestic AI server manufacturers, including Huawei, ZTE, and Alibaba [3]
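The utilization figure cited in Group 1 can be put in concrete terms with a short, hedged calculation. The numbers below are hypothetical (they are not from the article), and the sketch assumes effective throughput scales linearly with utilization, which ignores scheduling overhead.

```python
# Back-of-envelope illustration of the cited jump from the ~30%-40% industry
# average to ~70% utilization. Assumption (mine, not the article's): effective
# throughput scales linearly with utilization. All figures are hypothetical.
from math import ceil


def cards_needed(target_tflops: float, card_peak_tflops: float, utilization: float) -> int:
    """Cards required to deliver a target effective throughput."""
    return ceil(target_tflops / (card_peak_tflops * utilization))


if __name__ == "__main__":
    target, peak = 1000.0, 100.0   # made-up cluster target and per-card peak
    for u in (0.35, 0.70):
        print(f"utilization {u:.0%}: {cards_needed(target, peak, u)} cards")
    # Doubling utilization roughly halves the hardware needed for the same
    # effective compute (29 vs 15 cards with these made-up numbers).
```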