Tackling the Compute Utilization Problem: Huawei and Three Leading Universities Release and Open-Source AI Container Technology
Guan Cha Zhe Wang·2025-11-24 02:05

Core Viewpoint
- Huawei's data storage product line has launched the AI container technology Flex:ai, in collaboration with Shanghai Jiao Tong University, Xi'an Jiaotong University, and Xiamen University, to address the low utilization of computing resources in the AI industry [1][3]

Group 1: AI Industry Challenges
- The rapid development of the AI industry has led to a massive demand for computing power, but global utilization rates remain low, resulting in significant resource waste [1]
- Issues include small model tasks monopolizing entire cards, large model tasks lacking sufficient single-machine computing power, and many general servers being in a state of "sleep" due to a lack of GPU/NPU [1]

Group 2: Flex:ai Technology Features
- Flex:ai is built on the Kubernetes container orchestration platform, enabling precise management and intelligent scheduling of GPU and NPU resources to match AI workloads, significantly improving computing resource utilization [3]
- The technology integrates research strengths from three universities and Huawei, resulting in three core technological breakthroughs [3]

Group 3: Specific Technological Solutions
- The XPU pooling framework developed with Shanghai Jiao Tong University addresses the issue of resource waste in small model training by allowing a single GPU or NPU card to be divided into multiple virtual computing units, improving overall utilization by 30% [4]
- The cross-node virtualization technology developed with Xiamen University aggregates idle XPU resources into a "shared computing pool," enabling general servers to forward AI workloads to remote GPU/NPU cards, thus integrating general and intelligent computing resources [4]
- The Hi Scheduler intelligent scheduler, developed with Xi'an Jiaotong University, optimally schedules heterogeneous computing resources across multiple brands and specifications, ensuring stable operation of AI workloads even under fluctuating loads [5]
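
The card-slicing idea described for the XPU pooling framework can be illustrated with a toy model. The sketch below is not Flex:ai code and does not reflect its actual API or scheduling algorithm. It assumes each task needs a fixed fraction of one card and uses a generic first-fit-decreasing heuristic; every name in it (`Card`, `allocate_whole_cards`, `allocate_slices`, the task names and shares) is hypothetical. It only shows why letting small tasks share a card raises utilization compared with giving each task a whole card.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """One physical GPU/NPU card; capacity is normalized to 1.0 (hypothetical model)."""
    name: str
    used: float = 0.0
    tasks: list = field(default_factory=list)

    def free(self) -> float:
        return 1.0 - self.used


def allocate_whole_cards(demands):
    """Baseline: every task occupies an entire card, however small its demand."""
    return [
        Card(name=f"card-{i}", used=share, tasks=[task])
        for i, (task, share) in enumerate(demands.items())
    ]


def allocate_slices(demands):
    """Pack fractional demands onto shared cards with first-fit decreasing."""
    cards = []
    # Place the largest demands first, each on the first card with enough room.
    for task, share in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if not 0.0 < share <= 1.0:
            raise ValueError(f"share for {task!r} must be in (0, 1]")
        for card in cards:
            if card.free() + 1e-9 >= share:
                break
        else:
            # No existing card has room, so bring a new card into use.
            card = Card(name=f"card-{len(cards)}")
            cards.append(card)
        card.used += share
        card.tasks.append(task)
    return cards


def utilization(cards):
    """Average fraction of card capacity that is actually in use."""
    return sum(c.used for c in cards) / len(cards)


# Hypothetical workload: six small tasks, each needing a fraction of one card.
demands = {"a": 0.25, "b": 0.25, "c": 0.5, "d": 0.5, "e": 0.25, "f": 0.25}

whole = allocate_whole_cards(demands)
sliced = allocate_slices(demands)

print(f"whole cards: {len(whole)} cards, utilization {utilization(whole):.2f}")
print(f"sliced:      {len(sliced)} cards, utilization {utilization(sliced):.2f}")
```

In this made-up workload the same six tasks fit on two shared cards instead of six dedicated ones. The figures come from the toy inputs only and say nothing about the 30% improvement reported in the article.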