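Huawei Releases Open-Source AI Container Technology Flex:ai: Getting Idle Computing Power "Moving" and Splitting One Card Across Multiple Tasks | Frontline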
36Kr · 2025-11-25 13:54
Core Viewpoint
- Computing power is at once in short supply and going to waste. Huawei's newly released AI container technology, Flex:ai, aims to raise the utilization of computing resources through three technical innovations [1]

Group 1: Flex:ai Overview
- Huawei officially launched Flex:ai at the 2025 AI Container Application Landing and Development Forum and open-sourced the XPU pooling and scheduling software at the same time [1][2]
- Flex:ai is built on Kubernetes and focuses on fine-grained management and intelligent scheduling of GPUs, NPUs, and other intelligent computing resources, consolidating scattered computing power into a single "resource pool" [1][2]

Group 2: Core Capabilities of Flex:ai
- The XPU pooling framework, developed in collaboration with Shanghai Jiao Tong University, splits a single GPU or NPU card into multiple virtual computing units at a granularity of 10%, increasing overall computing utilization by 30% in small-model training and inference scenarios [2]
- The cross-node remote virtualization technology, developed with Xiamen University, aggregates idle XPU computing power from different machines into a "shared computing pool," so that general-purpose servers without intelligent computing hardware can use remote GPU/NPU resources for AI computation [2]
- The Hi Scheduler intelligent scheduler, developed with Xi'an Jiaotong University, addresses the unified scheduling of heterogeneous computing resources. It automatically selects suitable local or remote resources based on task priority and computing requirements, achieving time-sharing reuse and globally optimal scheduling [2]

Group 3: Open Source Initiative
- By fully open-sourcing Flex:ai, Huawei aims to make all of its core technical capabilities available to developers across academia and industry, and to promote the development of standards for heterogeneous computing virtualization and AI application platform integration [2]
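
To make the slicing and scheduling ideas under Group 2 more concrete, here is a minimal Python sketch. It is not Flex:ai code and does not use any Flex:ai or Kubernetes API: the `Card`, `Task`, `slices_needed`, and `schedule` names are hypothetical, and the greedy "highest priority first, local before remote" rule is an assumed stand-in for whatever policy Hi Scheduler actually applies. The only details taken from the article are the 10% slice granularity and the choice between local and remote resources based on priority and demand.

```python
from dataclasses import dataclass

SLICE_PERCENT = 10  # granularity stated in the article: 10% of one card


@dataclass
class Card:
    """One physical GPU/NPU card, tracked in 10% slices (hypothetical model)."""
    name: str
    node: str
    free_slices: int = 100 // SLICE_PERCENT  # 10 slices per card


@dataclass
class Task:
    """A workload asking for a share of one card (hypothetical model)."""
    name: str
    node: str          # node the task itself runs on
    percent: int       # requested share of one card, in percent
    priority: int = 0  # larger value is placed first


def slices_needed(percent: int) -> int:
    """Round a requested share up to whole 10% slices."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return -(-percent // SLICE_PERCENT)  # ceiling division


def schedule(tasks, cards):
    """Greedy placement: highest priority first, local cards before remote ones.

    This is an assumed toy policy, not a globally optimal scheduler.
    """
    placement = {}
    for task in sorted(tasks, key=lambda t: -t.priority):
        need = slices_needed(task.percent)
        # Local cards sort first (False < True); ties keep the input order.
        candidates = sorted(cards, key=lambda c: c.node != task.node)
        for card in candidates:
            if card.free_slices >= need:
                card.free_slices -= need
                placement[task.name] = card.name
                break
        else:
            placement[task.name] = None  # no card in the pool has room
    return placement


cards = [Card("gpu-a", node="node-1"), Card("npu-b", node="node-2")]
tasks = [
    Task("train-small", node="node-1", percent=70, priority=2),
    Task("infer", node="node-1", percent=45, priority=1),
    Task("cpu-only-host", node="node-3", percent=20, priority=0),
]
print(schedule(tasks, cards))
# {'train-small': 'gpu-a', 'infer': 'npu-b', 'cpu-only-host': 'gpu-a'}
```

In this toy run, `infer` asks for 45%, which rounds up to five slices, so it no longer fits on its local card and falls back to the remote one. `cpu-only-host` has no local card at all and borrows the remaining capacity of `gpu-a`, which mirrors the "shared computing pool" idea in spirit only.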