UEC Has Finally Arrived: Can It Shake InfiniBand?
半导体行业观察 (Semiconductor Industry Observation) · 2025-06-12 00:42
Core Viewpoint
- The Ultra Ethernet Consortium (UEC) has released UEC Specification 1.0, a comprehensive Ethernet-based communication stack designed for the demands of modern AI and high-performance computing (HPC) workloads, marking a significant step toward redefining next-generation data-intensive infrastructure [1][3]

Group 1: UEC Specification Overview
- UEC Specification 1.0 provides high-performance, scalable, and interoperable solutions across all layers of the network stack, including NICs, switches, optical fiber, and cables, enabling multi-vendor integration and accelerating ecosystem innovation [1][3]
- The specification promotes open, interoperable standards to avoid vendor lock-in, paving the way for a unified, accessible ecosystem across the industry [1][3]
- The UEC project operates under the Linux Foundation's Joint Development Foundation (JDF) and is designed to optimize scale-out (horizontally scaled) networks for AI training, inference, and HPC, targeting round-trip times of 1 to 20 microseconds [14][16]

Group 2: Technical Features and Innovations
- UEC builds on globally adopted Ethernet standards, simplifying deployment of the whole technology stack from hardware to applications, which makes it particularly valuable for cloud infrastructure operators, hyperscalers, DevOps teams, and AI engineers [3][12]
- The specification includes a modern RDMA for Ethernet and IP, supporting intelligent, low-latency transport in high-throughput environments [7]
- UEC introduces a congestion control system (UEC-CC) that is time-based: it measures transmission time with a precision better than 500 nanoseconds, which allows congestion to be attributed accurately [27][30]

Group 3: Interoperability and Compatibility
- UEC is designed to ensure interoperability among devices from different vendors, with particular attention to letting APIs interact with CPUs or GPUs without restrictions [16][17]
- The specification emphasizes libfabric, a widely adopted API that standardizes how NICs are used, easing compatibility with the high-performance networking libraries essential to AI and HPC superclusters [14][17]
- UEC's architecture allows for the integration of multiple endpoints, supporting configurations that connect up to 512 endpoints through a single NIC [22][24]

Group 4: Comparison with Other Standards
- UEC is compared with other standards such as Ultra Accelerator Link (UALink) and Scale-Up Ethernet (SUE). Its goal is broader: building scale-out networks with thousands of endpoints, whereas UALink and SUE focus on a single switch layer [40][44]
- UEC's approach to traffic control and congestion management is distinct: it abandons older methods such as RoCE and DCQCN, which could hinder performance [32][39]
- The specification is complex, and its detailed structure may make interoperability testing harder, but it is designed to deliver significant performance benefits in data center environments [37][39]
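
To make the idea of time-based congestion control (Group 2) more concrete, here is a minimal Python sketch of a generic delay-based window controller. It is not the UEC-CC algorithm, whose details are not given in this summary. The class name `DelayBasedWindow`, its fields, and the increase/decrease rule are all hypothetical, and the sample values are only chosen to sit inside the 1 to 20 microsecond round-trip range quoted above.

```python
from dataclasses import dataclass


@dataclass
class DelayBasedWindow:
    """Hypothetical delay-based congestion window (NOT the UEC-CC algorithm)."""

    base_rtt_us: float        # assumed uncongested round-trip time, in microseconds
    target_queue_us: float    # assumed tolerated queuing delay, in microseconds
    cwnd: float = 16.0        # congestion window, in packets
    min_cwnd: float = 1.0
    max_cwnd: float = 256.0

    def on_ack(self, measured_rtt_us: float) -> float:
        """Update the window from one RTT sample and return the new window."""
        # Anything above the base RTT is treated as time spent in queues.
        queuing_delay = max(0.0, measured_rtt_us - self.base_rtt_us)
        if queuing_delay <= self.target_queue_us:
            # Additive increase: roughly one packet per window of ACKs.
            self.cwnd += 1.0 / self.cwnd
        else:
            # Multiplicative decrease, scaled by how far the delay overshoots
            # the target; the window is at most halved per sample.
            overshoot = (queuing_delay - self.target_queue_us) / queuing_delay
            self.cwnd *= 1.0 - 0.5 * overshoot
        # Keep the window inside its configured bounds.
        self.cwnd = min(self.max_cwnd, max(self.min_cwnd, self.cwnd))
        return self.cwnd


if __name__ == "__main__":
    window = DelayBasedWindow(base_rtt_us=5.0, target_queue_us=2.0)
    print(window.on_ack(5.5))    # low delay: the window grows slightly
    print(window.on_ack(20.0))   # high delay: the window shrinks
```

The point of the sketch is only that a precise delay measurement gives the sender a graded signal (how much queuing there is) rather than a binary one (a packet was marked or dropped), which is why measurement precision matters for attributing congestion.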