The UALink Protocol

Apsara Conference, Day One: Supernodes
小熊跑的快 · 2025-09-24 04:38
Core Viewpoint
The article discusses advances in computing-power architecture, focusing on Alibaba Cloud's new supernode design and its implications for large-model training and inference in the AI sector [4][10].

Group 1: Supernode Design and Technology
- Alibaba Cloud's supernode architecture addresses the growing memory-capacity and bandwidth demands of large-model training, moving beyond traditional GPU setups [4].
- The design leverages the strengths of the PPU chip, emphasizing high-density integration [6].
- A single supernode supports up to 64 cards and draws up to 300 kilowatts, which necessitates advanced interconnect protocols [9].

Group 2: UALink Protocol and Industry Collaboration
- The UALink protocol, initiated by a consortium including AMD, AWS, and others, aims to improve interconnectivity in computing systems; Alibaba Cloud is a member [5].
- The UALink alliance was formed to counter the high cost of evolving proprietary interconnect technologies, with AMD contributing its Infinity Fabric protocol [5].

Group 3: PPU Specifications and Performance
- The PPU carries 96GB of HBM2e memory, surpassing the A800's 80GB and matching the H20's capacity, with an inter-chip bandwidth of 700GB/s [10].
- The PPU supports a PCIe 5.0×16 interface, an improvement over the A800's PCIe 4.0×16, while holding power consumption at 400W [10].
- The PPU ships in two versions; the base version reaches a peak of 120 TFLOPS and targets AI inference workloads [10].
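The per-card figures quoted above can be combined into rough node-level aggregates. Below is a minimal sketch in Python using the article's reported numbers; the derived totals are illustrative only, and the gap between aggregate card power (about 25.6 kW) and the quoted 300 kW node budget would presumably be taken up by switches, CPUs, cooling, and other infrastructure:

```python
# Per-card figures as reported in the article; treated as quoted, not verified.
ppu = {
    "hbm_gb": 96,            # HBM2e capacity per card
    "interconnect_gbps": 700, # inter-chip bandwidth, GB/s
    "power_w": 400,           # per-card power draw
    "peak_tflops": 120,       # base-version peak performance
}

CARDS_PER_NODE = 64  # cards per supernode, per the article

# Derived, illustrative aggregates for one 64-card supernode.
total_hbm_gb = ppu["hbm_gb"] * CARDS_PER_NODE                  # 6144 GB of HBM
total_card_power_kw = ppu["power_w"] * CARDS_PER_NODE / 1000   # 25.6 kW for cards alone
total_peak_pflops = ppu["peak_tflops"] * CARDS_PER_NODE / 1000 # 7.68 PFLOPS peak

print(f"HBM per node:   {total_hbm_gb} GB")
print(f"Card power:     {total_card_power_kw} kW (of the quoted 300 kW node budget)")
print(f"Peak compute:   {total_peak_pflops} PFLOPS")
```

The remaining headroom in the 300 kW budget is one reason the article stresses high-density integration and advanced interconnects: the fabric itself is a major consumer of the node's power and packaging envelope.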