From Chiplets to Racks: Open Interconnects in the Large-Model Era
半导体行业观察· 2025-12-02 01:37
Core Insights
- The article emphasizes the importance of open interconnect standards such as UCIe, CXL, UALink (UAL), and UEC in the AI infrastructure landscape, highlighting their roles in strengthening hardware ecosystems and in addressing the challenges posed by large-model training and inference [2][10].

Group 1: Background and Evolution
- The CXL Consortium was established in March 2019 to tackle the challenges of heterogeneous XPU programming and memory bandwidth expansion; Alibaba was a founding member [4].
- The UCIe Consortium was formed in March 2022 to create an open die-to-die interconnect standard, with Alibaba as the only board member from mainland China [4].
- The Ultra Ethernet Consortium (UEC) was established in July 2023 to address the inefficiencies of traditional Ethernet in AI and HPC environments; Alibaba joined as a General member [4].
- The UALink Consortium was formed in October 2024 to meet the growing demands on Scale-up networks driven by increasing model sizes and inference contexts; Alibaba joined as a board member [4].

Group 2: Scaling Laws in AI Models
- The article outlines three phases of scaling laws: Pre-training Scaling, Post-training Scaling, and Test-time Scaling, with the focus shifting toward Test-time Scaling as models move from development into deployment [5][8]. A hedged formulation of these regimes is sketched after this summary.
- Test-time Scaling introduces new challenges for AI infrastructure, particularly around latency and throughput requirements [8].

Group 3: UCIe and Chiplet Design
- UCIe is positioned as a critical standard for chiplet interconnects, addressing cost, performance, yield, and process-node optimization in chip design [10][11].
- The article discusses the advantages of chiplet-based designs, including improved yield, process-node optimization, cross-product reuse, and market scalability [14][15][17].
- UCIe's protocol stack is built around the specific needs of chiplet interconnects: low latency, high bandwidth density, and support for a range of packaging technologies [18][19][21]. An illustrative bandwidth-density calculation follows this summary.

Group 4: CXL and Server Architecture
- CXL aims to redefine server architectures by enabling memory pooling and extending host memory capacity through CXL memory modules [29][34]. An illustrative capacity/latency sketch follows this summary.
- Key features of CXL include memory pooling, a unified memory space, and host-to-host communication capabilities, which improve AI infrastructure efficiency [30][35].
- The article also highlights the challenges CXL faces, such as added latency from the PCIe PHY and the complexity of implementing CXL.cache [34][35].

Group 5: UAL and Scale-Up Networks
- UAL is designed for Scale-up networks, supporting efficient memory semantics with reduced protocol overhead [37][43].
- The UAL protocol stack comprises protocol, transaction, data link, and physical layers, enabling high-speed communication and memory operations [43][45].
- UAL's architecture aims to provide a unified memory space across multiple nodes, addressing the distinctive communication needs of large AI models [50][51].
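To make the three regimes in Group 2 concrete, the block below sketches one widely cited formulation of pre-training scaling (the Chinchilla-style fit of Hoffmann et al., 2022) and contrasts it with test-time scaling, where quality is bought with per-query inference compute rather than more parameters or data. The symbols and the function f are illustrative; the article itself does not state a formula.

```latex
% Pre-training scaling (Chinchilla-style form): loss falls as a power law in
% model size N and training tokens D; E, A, B, \alpha, \beta are fitted constants.
\[
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
\]

% Training compute for a dense transformer is roughly C \approx 6 N D, so a
% compute-optimal run grows N and D together, each roughly as \sqrt{C}.

% Test-time scaling moves the knob from (N, D) to per-query inference compute
% C_{\mathrm{inf}} (longer reasoning traces, more sampled candidates, search):
\[
\mathrm{Quality}(q) \approx f\bigl(N,\, D,\, C_{\mathrm{inf}}(q)\bigr),
\qquad \frac{\partial\, \mathrm{Quality}}{\partial\, C_{\mathrm{inf}}} > 0
\]
```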
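As a rough illustration of the bandwidth-density goal mentioned under Group 3, the Python sketch below computes die-edge (shoreline) bandwidth per millimeter from a module's lane count, per-lane data rate, and physical width. The lane counts, data rates, and module widths used are illustrative assumptions, not figures from the UCIe specification or from the article.

```python
def shoreline_bandwidth_density(lanes: int, gts_per_lane: float, module_width_mm: float) -> float:
    """Raw one-direction bandwidth per mm of die edge, in GB/s/mm.

    Illustrative model only: bandwidth = lanes * data rate (treating 1 GT/s as
    1 Gb/s per lane), converted to GB/s and divided by the width the module occupies.
    """
    gbps_total = lanes * gts_per_lane          # aggregate Gb/s across the module
    gbytes_per_s = gbps_total / 8.0            # Gb/s -> GB/s
    return gbytes_per_s / module_width_mm      # GB/s per mm of shoreline

# Hypothetical comparison of a coarser-pitch vs. finer-pitch packaging module.
# Lane counts, data rates, and widths below are assumptions for illustration only.
standard = shoreline_bandwidth_density(lanes=16, gts_per_lane=32.0, module_width_mm=1.5)
advanced = shoreline_bandwidth_density(lanes=64, gts_per_lane=32.0, module_width_mm=0.5)

print(f"standard-style package (assumed): {standard:7.1f} GB/s/mm")
print(f"advanced-style package (assumed): {advanced:7.1f} GB/s/mm")
```

The point of the comparison is that denser bump pitches let more lanes fit in less die edge, which is exactly the metric chiplet designers trade against latency and packaging cost.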
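For the memory-expansion point under Group 4, the Python sketch below models the basic trade-off of adding CXL memory modules behind a host: capacity grows with each module, while the blended average access latency rises with the share of traffic served from the slower, PCIe-PHY-attached tier. The capacities, latencies, and traffic split are hypothetical placeholders, not measurements.

```python
def blended_memory(local_gib: float, cxl_gib_per_module: float, modules: int,
                   local_latency_ns: float, cxl_latency_ns: float,
                   cxl_traffic_fraction: float) -> tuple[float, float]:
    """Return (total capacity in GiB, blended average access latency in ns).

    Simple weighted-average model: a fraction of accesses lands on the CXL tier,
    which sits behind the PCIe-class PHY and therefore costs extra latency.
    """
    total_gib = local_gib + modules * cxl_gib_per_module
    avg_latency = (1.0 - cxl_traffic_fraction) * local_latency_ns \
                  + cxl_traffic_fraction * cxl_latency_ns
    return total_gib, avg_latency

# Hypothetical host: 512 GiB local DRAM plus four 256 GiB CXL expanders,
# with 10% of accesses going to the CXL tier. All numbers are assumptions.
capacity, latency = blended_memory(local_gib=512, cxl_gib_per_module=256, modules=4,
                                   local_latency_ns=100, cxl_latency_ns=300,
                                   cxl_traffic_fraction=0.10)
print(f"total capacity: {capacity:.0f} GiB, blended latency: {latency:.0f} ns")
```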
Nvidia Enters the Fray, Broadcom Defends Its Turf: The AI Custom Chip Battle Heats Up
Core Insights
- Broadcom is facing new competition from Nvidia, which has launched NVLink Fusion to target the high-growth AI custom chip market [1][4].
- The ASIC chip market offers sustainable growth opportunities, with Broadcom reporting record revenue driven by strong AI demand [1][13].
- The shift in AI computing demand from training to inference is reshaping the dynamics of the AI chip market [1][2].

Company Performance
- Broadcom posted record revenue of $15.004 billion in Q2 FY2025, with AI business revenue growing 46% year-over-year to more than $4.4 billion [1][13].
- The company expects AI semiconductor revenue to accelerate to $5.1 billion in Q3, which would mark ten consecutive quarters of growth [1].

Market Dynamics
- Nvidia's entry into the ASIC market introduces new competition as demand rises for lower-cost ASIC chips for AI inference [3][4].
- The AI inference market is projected to become larger than the AI training market, with ASIC shipments expected to surpass GPU shipments by 2028 [3][14].

Ecosystem Competition
- NVLink Fusion is seen as a strategic move by Nvidia to strengthen its position against competitors such as AMD and Intel, while potentially benefiting partners like Marvell and Broadcom [5][7].
- The UALink and UEC alliances are emerging to counter Nvidia's NVLink, focusing on building a more open ecosystem for chip interconnectivity [9][10].

Future Outlook
- NVLink Fusion's performance advantages may dominate the AI training market in the short term, while UALink's openness could attract more manufacturers over the mid to long term [10][11].
- Demand for ASIC chips is expected to grow significantly as cloud service providers increasingly seek solutions tailored to specific computational needs [14].