When the Stars of High-Performance Computing Shine
Nvidia (US:NVDA) · Leiphone (雷峰网) · 2025-08-18 11:37

Core Viewpoint - The article emphasizes the critical role of high-performance computing (HPC) in the development and optimization of large language models (LLMs), highlighting the synergy between hardware and software in achieving efficient model training and inference [2][4][19].

Group 1: HPC's Role in LLM Development
- HPC has become essential for LLMs, with a significant increase in researchers from HPC backgrounds contributing to system software optimization [2][4].
- The evolution of HPC in China has gone through three main stages, from self-developed computers to the current era of supercomputers built with self-developed processors [4][5].
- Tsinghua University's HPC research institute has played a pioneering role in China's HPC development, focusing on software optimization for large-scale cluster systems [5][11].

Group 2: Key Figures in HPC and AI
- Zheng Weimin is recognized as a pioneer in China's HPC and storage fields, contributing significantly to the development of scalable storage solutions and cloud computing platforms [5][13].
- The article discusses the shift in Tsinghua's HPC research focus from traditional computing to storage optimization, driven by the increasing importance of data handling in AI applications [12][13].
- Key researchers such as Chen Wenguang and Zhai Jidong have shifted their focus to AI systems software, contributing to the development of frameworks for optimizing large models [29][31].

Group 3: Innovations in Model Training and Inference
- The article details the development of the "Eight Trigrams Furnace" (BaGuaLu) system for training large models, which significantly improved the efficiency of training processes [37][39].
- Frameworks such as FastMoE and SmartMoE have emerged to optimize the training of mixture-of-experts (MoE) models, showcasing ongoing advances in model training techniques [41][42].
- The Mooncake and KTransformers systems have been developed to enhance inference efficiency for large models, utilizing shared storage to reduce computational costs [55][57].
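
As a rough illustration of how shared storage can reduce inference cost, the sketch below keeps per-prefix results in a store shared across requests, so a request whose prompt starts with an already-seen prefix only computes the remainder. It is a toy model under stated assumptions, not the Mooncake or KTransformers design: `SharedPrefixStore`, `fake_kv`, and `BLOCK` are hypothetical names chosen for illustration, and the "KV" values are placeholder tuples rather than real attention tensors.

```python
from typing import Dict, List, Tuple

BLOCK = 4  # hypothetical block size: results are cached in fixed-size token blocks


def fake_kv(token: int) -> Tuple[int, int]:
    """Stand-in for the expensive per-token key/value computation."""
    return (token * 2, token * 3)


class SharedPrefixStore:
    """Toy shared store mapping a token prefix to the KV entries of its last block."""

    def __init__(self) -> None:
        # Keys are full prefixes, because a block's KV depends on everything before it.
        # A real system would likely use a compact chained hash instead (assumption).
        self._blocks: Dict[Tuple[int, ...], List[Tuple[int, int]]] = {}
        self.computed_tokens = 0  # tokens that actually needed computation

    def prefill(self, tokens: List[int]) -> List[Tuple[int, int]]:
        kv: List[Tuple[int, int]] = []
        pos = 0
        # Phase 1: reuse cached blocks for as long as the prefix matches.
        while pos + BLOCK <= len(tokens):
            cached = self._blocks.get(tuple(tokens[: pos + BLOCK]))
            if cached is None:
                break
            kv.extend(cached)
            pos += BLOCK
        # Phase 2: compute the rest, publishing each complete block to the store.
        while pos < len(tokens):
            end = min(pos + BLOCK, len(tokens))
            block = [fake_kv(t) for t in tokens[pos:end]]
            self.computed_tokens += end - pos
            if end - pos == BLOCK:
                self._blocks[tuple(tokens[:end])] = block
            kv.extend(block)
            pos = end
        return kv


store = SharedPrefixStore()
first = store.prefill([1, 2, 3, 4, 5, 6, 7, 8, 9])
second = store.prefill([1, 2, 3, 4, 5, 6, 7, 8, 10, 11])
print(store.computed_tokens)
```

In this toy run the second request shares an eight-token prefix with the first, so only its last two tokens are computed. The saving grows with the length of the shared prefix, which is the general intuition behind trading storage for computation.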