Guotai Haitong | Electronics: Breaking the Memory Wall, AI SSDs See Broad Room for Growth
Core Viewpoint
- The report examines the "memory wall" bottleneck facing large language models (LLMs) and proposes SSD-based storage offloading as a new path to running AI models efficiently [1].

Group 1: Industry Insights and Investment Recommendations
- The massive data volumes generated by AI are straining global data center storage, drawing attention to KV Cache schemes that offload cache from GPU memory to CPU memory and SSDs [1].
- Nearline HDDs, long the cornerstone of mass data storage, are in short supply, prompting a shift toward high-performance, higher-cost SSDs; the report assigns the industry an "overweight" rating [1].

Group 2: KV Cache Technology and Its Implications
- KV Cache stores the key and value tensors of already-processed tokens so they need not be recomputed at each decoding step, trading memory for compute; its capacity requirements are growing faster than HBM can keep up with (see the first code sketch below for a back-of-envelope sizing) [2].
- As models grow larger and sequences grow longer, reliance on HBM becomes a bottleneck, leading to frequent memory overflows and degraded inference performance [2].

Group 3: Technological Developments in Storage Solutions
- The industry is exploring tiered cache management for KV Cache; NVIDIA has launched Dynamo, a distributed inference-serving framework that offloads KV Cache from GPU memory to CPU memory, SSDs, and even networked storage (see the second sketch below for the tiering idea) [3].
- Samsung has proposed an SSD-based storage offloading solution for the "memory wall" challenge: when KV Cache size exceeds HBM or DRAM capacity, it cuts first-token latency by up to 66% and inter-token latency by up to 42% [3].

Group 4: Market Trends and Supply Chain Dynamics
- AI storage demand is driving a replacement effect for HDDs: facing significant supply gaps in the HDD market, NAND Flash suppliers are accelerating production of large-capacity Nearline SSDs [4].
- NAND Flash manufacturers are investing in ultra-high-capacity Nearline SSDs, such as 122TB and even 245TB models, to meet growing demand from AI inference applications [4].
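To make the HBM pressure described in Group 2 concrete, here is a back-of-envelope KV Cache sizing sketch in Python. The formula (one key and one value tensor per layer, scaled by KV heads, head dimension, sequence length, batch size, and bytes per element) is the standard transformer accounting; the model dimensions below are illustrative assumptions, not figures from the report.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Standard KV Cache sizing: one key + one value tensor per layer.
    bytes_per_elem=2 assumes FP16/BF16 storage."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


# Assumed dimensions, roughly a 70B-class model with grouped-query attention:
# 80 layers, 8 KV heads, head_dim 128, FP16 cache.
size = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128,
                      seq_len=128_000, batch_size=32)
print(f"{size / 1e9:.0f} GB")  # ~1342 GB for one 128k-token batch of 32,
# far beyond the tens of GB of HBM on a single GPU -- hence the spill
# to CPU DRAM and then SSD.
```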
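The report attributes tiered KV Cache management (GPU HBM, then CPU memory, then SSD or network storage) to frameworks such as NVIDIA Dynamo; the sketch below is not Dynamo's API but a minimal toy illustration of the tiering idea, assuming NumPy and using local .npy files to stand in for the SSD tier.

```python
import os
import tempfile
from collections import OrderedDict

import numpy as np


class TieredKVCache:
    """Toy two-tier KV Cache: a size-capped in-memory tier (standing in for
    HBM/DRAM) spills least-recently-used blocks to files (standing in for
    SSD). Production frameworks add async prefetch, pinned buffers, etc."""

    def __init__(self, fast_capacity_bytes: int, spill_dir: str):
        self.capacity = fast_capacity_bytes
        self.used = 0
        self.fast = OrderedDict()  # block_id -> ndarray, kept in LRU order
        self.spill_dir = spill_dir

    def put(self, block_id: str, kv_block: np.ndarray) -> None:
        if block_id in self.fast:  # avoid double-counting on re-insert
            self.used -= self.fast[block_id].nbytes
        self.fast[block_id] = kv_block
        self.fast.move_to_end(block_id)
        self.used += kv_block.nbytes
        while self.used > self.capacity:  # spill coldest blocks to "SSD"
            victim_id, victim = self.fast.popitem(last=False)
            np.save(os.path.join(self.spill_dir, victim_id + ".npy"), victim)
            self.used -= victim.nbytes

    def get(self, block_id: str) -> np.ndarray:
        if block_id in self.fast:
            self.fast.move_to_end(block_id)  # mark as recently used
            return self.fast[block_id]
        # Fast-tier miss: reload the block from the slow tier.
        block = np.load(os.path.join(self.spill_dir, block_id + ".npy"))
        self.put(block_id, block)
        return block


# Usage: eight 4 KB blocks against a 16 KB fast tier force spills,
# and a later access transparently reloads an evicted block.
with tempfile.TemporaryDirectory() as d:
    cache = TieredKVCache(fast_capacity_bytes=16_384, spill_dir=d)
    for i in range(8):
        cache.put(f"seq0_block{i}", np.zeros((16, 128), dtype=np.float16))
    _ = cache.get("seq0_block0")
```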