The Memory Wall
Storage Prices Rip Higher: Just How Strong Is the AI Storage Supercycle?
36Kr· 2026-01-06 12:19
The AI boom has thoroughly reshaped the storage industry, pushing every product category, HBM, DRAM, NAND, and HDD, into a broad upcycle. Driven by AI demand, the upturn that began in HBM has spread into conventional storage. Micron is a case in point: sustained price increases across its storage products have lifted the company's gross margin to a relatively high level, and Micron has guided next quarter's gross margin to 66-68%, a record high, suggesting this storage cycle is more intense than previous ones.

Rising storage prices are, at bottom, a reflection of supply and demand in the storage market. The current "shortage" is driven chiefly by demand from AI servers and related applications. With a storage upcycle now the consensus view, Dolphin Research (海豚君) focuses on the following questions:
1) What role does each type of storage play in an AI server, and what problems does AI storage currently face?
2) How strong is demand for HBM, the product the three major memory makers prioritize, and is there a supply-demand gap?
3) With AI demand exploding, how is the conventional storage market affected, and can supply keep up?

From a supply-demand perspective: (1) on the demand side, the shift of AI servers from training toward inference has created differentiated needs for low-latency, high-capacity, high-bandwidth storage; (2) on the supply side, storage makers are tilting capital expenditure toward high-value-added HBM and DRAM, producing a structural imbalance that has pushed prices up sharply. This article mainly addresses questions 1 and 2; as for conventional ...
America Manufactures a Truly 3D Chip
半导体行业观察· 2025-12-13 01:08
Core Insights
- A collaborative team has developed the first monolithic 3D chip at a U.S. foundry, achieving unprecedented vertical wiring density and speed improvements [2][3]
- This innovation is expected to usher in a new era of AI hardware and domestic semiconductor innovation, with the potential for the 1,000-fold increase in hardware performance that future AI systems will need [3][7]

Group 1: Performance and Design
- The new 3D chip's performance is approximately an order of magnitude higher than that of traditional 2D chips, addressing the long-standing limitations of flat designs [2][3]
- Early hardware tests indicate that the prototype outperforms comparable 2D chips by about four times, with simulations suggesting up to a 12-fold performance increase for future versions [7]
- The design allows a significant improvement in energy-delay product (EDP), potentially by 100 to 1,000 times; EDP balances speed against energy efficiency (a worked definition follows below) [7]

Group 2: Technical Challenges and Solutions
- Traditional 2D chips face a "memory wall" bottleneck: data transfer is slow relative to processing speed, limiting overall system performance [4][5]
- The new chip overcomes this by vertically integrating memory and computation, allowing faster data transmission and higher-density connections [5][6]
- Unlike previous attempts that relied on stacking separate chips, the new approach uses a continuous process to build layers directly on top of one another, improving connection density and manufacturability [6]

Group 3: Implications for the Semiconductor Industry
- Producing this 3D chip in a domestic foundry is a major step for U.S. semiconductor innovation, showing that advanced architectures can be commercially viable [6][7]
- The transition to vertical monolithic 3D integration will require a new generation of engineers skilled in these technologies, fostering a new wave of innovation in the semiconductor field [7][8]
- The breakthrough not only improves performance but also positions the U.S. to lead the future of AI hardware development [8]
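For context on the EDP claim above: energy-delay product is a standard figure of merit, and a 100-to-1,000-fold improvement is consistent with simultaneous reductions in both factors. A minimal worked form (the 10x factors below are illustrative assumptions, not figures from the article):

```latex
\mathrm{EDP} = E \cdot t_{\text{delay}}
% If a 3D design cuts energy per operation by 10x and data-path delay by
% 10x (illustrative assumptions), the EDP improvement multiplies:
\frac{\mathrm{EDP}_{2\mathrm{D}}}{\mathrm{EDP}_{3\mathrm{D}}}
  = \frac{E_{2\mathrm{D}}}{E_{3\mathrm{D}}} \cdot
    \frac{t_{2\mathrm{D}}}{t_{3\mathrm{D}}}
  = 10 \times 10 = 100
```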
Marvell Technology (MRVL.US) Reportedly to Spend $5 Billion on Celestial AI, Betting on Photonic Interconnects to Breach the "Memory Wall"
Xin Lang Cai Jing· 2025-12-02 06:57
Group 1
- Marvell Technology is in advanced talks to acquire Celestial AI in a deal potentially exceeding $5 billion, including cash and stock [1]
- The acquisition is expected to enhance Marvell's product portfolio and highlights the strong market demand for computing power [1]
- Celestial AI has raised a total of $515 million, with $250 million coming from a venture capital round supported by an AMD subsidiary [1][2]

Group 2
- Celestial AI is developing a photonic interconnect platform called Photonic Fabric to address the "memory wall" crisis in AI computing architectures [2]
- The memory wall has become a significant barrier to system performance: the speed mismatch between computing units and memory leads to inefficiencies, especially with large AI models [2]
- Acquiring Celestial AI would give Marvell a strategic advantage in the evolving AI server market, particularly if photonic interconnect technology becomes a standard [3]
Marvell Technology (MRVL.US) Reportedly to Spend $5 Billion on Celestial AI, Betting on Photonic Interconnects to Breach the "Memory Wall"
Zhi Tong Cai Jing· 2025-12-02 06:57
Group 1
- Marvell Technology is in advanced talks to acquire Celestial AI in a deal potentially exceeding $5 billion, with an announcement expected as early as December 3 [1]
- The acquisition aims to enhance Marvell's product portfolio and reflects strong market demand for computing power [1]
- Celestial AI has raised a total of $515 million, including $250 million in venture capital from an AMD subsidiary [1][2]

Group 2
- Celestial AI is developing a photonic interconnect platform called Photonic Fabric to address the "memory wall" crisis in AI computing architectures [2]
- The platform offers high-bandwidth, low-latency, low-power links, letting AI accelerators scale from a single chip to multi-rack deployments [2]
- The memory wall has become a significant barrier to system performance as AI model parameters grow, with mismatched data-access speeds causing inefficiency [2]

Group 3
- Acquiring Celestial AI would give Marvell a strategic advantage in the evolving AI server market, particularly if photonic interconnect technology becomes a standard [3]
- Major players such as AMD and Intel want to break NVIDIA's NVLink monopoly, making Celestial AI's technology valuable for extending their computing capabilities [3]
- Owning Celestial AI's core intellectual property could strengthen Marvell's position in competing for next-generation orders from cloud giants such as Microsoft and Google [3]
Guotai Haitong: Breaking the Memory Wall, AI SSDs Face Broad Room to Grow
Zhi Tong Cai Jing Wang· 2025-10-28 12:33
Core Viewpoint
- The Guotai Haitong Securities report highlights the challenges large language models (LLMs) face from the "memory wall" and proposes SSD-based storage offloading as a new path to running AI models efficiently [1][2]

Industry Perspective and Investment Recommendations
- The massive data generated by AI is straining global data center storage; with traditional nearline HDDs in short supply, attention is shifting to SSDs. The industry is rated "overweight" [1][2]
- KV Cache capacity is growing faster than High Bandwidth Memory (HBM) can accommodate; KV Cache exists to optimize computational efficiency and cut redundant computation [2]

KV Cache Management and Technological Innovations
- The industry is exploring tiered cache management for KV Cache; NVIDIA's Dynamo framework can offload KV Cache from GPU memory to CPU memory, SSD, and even network storage, easing the memory bottleneck of large models (a toy sketch of this tiering pattern follows below) [3]
- At the 2025 Open Data Center Conference, Samsung proposed SSD-based storage offloading to improve AI model performance, reporting significant reductions in token latency once KV Cache size exceeds HBM or DRAM capacity [3]

Market Dynamics and Supply Chain Adjustments
- AI storage demand is pushing a shift from HDDs to high-capacity nearline SSDs, with NAND Flash suppliers accelerating production of ultra-large-capacity SSDs (122 TB and 245 TB) in response to the HDD supply gap [4]
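To make the tiering pattern concrete, here is a minimal sketch of a KV cache that spills from a fast tier to slower ones. It is a toy illustration of the general GPU-to-CPU-to-SSD offloading idea described above, not NVIDIA Dynamo's actual API; the class, slot counts, and file layout are invented for illustration.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class TieredKVCache:
    """Toy KV cache with three tiers: 'GPU' (fast, tiny), 'CPU' (larger),
    and 'SSD' (file-backed, largest). Illustrative only -- real frameworks
    manage actual device memory, not Python dicts."""

    def __init__(self, gpu_slots=2, cpu_slots=4, ssd_dir=None):
        self.gpu = OrderedDict()          # hottest entries, LRU order
        self.cpu = OrderedDict()          # warm entries evicted from "GPU"
        self.gpu_slots, self.cpu_slots = gpu_slots, cpu_slots
        self.ssd_dir = ssd_dir or tempfile.mkdtemp(prefix="kvcache_")

    def put(self, seq_id, kv_tensor):
        self.gpu[seq_id] = kv_tensor      # new/updated entries land in "GPU"
        self.gpu.move_to_end(seq_id)
        self._spill()

    def get(self, seq_id):
        for tier in (self.gpu, self.cpu):
            if seq_id in tier:
                return tier[seq_id]
        with open(self._path(seq_id), "rb") as f:   # cold: fetch from "SSD"
            kv = pickle.load(f)
        self.put(seq_id, kv)              # promote back to the fast tier
        return kv

    def _spill(self):
        while len(self.gpu) > self.gpu_slots:       # GPU -> CPU
            sid, kv = self.gpu.popitem(last=False)
            self.cpu[sid] = kv
        while len(self.cpu) > self.cpu_slots:       # CPU -> SSD
            sid, kv = self.cpu.popitem(last=False)
            with open(self._path(sid), "wb") as f:
                pickle.dump(kv, f)

    def _path(self, seq_id):
        return os.path.join(self.ssd_dir, f"{seq_id}.kv")

cache = TieredKVCache()
for i in range(10):                        # 10 sequences overflow both fast tiers
    cache.put(f"seq{i}", [[0.1] * 8])      # stand-in for a real KV tensor
print(len(cache.gpu), len(cache.cpu))      # 2 4 -- the rest spilled to disk
print(cache.get("seq0"))                   # transparently reloaded from "SSD"
```

The design point the sketch captures is that eviction order, not storage medium, is the policy decision: the same LRU spill logic applies whether the cold tier is host DRAM, a local SSD, or network storage.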
Guotai Haitong | Electronics: Breaking the Memory Wall, AI SSDs Face Broad Room to Grow
Guotai Haitong Securities Research· 2025-10-28 12:00
Core Viewpoint
- The article discusses the "memory wall" challenges facing large language models (LLMs) and proposes SSD-based storage offloading as a new path to efficient AI model operation [1]

Group 1: Industry Insights and Investment Recommendations
- The massive data generated by AI is straining global data center storage, putting the focus on KV Cache designs that can offload from GPU memory to CPU and SSD [1]
- Traditional nearline HDDs, long the cornerstone of mass data storage, are in short supply, prompting a shift toward high-performance (and higher-cost) SSDs; the industry is rated "overweight" [1]

Group 2: KV Cache Technology and Its Implications
- KV Cache capacity is growing beyond what HBM can hold; the cache stores the keys and values of generated tokens to optimize computational efficiency and avoid redundant computation (a back-of-envelope sizing example follows below) [2]
- As models grow and sequences lengthen, reliance on HBM becomes a bottleneck, leading to frequent memory overflows and performance degradation [2]

Group 3: Technological Developments in Storage Solutions
- The industry is exploring tiered cache management for KV Cache; NVIDIA has launched a distributed inference serving framework, Dynamo, that offloads KV Cache from GPU memory to CPU, SSD, and even network storage [3]
- Samsung has proposed an SSD-based storage offloading solution for the "memory wall," reporting up to 66% lower first-token latency and up to 42% lower inter-token latency when KV Cache size exceeds HBM or DRAM capacity [3]

Group 4: Market Trends and Supply Chain Dynamics
- AI storage demand is creating a replacement effect for HDDs, with NAND Flash suppliers accelerating production of large-capacity nearline SSDs in response to significant HDD supply gaps [4]
- NAND Flash manufacturers are investing in ultra-large-capacity nearline SSDs, such as 122 TB and even 245 TB models, to meet growing demand from AI inference workloads [4]
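As a back-of-envelope illustration of why KV Cache outgrows HBM: the model dimensions below are assumptions loosely typical of a 70B-class model with grouped-query attention, not figures from the report.

```python
# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes/elem
layers, kv_heads, head_dim = 80, 8, 128      # assumed 70B-class config
bytes_per_elem = 2                           # FP16
per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
print(per_token / 1024)                      # 320.0 KiB per token

seq_len, batch = 128_000, 8                  # long context, modest batch
total = per_token * seq_len * batch
print(total / 1024**3)                       # ~312 GiB -- far beyond a single
                                             # 80 GB HBM device, hence offloading
```

Even a modest serving batch at long context thus needs several times more KV storage than one accelerator's HBM, which is exactly the gap the tiered DRAM/SSD schemes above are meant to absorb.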
Huawei Makes Another Move in AI Storage
Di Yi Cai Jing Zi Xun· 2025-08-27 11:29
Group 1
- Huawei launched AI SSD products, the Huawei OceanDisk EX/SP/LC series, with capacities up to 122/245 TB, the largest single-drive capacity in the industry [1]
- The AI SSD is optimized for AI workloads, combines multiple core technologies developed in-house by Huawei, and is expected to be a key breakthrough for domestic SSDs [1]
- The rapid proliferation of AI applications has driven exponential data growth, with the total global internet corpus increasing from 350 PB (text) to 154 ZB (multi-modal), exposing the limits of traditional storage media [1]

Group 2
- Model training faces severe memory pressure: training a 671B-parameter model requires 13.4 TB of memory and 168 accelerator cards, which severely limits training efficiency and flexibility (a quick arithmetic check follows below) [1]
- Model inference is also slow: average time to first token (TTFT) is 1,000 ms, twice that of American models, and tokens per second (TPS) is only 25, significantly hurting user experience [2]
- High-performance AI SSDs are becoming the industry's choice, but overseas manufacturers dominate the SSD market, with Samsung, SK Hynix, Micron, Kioxia, and SanDisk leading in share [2]

Group 3
- Although HDDs still dominate server storage, SSD advantages in AI scenarios, such as energy efficiency and low operating cost, are driving rapid penetration, with SSDs expected to account for 9%-10% of server storage in 2024 [2]
- The domestic market is expected to gradually replace HDDs with large-capacity QLC SSDs, moving from a "capacity-oriented" model to one optimized for both performance and capacity [3]
- As of June 2023, China's total storage capacity reached 1,680 EB, with notable growth and advances in external flash storage applications, particularly in finance, manufacturing, and internet sectors [3]
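The 13.4 TB and 168-card figures above are mutually consistent under common sizing rules of thumb; the bytes-per-parameter and per-card memory values below are my assumptions, not from the article.

```python
params = 671e9                 # 671B-parameter model
bytes_per_param = 20           # assumed: FP16 weights and grads plus FP32
                               # optimizer state and working buffers, in the
                               # common ~16-20 B/param mixed-precision range
mem_needed = params * bytes_per_param
print(mem_needed / 1e12)       # 13.42 TB -- matches the article's 13.4 TB

hbm_per_card = 80e9            # assumed 80 GB of HBM per accelerator card
print(mem_needed / hbm_per_card)   # ~167.8 -> 168 cards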
Computing Power: The Growth Potential of the Compute Interconnect Sector from NVIDIA's Perspective - Does a "Scaling Law" Exist for Scale Up Networks?
2025-08-21 15:05
Summary of Conference Call on Scale Up Network Growth from NVIDIA's Perspective

Industry Overview
- The discussion revolves around the **Scale Up network** in the context of **NVIDIA** and its implications for the broader **computing power** industry, particularly in AI and parallel computing applications [1][5][9].

Core Insights and Arguments
- **Scaling Law**: The call proposes that a "Scaling Law" exists for networks as well, arguing for larger cross-cabinet connections rather than just existing ASIC and single-cabinet solutions [1][5].
- **NVIDIA's Strategy**: NVIDIA addresses hardware memory-wall issues and parallel computing demands by increasing **NVLink bandwidth** and expanding the Scale Up domain from H100 to GH200, though initial adoption was low due to high cost and insufficient inference demand [6][8].
- **Memory Wall**: The memory wall refers to model parameters and compute growing far faster than memory speed, so model inference and GPU operation need more HBM interconnect support [1][10].
- **Performance Metrics**: GB200 shows a significant performance gap over B200: roughly threefold at 10 TPS per user, widening to sevenfold at 20 TPS, highlighting the advantage of Scale Up networks as communication pressure rises (a toy latency model below illustrates why the gap widens) [4][14][15].
- **Future Demand**: As Scale Up demand materializes, segments such as **fiber optics**, **AEC**, and **switches** are expected to benefit significantly, driving market growth [9][28].

Additional Important Points
- **Parallel Computing**: Computing paradigms are shifting toward GPU-based parallelism, including data, pipeline, tensor, and expert parallelism, each with different communication frequency and message-size requirements [11][12].
- **Network Expansion Needs**: A second-layer network is needed between cabinets, with fiber optics and AEC recommended to build this expansion [4][23][24].
- **Market Trends**: Overall network-connection growth is expected to outpace chip demand growth, significantly benefiting the optical module and switch industries [28][30].
- **Misconceptions in Market Understanding**: The market wrongly assumes Scale Up networks stop at the cabinet level; meeting user TPS targets actually requires larger networks spanning multiple cabinets [29][30].

This summary captures the key points of the call on the growth potential and strategic direction of Scale Up networks within the computing power industry.
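Why would the GB200-versus-B200 gap widen as the per-user TPS target rises? A toy Amdahl-style latency model shows the mechanism: per-token time has a compute part and a communication part, and a faster fabric only shrinks the latter, so its advantage compounds as communication pressure grows. All numbers below are invented for illustration; this is not the methodology used in the call.

```python
# Toy model: per-token decode latency = fixed compute + all-reduce traffic / bw.
# A faster Scale Up fabric helps more as communication per token rises.

def per_token_latency(comm_bytes, bw_bytes_per_s, compute_s=20e-3):
    return compute_s + comm_bytes / bw_bytes_per_s

SLOW_BW = 100e9 / 8      # ~100 Gb/s Ethernet-class link (assumed)
FAST_BW = 7200e9 / 8     # ~7,200 Gb/s NVLink-5-class bandwidth (per the report)

for comm_mb in (50, 500):        # light vs heavy all-reduce traffic per token
    b = comm_mb * 1e6
    slow = per_token_latency(b, SLOW_BW)
    fast = per_token_latency(b, FAST_BW)
    print(f"{comm_mb:>3} MB/token: fast fabric gives {slow / fast:.1f}x lower latency")
# Output: 1.2x at light traffic, 2.9x at heavy traffic -- the gap widens
```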
The Growth Potential of the Compute Interconnect Sector from NVIDIA's Perspective - Does a "Scaling Law" Exist for Scale Up Networks? | Investment Research Report
Zhong Guo Neng Yuan Wang· 2025-08-20 07:47
Core Insights
- The report emphasizes the necessity of Scale Up networks: the "memory wall" problem and the evolution of AI computing paradigms require pooling memory through Scale Up solutions [1][3]

Group 1: Scale Up Network Expansion
- NVIDIA keeps expanding the Scale Up network along two paths: raising single-card bandwidth (NVLink 5.0 reaches 7,200 Gb/s) and growing supernode size from H100 NVL8 to GH200 and GB200 [2]
- The Scale Up network is expected to follow its own Scaling Law, with a second layer of Scale Up networking emerging that requires a specific ratio of optical and AEC connections per chip [2][4]

Group 2: Addressing the Memory Wall
- The "memory wall" problem is the widening gap between large-model parameter counts and single-card memory, and between single-card compute and memory [3]
- To raise computational efficiency, various parallelism schemes are employed, data, pipeline, tensor, and expert parallelism, which greatly increase communication frequency and capacity requirements [3]

Group 3: Need for Larger Scale Up Networks
- Demand for larger Scale Up networks is driven by total cost of ownership (TCO), user experience, and expanding model capability, as the tokens per second (TPS) consumed per user is expected to rise [3]
- The report argues that the required Scale Up size is non-linearly related to expected per-user TPS and actual single-card performance, so maintaining performance calls for larger Scale Up networks [3]

Group 4: Building Larger Scale Up Networks
- Building larger Scale Up networks requires a second layer of Scale Up switches between cabinets, with optical and AEC connections expected to coexist in the new structure [4]
- Each GPU requires nine additional equivalent 1.6T connections (3-4.5x that of Scale Out networks), and every four GPUs need one additional switch (7.5-12x that of Scale Out networks); the sketch below turns these ratios into cluster-level totals [4]

Group 5: Investment Opportunities
- Ongoing Scale Up demand is expected to drive exponential growth in network-connection requirements, benefiting optical interconnects and switches [4]
- Relevant companies in the optical interconnect space include Zhongji Xuchuang, Xinyi Sheng, and Tianfu Tong; switch manufacturers include Ruijie Networks and Broadcom [5]
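Turning the report's stated per-GPU ratios into totals for a cluster: the 9-links-per-GPU and 1-switch-per-4-GPUs figures are from the report above, while the 1,024-GPU cluster size is my assumption for illustration.

```python
# Incremental interconnect bill-of-materials for a two-layer Scale Up
# network, using the ratios stated in the report. Cluster size is assumed.
gpus = 1024

extra_links_per_gpu = 9          # equivalent 1.6T connections per GPU (report)
switches_per_4_gpus = 1          # one extra switch per four GPUs (report)

links_1p6t = gpus * extra_links_per_gpu
switches = gpus // 4 * switches_per_4_gpus
bandwidth_pbps = links_1p6t * 1.6 / 1000   # aggregate, in Pb/s

print(f"{gpus} GPUs -> {links_1p6t} extra 1.6T links "
      f"({bandwidth_pbps:.1f} Pb/s aggregate), {switches} extra switches")
# 1024 GPUs -> 9216 extra 1.6T links (14.7 Pb/s aggregate), 256 extra switches
```

Scaling linearly in GPU count, these ratios are what drive the report's claim that connection growth outpaces chip growth.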
Overnight, China Shatters a Key U.S. Technology Blockade and Enjoys a Moment of Vindication
Sou Hu Cai Jing· 2025-08-15 21:38
Core Viewpoint
- The article discusses China's breakthrough in developing domestic High Bandwidth Memory (HBM), overcoming reliance on imports and the U.S. export controls that had threatened the AI industry's growth [1][12][24]

Group 1: HBM Technology and Its Importance
- HBM is likened to a "super fuel tank" for AI, supplying the data throughput that high-performance computing systems require [5][10]
- HBM addresses the "memory wall" by stacking memory dies vertically, greatly raising data-transfer speed while reducing energy consumption (a bandwidth comparison follows below) [9][10]
- The successful development of domestic HBM3 samples marks a significant milestone: China becomes the third country to enter the HBM market, after the U.S. and South Korea [22][24]

Group 2: Impact of U.S. Export Controls
- In late 2024, the U.S. imposed export controls on HBM, severely impacting China's AI industry by cutting off access to critical technology [12][14]
- The restrictions exposed how heavily China's high-performance computing projects depended on imported HBM [14][16]

Group 3: Development and Collaboration
- Developing domestic HBM required extensive collaboration among semiconductor companies, packaging-and-testing firms, and research institutions, forming a coalition to tackle the technical challenges [20][22]
- HBM3 samples were produced within eight months, showcasing the dedication and ingenuity of Chinese engineers under external pressure [16][18]

Group 4: Strategic Significance
- The HBM advance gives national-level computing projects a secure foundation, letting China build data centers and intelligent computing centers without depending on foreign components [25][27]
- The domestic market is projected to account for nearly one-third of global HBM demand by 2025, creating a valuable environment for iterative improvement and innovation [27][29]
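For a sense of why vertical stacking pays off: under the JEDEC HBM3 interface (a 1,024-bit bus at 6.4 Gb/s per pin), a single stack delivers roughly 16 times the bandwidth of a standard DDR5-6400 channel. The worked comparison below uses these published interface widths and pin rates.

```latex
\mathrm{BW}_{\mathrm{HBM3}} =
  \frac{1024\ \text{bits} \times 6.4\ \text{Gb/s}}{8\ \text{bits/byte}}
  = 819.2\ \text{GB/s per stack}
\qquad\text{vs.}\qquad
\mathrm{BW}_{\mathrm{DDR5\text{-}6400}} =
  \frac{64\ \text{bits} \times 6.4\ \text{Gb/s}}{8\ \text{bits/byte}}
  = 51.2\ \text{GB/s per channel}
```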