Memory Wall
The Annoying Memory Wall
半导体行业观察 · 2026-02-02 01:33
The unprecedented availability of unsupervised training data, combined with neural scaling laws, has driven an unprecedented surge in the model size and compute requirements of serving and training large language models (LLMs). However, the main performance bottleneck is increasingly shifting to memory bandwidth. Over the past 20 years, peak server FLOPS has grown 3x every two years, outpacing both DRAM bandwidth and interconnect bandwidth, which have grown only 1.6x and 1.4x every two years, respectively. This gap makes memory, rather than compute, the primary bottleneck for AI applications, especially serving. This article analyzes encoder and decoder Transformer models and shows how memory bandwidth becomes the dominant bottleneck for decoder models. We propose redesigning model architectures, training, and deployment strategies to overcome this memory limitation.

Introduction: In recent years, the compute required to train large language models (LLMs) has grown at a rate of 750x every two years. This exponential trend has been the main driver of AI accelerator development, which focuses on raising peak hardware compute, often at the cost of simplifying other components such as the memory hierarchy. These trends, however, overlook an emerging challenge in training and serving AI models: memory and communication bottlenecks. In fact, the bottleneck for many AI applications is not compute capability but rather intra-chip/chip- ...
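The growth rates quoted above compound into a striking gap. A minimal sketch, using only the article's per-two-year figures (3x FLOPS, 1.6x DRAM bandwidth, 1.4x interconnect bandwidth; all numbers illustrative, not measurements):

```python
# Compound the article's growth rates over 20 years to see how far
# compute has pulled ahead of memory bandwidth (the "memory wall").

def compounded(rate_per_2yr: float, years: int) -> float:
    """Total growth factor after `years`, compounding every two years."""
    return rate_per_2yr ** (years / 2)

years = 20
flops_growth = compounded(3.0, years)   # peak FLOPS: 3x per 2 years
dram_growth = compounded(1.6, years)    # DRAM bandwidth: 1.6x per 2 years
interconnect_growth = compounded(1.4, years)  # interconnect: 1.4x per 2 years

# Ratio of compute growth to bandwidth growth: how much more
# compute-rich (relative to memory) a server is than 20 years ago.
gap_vs_dram = flops_growth / dram_growth

print(f"FLOPS grew {flops_growth:,.0f}x, DRAM bandwidth {dram_growth:,.0f}x")
print(f"Compute/bandwidth gap widened about {gap_vs_dram:,.0f}x over {years} years")
```

Compounding turns a modest per-generation difference (3x vs 1.6x) into a gap of hundreds of times over two decades, which is why serving workloads hit the bandwidth ceiling long before the FLOPS ceiling.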
AI memory is sold out, causing an unprecedented surge in prices
CNBC· 2026-01-10 12:00
Core Insights
- The global demand for RAM is exceeding supply due to the high requirements from companies like Nvidia, AMD, and Google for their AI chips [1][2]
- Major memory vendors Micron, SK Hynix, and Samsung are experiencing significant business growth due to this surge in demand [2][3]

Company Performance
- Micron's stock has increased by 247% over the past year, with net income nearly tripling in the latest quarter [3]
- Samsung anticipates its operating profit for the December quarter will nearly triple, while SK Hynix is considering a U.S. listing due to rising stock prices [3]

Price Trends
- TrendForce predicts that average DRAM prices will rise by 50% to 55% in the current quarter compared to Q4 2025, an unprecedented increase [4]
- Consumer RAM prices have surged dramatically, with examples of costs rising from approximately $300 to around $3,000 within months [9]

Memory Technology
- HBM (high-bandwidth memory) is essential for AI chips and is produced through a complex process that limits the output of conventional memory [6][7]
- HBM demand is prioritized over other memory types because of its higher growth potential in server and AI applications [7]

Industry Challenges
- Micron has decided to discontinue certain consumer memory products to allocate more supply to AI chips and servers [8]
- The memory shortage is expected to impact consumer electronics companies, with memory now accounting for about 20% of laptop hardware costs, up from 10%-18% in early 2025 [15]

Future Outlook
- Nvidia's CEO highlighted the need for more memory factories to meet the high demand driven by AI applications [18]
- Micron is building new factories in Idaho and New York, expected to come online in 2027, 2028, and 2030, respectively, but it is currently "sold out for 2026" [19][20]
Why Astera’s Leo Deployment on Azure M-Series Signals Progress on the Memory Wall
Yahoo Finance· 2025-12-08 16:08
Group 1
- Astera Labs, Inc. is recognized as one of the fastest-growing semiconductor stocks, with its Leo CXL® Smart Memory Controllers recently enabled on Microsoft Azure M-series VMs, marking a significant deployment in the industry [1][2]
- The Leo controllers support CXL 2.0 and can handle up to 2TB per controller, enabling cloud providers to scale server memory capacity by more than 1.5 times and addressing the "memory wall" challenge in data-intensive applications [2][3]
- The deployment is aimed at enhancing memory expansion for workloads such as in-memory databases, AI inference, KV-cache for large language models, and big-data analytics [1][2]

Group 2
- Astera Labs specializes in semiconductor-based connectivity solutions tailored for rack-scale AI infrastructure, with a focus on extending and pooling memory for cloud and AI workloads [3]
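The capacity math behind the ">1.5x" claim is straightforward: CXL expanders add addressable memory on top of directly attached DRAM. A minimal sketch using the 2 TB-per-controller figure from the summary (the base capacity and controller count below are hypothetical examples, not Azure configuration details):

```python
# Rough CXL memory-expansion arithmetic: total addressable capacity is
# the directly attached DRAM plus whatever the CXL controllers expose.

def expanded_capacity_tb(base_tb: float, controllers: int,
                         tb_per_controller: float = 2.0) -> float:
    """Total addressable memory with CXL memory expanders attached."""
    return base_tb + controllers * tb_per_controller

base = 4.0  # hypothetical: 4 TB of directly attached DRAM
total = expanded_capacity_tb(base, controllers=2)  # hypothetical: two controllers

print(f"{total} TB total, {total / base:.1f}x the base capacity")
```

With even one 2 TB controller on this hypothetical 4 TB host, capacity reaches 1.5x the base, consistent with the scaling the article describes.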
The Infinite AI Compute Loop: HBM Big Three + TSMC × NVIDIA × OpenAI Shaping the Next-Generation Industry Chain
2025-10-20 01:19
Summary of Key Points from the Conference Call

Industry Overview
- The AI industry is experiencing unprecedented acceleration, with a focus on compute architectures, interconnect technologies, and memory bottlenecks, primarily driven by key companies like NVIDIA, TSMC, and OpenAI [4][16][39]
- The concept of the "AI perpetual motion cycle" is introduced, where AI chips drive compute demand, which in turn stimulates infrastructure investment, further expanding AI chip applications [4][16]

Key Companies and Technologies
- **NVIDIA**: Significant investments have popularized the AI perpetual motion cycle, with a shift in strategy from Scale Up and Scale Out to Scale Across, promoting Optical Circuit Switching (OCS) [4][10]
- **TSMC**: Central to the entire AI infrastructure, TSMC's advanced process and packaging capabilities support the entire stack from design to system integration [6][8][17]
- **OpenAI**: Transitioning from reliance on NVIDIA to developing custom AI ASICs in collaboration with Broadcom, indicating a shift in power dynamics within the supply chain [60][62]

Memory and Bandwidth Challenges
- The widening "memory wall" is a critical focus, as GPU performance is advancing faster than High Bandwidth Memory (HBM), creating urgent demand for new memory architectures [12][18][121]
- Marvell Technology is proposing memory-architecture and optical-interconnect solutions to address these bottlenecks [12]
- HBM is evolving beyond a memory technology into a deeply integrated system spanning logic, memory, and packaging [13][58]

Technological Advancements
- The industry is moving toward "System Bandwidth Engineering," where electrical design at the packaging level is crucial for sustaining future performance scaling [91]
- CXL (Compute Express Link) enables resource pooling and near-memory compute, which is essential for addressing memory allocation challenges [25][126]
- Companies like Ayar Labs and Lightmatter are innovating in silicon photonics to achieve high bandwidth and low latency, reshaping memory systems [26]

Strategic Implications
- The year 2026 is identified as a critical inflection point for the AI industry, with expected breakthroughs in performance and systemic transformations across technology stacks and capital markets [18][39][55]
- The shift from NVIDIA-centric control to a more distributed approach among cloud service providers (CSPs) is reshaping the HBM supply chain, with companies developing their own ASICs [23][57]
- Geopolitical implications arise as U.S. companies strengthen ties with Korean memory suppliers, reducing reliance on Chinese supply chains [65]

Future Outlook
- By 2026, significant changes in pricing for electricity, water resources, and advanced packaging capacity are anticipated, with the winners being those who can leverage bandwidth engineering for productivity [28][50]
- The AI chip market is transitioning from a GPU-driven economy to a multi-chip, multi-architecture landscape, with emerging pricing-power centers in Samsung and SK hynix [69][70]
- The integration of HBM with advanced packaging technologies will be crucial for future AI architectures, with TSMC playing a pivotal role in this evolution [92][96]

Conclusion
- The AI industry is on the brink of a major transformation, driven by technological advances, strategic shifts in supply chains, and the urgent need to address memory and bandwidth challenges. The developments leading up to 2026 will redefine the competitive landscape and the value chain within the AI ecosystem [39][70][71]
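The HBM-bandwidth pressure running through these pieces has a simple back-of-envelope form for LLM decoding: generating one token requires streaming essentially all model weights from HBM, so bandwidth, not FLOPS, caps the single-stream token rate. A sketch with hypothetical round numbers (the parameter count and bandwidth figure below are illustrative assumptions, not vendor specifications):

```python
# Why decode is bandwidth-bound: each generated token reads the full
# weight set from HBM, so HBM bandwidth / model size bounds tokens/sec.

def max_tokens_per_sec(model_bytes: float, hbm_bw_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode rate when weight reads dominate."""
    return hbm_bw_bytes_per_s / model_bytes

model_bytes = 70e9 * 2  # hypothetical 70B-parameter model in FP16 (2 bytes/param)
hbm_bw = 3.35e12        # hypothetical ~3.35 TB/s of aggregate HBM bandwidth

ceiling = max_tokens_per_sec(model_bytes, hbm_bw)
print(f"~{ceiling:.0f} tokens/s ceiling, set by HBM bandwidth, not FLOPS")
```

Under these assumptions the ceiling is a few dozen tokens per second regardless of how many FLOPS the accelerator can sustain, which is why the articles above treat HBM bandwidth, batching, and memory pooling as the binding constraints for inference economics.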