Compute-Storage Convergence
Breaking Through the "Storage Wall": Three Parallel Paths
36Kr · 2025-12-31 03:35
Core Insights
- The explosive growth of AI and high-performance computing is driving an exponential increase in computing demand, creating a significant challenge known as the "storage wall" [1][2]
- Competition among AI and high-performance computing chips will turn not only on transistor density and frequency but also on memory-subsystem performance, energy efficiency, and integration innovation [1][4]

Group 1: AI and Computing Demand
- The evolution of AI models has driven a dramatic increase in computational requirements, with model parameters rising from millions to trillions and training computation growing more than 10^18-fold over the past 70 years [2][4]
- Computational performance has grown far faster than memory bandwidth, creating a "bandwidth wall" that limits overall system performance [4][7]

Group 2: Memory Technology Challenges
- Traditional memory technologies are struggling to meet unprecedented performance, power, and area (PPA) demands from applications ranging from large language models to edge devices [1][4]
- DRAM bandwidth has grown only about 100x over the past 20 years, compared with a 60,000x increase in hardware peak floating-point performance [4][7]

Group 3: TSMC's Strategic Insights
- TSMC emphasizes that the future evolution of memory technology will revolve around "storage-compute synergy," transitioning from traditional on-chip caches to integrated memory solutions that improve performance and energy efficiency [7][11]
- TSMC is focusing on optimizing embedded memory technologies such as SRAM, MRAM, and DCiM to address the challenges posed by AI and HPC demands [11][33]

Group 4: SRAM Technology
- SRAM is identified as a key technology for high-speed embedded memory, offering low latency, high bandwidth, and low power consumption, making it essential for a wide range of high-performance chips [12][16]
- SRAM area scaling is critical to chip performance, but it faces mounting challenges as technology nodes advance to 2nm [12][17]

Group 5: Computing-in-Memory (CIM)
- CIM architecture integrates computing capability directly into memory arrays, significantly reducing the energy consumption and latency associated with data movement [21][24]
- TSMC believes that DCiM (Digital Computing-in-Memory) has greater potential than ACiM (Analog Computing-in-Memory) because of its compatibility with advanced processes and its flexibility in precision control [26][28]

Group 6: MRAM Technology
- MRAM is emerging as a viable alternative to traditional embedded flash memory, offering non-volatility, high reliability, and endurance, making it well suited to automotive electronics and edge AI [33][35]
- TSMC's N16 FinFET embedded MRAM meets stringent automotive requirements, demonstrating its potential in high-performance applications [39][49]

Group 7: System-Level Integration
- TSMC advocates a system-level approach to memory-technology breakthroughs, emphasizing 3D packaging and chiplet integration to achieve high bandwidth and low latency [50][54]
- Future AI chips may blur the boundary between memory and computation, with innovations in 3D stacking and integrated voltage regulators enhancing overall system performance [60][61]

Group 8: Future Outlook
- Storage technology for AI computing faces a comprehensive innovation revolution, with TSMC's roadmap centered on SRAM, MRAM, and DCiM to overcome the "bandwidth wall" and energy-efficiency challenges [62]
- Full-stack optimization from transistors to systems will be crucial for leading the next era of AI computing [62]
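The compute/bandwidth imbalance described above (roughly 60,000x peak-FLOP growth versus ~100x DRAM-bandwidth growth) is commonly reasoned about with a roofline model: a kernel is memory-bound whenever its arithmetic intensity (FLOPs per byte moved) falls below the machine's compute-to-bandwidth ratio. A minimal sketch in Python, using illustrative accelerator numbers (not figures from the article):

```python
def attainable_flops(peak_flops: float, mem_bw_bytes: float,
                     intensity: float) -> float:
    """Roofline model: achievable FLOP/s for a kernel with a given
    arithmetic intensity (FLOPs per byte of memory traffic)."""
    return min(peak_flops, mem_bw_bytes * intensity)

# Hypothetical accelerator: 1 PFLOP/s peak, 3 TB/s memory bandwidth.
peak = 1e15
bw = 3e12
ridge = peak / bw  # intensity needed to become compute-bound (~333 FLOPs/byte)

# GEMV-style LLM inference: ~2 FLOPs per byte of weights streamed in.
gemv = attainable_flops(peak, bw, 2.0)    # memory-bound: 6e12 FLOP/s
# Large GEMM with heavy operand reuse: e.g. 500 FLOPs/byte.
gemm = attainable_flops(peak, bw, 500.0)  # compute-bound: 1e15 FLOP/s

print(f"ridge point: {ridge:.0f} FLOPs/byte")
print(f"GEMV: {gemv:.1e} FLOP/s ({gemv / peak:.1%} of peak)")
print(f"GEMM: {gemm:.1e} FLOP/s ({gemm / peak:.1%} of peak)")
```

The GEMV case shows why inference leaves compute idle: at 2 FLOPs/byte the chip reaches under 1% of its peak, and no amount of extra compute helps until bandwidth rises.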
Breaking Through the "Storage Wall": Three Parallel Paths
半导体行业观察 · 2025-12-31 01:40
Core Viewpoint
- The article discusses the exponential growth of AI and high-performance computing, highlighting the emerging "storage wall" challenge that limits AI-chip performance due to inadequate memory bandwidth and efficiency [1][2]

Group 1: AI and Storage Demand
- The evolution of AI models has driven a dramatic increase in computational demand, with model parameters rising from millions to trillions and training computation growing more than 10^18-fold over the past 70 years [2]
- A computing system's performance is bounded by its peak compute and its memory bandwidth; hardware peak floating-point performance has increased 60,000x over the past 20 years, while DRAM bandwidth has grown only about 100x [5][8]

Group 2: Memory Technology Challenges
- The rapid growth in computational performance has not been matched by memory-bandwidth improvements, creating a "bandwidth wall" that restricts overall system performance [5][8]
- AI inference is particularly affected: memory bandwidth becomes the dominant bottleneck, leaving computational resources idle as they wait for data [8]

Group 3: Future Directions in Memory Technology
- TSMC emphasizes that memory-technology evolution in the AI and HPC era requires comprehensive optimization across materials, processes, architectures, and packaging [12]
- Future memory architecture will center on "storage-compute synergy," transitioning from traditional on-chip caches to integrated memory solutions that enhance performance and efficiency [10][12]

Group 4: SRAM as a Key Technology
- SRAM is identified as a critical technology for high-performance embedded memory thanks to its low latency, high bandwidth, and energy efficiency, and it is widely used across high-performance chips [13][20]
- TSMC's SRAM technology has evolved through successive process nodes, with ongoing innovation aimed at improving density and efficiency [14][22]

Group 5: Computing-in-Memory (CIM) Innovations
- CIM architecture integrates computing capability directly within memory arrays, significantly reducing data movement and energy consumption [23][26]
- TSMC believes Digital Computing-in-Memory (DCiM) has greater potential than Analog Computing-in-Memory (ACiM) because of its compatibility with advanced processes and flexibility in precision control [28][30]

Group 6: MRAM Developments
- MRAM is emerging as a viable alternative to traditional embedded flash memory, offering non-volatility, high reliability, and endurance, making it well suited to automotive electronics and edge AI [35][38]
- TSMC's MRAM technology meets stringent automotive requirements, delivering robust performance and longevity [41][43]

Group 7: System-Level Integration
- TSMC advocates a system-level approach to memory-compute integration, using advanced packaging such as 2.5D/3D integration to raise bandwidth and reduce latency [50][52]
- Future AI chips may blur the line between memory and compute, with tightly integrated architectures that optimize energy efficiency and performance [58][60]
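The DCiM-versus-ACiM point above can be illustrated with a toy simulation: in a digital CIM array the weights stay resident in the array and the multiply-accumulate is exact integer arithmetic, so only the input vector in and the output sums out cross the array boundary. This is a hypothetical sketch of the concept, not TSMC's actual macro design; the class and method names are invented for illustration:

```python
import numpy as np

class DCiMArray:
    """Toy digital compute-in-memory array: weights are written once and
    reused in place, so per-inference data movement is only the input
    vector in and the output vector out."""

    def __init__(self, weights: np.ndarray):
        self.weights = weights.astype(np.int32)  # resident in the array

    def mac(self, x: np.ndarray) -> np.ndarray:
        # Exact digital multiply-accumulate per row (no analog noise,
        # unlike ACiM, where partial sums accumulate on bit lines).
        return self.weights @ x.astype(np.int32)

    def traffic_bytes(self, x: np.ndarray) -> int:
        # int32 bytes crossing the array boundary per inference.
        rows = self.weights.shape[0]
        return 4 * (x.size + rows)

rng = np.random.default_rng(0)
w = rng.integers(-8, 8, size=(64, 128))
x = rng.integers(-8, 8, size=128)

arr = DCiMArray(w)
y = arr.mac(x)
assert np.array_equal(y, w @ x)  # bit-exact digital result

# A von Neumann baseline must also re-read the whole weight matrix:
print("CIM boundary traffic:", arr.traffic_bytes(x), "bytes")
print("baseline traffic:", arr.traffic_bytes(x) + 4 * w.size, "bytes")
```

Even in this toy setting the weight matrix dominates the baseline's traffic, which is the data movement CIM eliminates; DCiM additionally keeps the result bit-exact, which is what makes it friendly to process scaling and mixed-precision control.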