Memory Wall
SRAM Is Getting Harder
半导体行业观察· 2026-03-27 00:52
Core Viewpoint
- SRAM is a critical component in all computing systems, but it has failed to keep pace with the expansion of logic circuits, leading to increasingly challenging issues, particularly over the past five years [1][4].

Group 1: Memory Wall and SRAM Challenges
- The concept of the "memory wall" was identified as a key bottleneck for future processing capabilities, with memory capacity and performance becoming critical issues [1][4].
- SRAM's capacity and performance improvements have stagnated, so SRAM occupies a growing share of chip area as process nodes shrink, increasing reliance on slower external memory [4][8].
- Processor performance is often limited by memory capacity and bandwidth rather than by compute, with many processors operating at only 20% utilization [7][9].

Group 2: Technological Limitations
- Traditional 6T SRAM cells have reached physical and process-variation limits, hindering further miniaturization and performance improvements [8].
- As process nodes shrink, electrostatic control and random fluctuations become significant constraints, limiting SRAM density gains to less than 15% at advanced 2nm nodes, versus the 50%-100% gains seen at earlier nodes [8][9].
- The gap between memory density growth and logic density growth has been widening since the 1980s, and current computer performance improvements are not matched by memory bandwidth improvements [9].

Group 3: Software Implications
- The reliance on large local SRAM and multi-layer caches in processor architectures is increasingly challenged, as SRAM occupies a larger share of chip area and cost [11].
- Software must adapt to a more complex memory hierarchy, with locality, partitioning, and predictability becoming critical for system-level performance [11][12].
- AI models are particularly affected, as memory bandwidth and on-chip cache become performance bottlenecks, necessitating optimizations in data locality and memory-aware scheduling [12].

Group 4: Alternative Solutions
- The industry is exploring 3D stacking and chiplet designs to address SRAM limitations, allowing higher bandwidth and lower power consumption [13][17].
- Emerging memory technologies such as MRAM and ReRAM are gaining traction, offering scalability and cost advantages, but they are not expected to fully replace SRAM [15][16].
- In-memory and near-memory computing concepts are evolving, signaling a shift away from traditional models as SRAM scaling issues become more pronounced [15].

Conclusion
- The memory bottleneck is becoming increasingly evident, with little sign of change in the short term. SRAM scaling is unlikely to return to previous rates, necessitating the search for alternatives and more efficient use of existing memory [18].
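The 20% utilization figure above is exactly what a roofline model predicts for memory-bound workloads: attainable throughput is the minimum of peak compute and bandwidth times arithmetic intensity. A minimal sketch, using purely illustrative peak-compute and bandwidth numbers (not figures from the article):

```python
# Roofline-style sketch: peak utilization of a memory-bound kernel.
# All hardware numbers below are illustrative assumptions.

def attainable_utilization(peak_flops, mem_bw_bytes, flops_per_byte):
    """Fraction of peak compute reachable given memory bandwidth and
    the kernel's arithmetic intensity (FLOPs per byte moved)."""
    attainable = min(peak_flops, mem_bw_bytes * flops_per_byte)
    return attainable / peak_flops

peak = 1000e12  # hypothetical 1000 TFLOP/s accelerator
bw = 4e12       # hypothetical 4 TB/s memory bandwidth

# A low-intensity kernel (50 FLOPs per byte) is bandwidth-bound:
print(attainable_utilization(peak, bw, 50))   # 0.2 -> 20% of peak
# A high-intensity kernel (500 FLOPs/byte) can saturate compute:
print(attainable_utilization(peak, bw, 500))  # 1.0
```

With these assumed numbers, any kernel below 250 FLOPs/byte leaves compute idle waiting on memory, which is the situation the article describes.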
Domestic Computing Power Breakthrough! Capital Pours into 3D Chips! How Big Is the Market?
是说芯语· 2026-03-02 12:54
Core Viewpoint
- The rapid financing of Sanniao Technology, raising nearly 1 billion yuan in four months, signifies deep market recognition of 3D AI inference chip technology and highlights the global computing industry's transition toward innovation leadership, driven by explosive demand for AI inference power [1][17].

Group 1: 3D Chip Technology and Market Dynamics
- The "memory wall" has long been the core bottleneck limiting AI computing power: chip computing capability has increased 60,000-fold over the past 20 years, while memory bandwidth has increased only 100-fold [2].
- 3D IC technology, using techniques such as through-silicon vias (TSVs) and hybrid bonding, allows vertical stacking of memory and compute chips, breaking through bandwidth limits and achieving high memory bandwidth and integration density [3].
- Sanniao Technology's 3D TokenPU architecture delivers 32 TB/s of 3D DRAM bandwidth, four times that of NVIDIA's B200; its first chip, the A4, achieves inference throughput 1.26-2.19 times that of NVIDIA's H200 while being 30% cheaper and maintaining a gross margin above 60% [3].

Group 2: Market Growth and Policy Support
- The rise of 3D chips is driven by the dual forces of advanced-packaging industry growth and AI computing demand; the global advanced-packaging market is expected to reach roughly $57.1 billion by 2025 and $78.6 billion by 2028, a CAGR of 11.24% [7].
- 2.5D/3D packaging is projected to grow at a CAGR of 18.7% from 2022 to 2028, raising its market share from 21% to 33% [7].
- Domestic policies supporting AI chip development, such as investment subsidies of up to 30%, further enhance growth prospects for 3D chips, aligning with national strategies for self-controlled AI technology [9].

Group 3: Competitive Landscape and Domestic Advantages
- Competition in the 3D chip sector pits international giants such as NVIDIA and AMD, focused on technology iteration, against domestic companies such as Sanniao Technology, which leverage localized scene adaptation and a fully domestic supply chain for competitive advantage [10][12].
- Domestic companies can meet diverse AI application needs, from internet data processing to edge computing, by providing customized solutions that significantly lower R&D and production costs [15].
- Collaboration between Chinese design and packaging companies demonstrates a complete closed loop from concept to mass production, indicating rapid maturation of the domestic 3D chip industry [14].

Group 4: Future Directions and Industry Outlook
- Future 3D chip development will focus on technological iteration, mass-production capability, and ecosystem building, with emphasis on higher bandwidth density, lower latency, and larger storage capacity [16].
- Sanniao Technology's successful financing marks a pivotal moment for the 3D chip industry; continued demand for AI inference power is expected to significantly reduce costs and position China as a leader in the global computing landscape [17].
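The 60,000x-versus-100x figure compounds into very different annual growth rates. A quick check of the implied CAGRs, using only the two headline numbers from the article:

```python
# Implied compound annual growth rates behind the article's headline
# figures: compute up 60,000x and memory bandwidth up 100x in 20 years.

def cagr(total_growth, years):
    """Compound annual growth rate implied by a total growth factor."""
    return total_growth ** (1 / years) - 1

compute_cagr = cagr(60_000, 20)    # ~73% per year
bandwidth_cagr = cagr(100, 20)     # ~26% per year
print(f"compute:   {compute_cagr:.1%}/yr")
print(f"bandwidth: {bandwidth_cagr:.1%}/yr")

# Net result: the compute/bandwidth gap widened 600x over the period.
print(60_000 / 100)  # 600.0
```

A ~73%/yr versus ~26%/yr divergence is why the gap keeps compounding rather than stabilizing, which is the motivation for the 3D stacking approaches the article describes.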
A Stealth-Mode AI Chip Company Completes 1 Billion Yuan Financing, Targeting Large-Model Inference
暗涌Waves· 2026-02-13 00:57
Core Viewpoint
- The article discusses the rapid development and funding of 3D AI chip company 算苗科技 (Suanmiao Technology), which has completed two financing rounds totaling nearly 1 billion RMB, aimed at developing domestically produced 3D computing chips for AI applications [3][10].

Group 1: Company Overview
- 算苗科技 focuses on research and development of 3D computing chips, with its core product a customized chip for AI model inference [4].
- The company aims to address the "memory wall" that limits AI model computation, as current AI chips suffer significant inefficiency from memory-bandwidth constraints [4][5].
- 算苗科技's A4 chip has demonstrated throughput 1.26 to 2.19 times that of NVIDIA's H200 on inference tasks for major open-source models [5].

Group 2: Funding and Market Position
- The recent funding rounds were led by prominent investors including Source Code Capital and Shixi Capital, indicating strong market interest in and support for the company's vision [3][10].
- The company is positioned to leverage its expertise in 3D IC technology to build a competitive edge in the AI chip market, which is expected to grow significantly [10][19].

Group 3: Technological Innovation
- 算苗科技 uses a 3D stacked architecture that provides significantly higher memory bandwidth (up to 32 TB/s), which is crucial for AI model inference [4][13].
- The company's approach contrasts with traditional GPU architectures, focusing on specialized ASIC designs that optimize performance for specific tasks rather than general-purpose computing [14][15].

Group 4: Strategic Focus
- The company has chosen to concentrate on AI model inference rather than training, anticipating that 90% of future AI computing demand will be for inference [15][18].
- 算苗科技 believes the future of AI computing lies in architectural innovation, particularly 3D stacking and ASIC optimization, aligning with the growing demand for efficient computing solutions [28][29].
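A common back-of-envelope shows why inference is bandwidth-bound: each autoregressively decoded token must stream roughly all model weights from memory, so bandwidth caps per-stream throughput. A sketch under assumed inputs; only the 32 TB/s figure comes from the article, while the 70B-parameter FP16 model and the 4 TB/s baseline are illustrative assumptions:

```python
# Bandwidth-bound upper limit on autoregressive decode throughput.
# Simplified: ignores KV-cache traffic, batching, and compute time.

def decode_tokens_per_sec(bandwidth_bytes, n_params, bytes_per_param=2):
    """Tokens/s ceiling for one decode stream: every token requires
    reading all weights once, so throughput <= bandwidth / weight bytes."""
    weight_bytes = n_params * bytes_per_param
    return bandwidth_bytes / weight_bytes

# Assumed 70B-parameter model in FP16 (2 bytes/param):
for name, bw in [("4 TB/s (HBM-class, assumed)", 4e12),
                 ("32 TB/s (3D DRAM, per article)", 32e12)]:
    limit = decode_tokens_per_sec(bw, 70e9)
    print(f"{name}: <= {limit:,.0f} tokens/s per stream")
```

Under this model, throughput scales linearly with bandwidth, which is why an 8x bandwidth advantage translates directly into headroom for inference workloads.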
A Photonic Chip Company Raises $220 Million
半导体芯闻· 2026-02-12 10:37
Core Insights
- Olix Computing, a UK-based startup, is developing AI chips with integrated optical components; it has raised $220 million in funding at a valuation of over $1 billion [1][4].
- The company's chips are optimized for AI inference, addressing the "memory wall" associated with external HBM memory [2][3].

Group 1: Funding and Valuation
- Olix Computing has completed a funding round led by Hummingbird Ventures, raising $220 million [1].
- The company is now valued at over $1 billion following this round [1].

Group 2: Technology and Innovation
- Olix's chips use an "innovative memory and interconnect architecture," indicating the use of photonic components for data transmission within the processor [1].
- Optical interconnect technology offers higher bandwidth and lower power consumption than traditional electrical signaling [2].
- Olix's design avoids HBM, opting instead for SRAM that is faster and integrated within the AI chip, reducing data-transfer latency [2][3].

Group 3: Product Development and Market Strategy
- The chip, named the Optical Tensor Processing Unit (OTPU), is designed to optimize tensor operations, which are central to AI models [3].
- Olix is also developing a compiler to adapt existing AI models for its chips, with plans to deliver OTPU chips to customers starting next year [4].
Photonic AI Chip Startup Olix Secures $220 Million Investment
Sou Hu Cai Jing· 2026-02-12 09:16
Core Insights
- Olix Computing Ltd. has developed an AI chip with integrated optical components and recently secured $220 million in funding, raising its valuation above $1 billion [2].
- The chip is optimized for inference tasks and uses a novel memory and interconnect architecture, leveraging photonic components for data transmission [2][3].
- The company aims to address the "memory wall" by using SRAM instead of HBM, allowing faster data transfer and improved performance [3].

Funding and Valuation
- The recent funding round was led by Hummingbird Ventures, with previous investments from Plural, Vertex Ventures, LocalGlobe, and Entrepreneurs First [2].
- Olix's valuation has surpassed $1 billion following this funding [2].

Technology and Innovation
- Olix's chip, the OLIX Optical Tensor Processing Unit (OTPU), is designed to handle tensors, the mathematical objects used in AI models [4].
- The chip may also include circuits optimized for workloads beyond tensors, similar to Google's tensor processing units [4].
- Photonic technology is expected to deliver higher throughput and lower power consumption than traditional silicon SRAM architectures [3][4].

Market Readiness
- Olix plans to begin shipping OTPU chips to customers next year and is developing a compiler to adapt existing AI models to its chip [5].
DRAM Crisis: No Short-Term Fix
半导体行业观察· 2026-02-11 01:27
Core Insights
- The current surge in DRAM demand is primarily driven by the needs of artificial intelligence (AI) data centers, leading to a price increase of 80% to 90% in DRAM this quarter [2].
- The ongoing supply shortage results from the cyclical nature of the DRAM industry, exacerbated by the rapid expansion of AI hardware infrastructure [2][8].
- High Bandwidth Memory (HBM) technology is crucial for meeting the demands of AI applications, but it comes at high cost, often three times that of other memory types [6][14].

Group 1: Supply and Demand Dynamics
- The DRAM industry is characterized by boom-and-bust cycles, with new wafer fabs requiring capital investment of over $15 billion and 18 months or more to become operational [8].
- The COVID-19 pandemic triggered a supply panic, leading major data-center operators to stockpile memory and storage devices, which initially drove prices up [8].
- As demand stabilized and data-center expansion slowed in 2022, prices plummeted, prompting major companies such as Samsung to cut production by 50% to keep prices from falling below manufacturing cost [8][9].

Group 2: AI Data Center Growth
- There is a stark contrast between the lack of new investment in memory production and surging demand for new data centers, with nearly 2,000 new data centers planned or under construction globally [12].
- McKinsey predicts that by 2030 companies will invest $7 trillion in data-center construction, with $5.2 trillion allocated specifically to AI data centers [12].
- NVIDIA has emerged as the biggest beneficiary of the AI data-center boom, with its data-center revenue skyrocketing from under $1 billion in Q4 2019 to $51 billion by Q4 2025 [12][14].

Group 3: HBM Technology and Costs
- HBM, which integrates multiple DRAM chips in a 3D stack, is essential for overcoming the "memory wall" that limits the performance of large language models [5][6].
- HBM can account for 50% or more of the total cost of a GPU, making it a significant factor in the overall expense of AI hardware [6][14].
- Micron forecasts that the HBM market will grow from $35 billion in 2025 to $100 billion by 2028, indicating a substantial increase in demand that will outstrip supply [14].

Group 4: Future Supply Solutions
- To address DRAM supply issues, the industry is focusing on innovation and building more fabs, but these efforts will take time to affect prices [17].
- Major players Micron, Samsung, and SK Hynix are investing in new fabs, but these projects are unlikely to lower prices significantly in the near term [17][18].
- Advanced packaging technologies and closer collaboration between memory suppliers and AI chip designers are seen as key to increasing supply efficiency [17].
This Chip Could Break Through the Memory Barrier
半导体行业观察· 2026-02-10 01:14
Core Viewpoint
- Researchers at the University of California, San Diego have developed a new type of resistive random-access memory (RRAM) that could overcome the "memory wall" in artificial intelligence by allowing computation to occur within the memory itself [2][3].

Group 1: RRAM Technology
- Traditional RRAM relies on forming low-resistance filaments in a high-resistance dielectric, which requires high voltages and is prone to noise and randomness, making it unsuitable for integration in processors [3].
- The new RRAM design eliminates the need for filaments, allowing the entire layer's resistance to switch between high and low states, simplifying manufacturing and enhancing performance [3][4].

Group 2: Device Performance
- The new RRAM devices have been scaled down to 40 nanometers and can be stacked up to eight layers, achieving 64 distinct resistance values with a single voltage pulse, a significant improvement over traditional filament-based RRAM [4].
- The resistance of the new stacked cells reaches the megaohm level, which benefits parallel computation, unlike traditional RRAM limited to kilohm levels [4].

Group 3: Application and Testing
- The research team tested a 1-kilobyte array of the new RRAM with continual-learning algorithms, achieving 90% classification accuracy on data from wearable sensors, comparable to digital neural networks [5].
- Potential applications include neural-network models on edge devices that must learn from their environment without cloud access [5].

Group 4: Challenges and Future Prospects
- While the new RRAM shows room-temperature data retention comparable to flash memory, its performance in high-temperature environments remains uncertain, posing a challenge for practical deployment [5].
- If validated, this technology could address the growing memory bottleneck faced by large AI models, enabling models to run directly in memory [6].
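The compute-in-memory idea behind such RRAM arrays can be sketched as an analog crossbar: weights are stored as cell conductances, input voltages drive the rows, and each column current is a dot product by Ohm's and Kirchhoff's laws. A toy pure-Python model; the 6-bit (64-level) quantization mirrors the 64 resistance values reported above, while the weights and inputs themselves are made up for illustration:

```python
# Toy model of an analog RRAM crossbar matrix-vector multiply.
# Weights and inputs are illustrative; 64 levels ~ 6-bit cells.

def quantize(weights, levels=64, w_max=1.0):
    """Snap ideal weights onto a limited grid of conductance levels,
    mimicking a cell that stores one of `levels` resistance states."""
    step = 2 * w_max / (levels - 1)
    return [[round(w / step) * step for w in row] for row in weights]

def crossbar_mvm(conductances, voltages):
    """Column currents of a crossbar: I_j = sum_i V_i * G_ij,
    i.e. the matrix-vector product computed 'inside' the array."""
    cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(cols)]

weights = [[0.31, -0.52], [0.87, 0.14], [-0.45, 0.66]]
g = quantize(weights)                    # programmed cell states
out = crossbar_mvm(g, [1.0, 0.5, -1.0])  # one "read" = one MVM
print(out)
```

The point of the physical version is that the multiply-accumulate happens in a single read operation without moving the weights at all, which is what sidesteps the memory wall.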
China's Inference Chip Breakout and Cost Revolution: Breaking the "Memory Wall," CUDA Compatibility
21 Shi Ji Jing Ji Bao Dao· 2026-02-04 09:09
Core Insights
- The article discusses the shift in global AI computing focus from training to inference, signaling a competitive race for cost-effective, energy-efficient chips [1][2].
- The industry consensus is that inference chips will dominate AI evolution over the next five to ten years, with companies like Google and Nvidia leading the charge [1][3].
- CloudWalk Technology has announced a strategic focus on AI inference chips, aiming to dramatically reduce the cost of processing tokens, which are becoming a core productivity driver in the AI landscape [2][3].

Industry Trends
- Demand has shifted from reliance on high-performance GPUs to a pressing need for high cost-performance inference chips [2].
- The past year has seen a dramatic increase in the computational requirements of large models, with token-processing needs growing hundreds of times, highlighting the importance of inference over training [2][3].
- Nvidia's strategic acquisition of Groq's core assets for $20 billion reflects the growing importance of inference chips, with Groq's valuation skyrocketing from $7 billion to $20 billion in just four months [3].

Company Strategy
- CloudWalk Technology's CEO, Chen Ning, emphasizes the goal of cutting the cost of processing one million tokens by 100 times, aiming for a transformative impact on industrial productivity by 2030 [3][4].
- The company is developing a new processor architecture, GPNPU, designed to optimize inference for large models while addressing cost, efficiency, and deployment challenges [5][6].
- The GPNPU architecture aims to maintain compatibility with existing CUDA programs, lowering the barrier to integration into production systems [5][6].

Product Development
- CloudWalk Technology plans to launch the DeepVerse 100, 200, and 300 series chips over the next five years, targeting major clients across various industries [6].
- The company is pursuing modular chip design through a "power building block" approach, allowing scalable and flexible computing solutions [6].
- The company has established strong domestic production capacity, ensuring supply-chain security for large-scale chip production and delivery [6].
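Cost per million tokens, the metric CloudWalk targets, decomposes into amortized hardware plus energy, divided by tokens served. A hedged back-of-envelope sketch; every input below is a hypothetical placeholder, not a CloudWalk figure:

```python
# Back-of-envelope "cost per million tokens" model. All inputs are
# illustrative assumptions, not figures from CloudWalk or the article.

def cost_per_million_tokens(chip_price, lifetime_years, power_watts,
                            electricity_per_kwh, tokens_per_sec,
                            utilization=0.5):
    """Amortized hardware + energy cost to serve one million tokens."""
    seconds = lifetime_years * 365 * 24 * 3600
    total_tokens = tokens_per_sec * utilization * seconds
    energy_kwh = power_watts / 1000 * seconds / 3600
    total_cost = chip_price + energy_kwh * electricity_per_kwh
    return total_cost / total_tokens * 1e6

# Hypothetical accelerator: $20k chip, 4-year life, 700 W,
# $0.10/kWh electricity, 5,000 tokens/s at 50% utilization.
print(cost_per_million_tokens(20_000, 4, 700, 0.1, 5_000))
```

The model makes the 100x lever visible: raising sustained tokens/s (or utilization) cuts cost proportionally, while chip price and power only matter through the numerator.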
100 Memory Sticks for an Apartment: AI Is Devouring Global Memory, and Ordinary People Soon Won't Be Able to Afford a PC
36Ke· 2026-01-20 07:22
Core Insights
- The tech industry faces a significant crisis due to a "memory wall" that is limiting AI's growth despite high expectations for computational power [1][2][10].
- Demand for DRAM (dynamic random-access memory) is expected to surge, with prices projected to rise 88% in 2026, driven by the insatiable needs of AI data centers [2][8].
- The memory crisis is causing a supply shortage for consumer electronics, leading to higher prices for devices such as computers and smartphones [5][10].

Group 1: Memory Crisis and Price Surge
- The price of DDR5 memory has risen roughly 307% since September 2025, with high-capacity server memory modules reaching 400,000 yuan per 100 units [6][8].
- Citibank has revised its DRAM price forecast upward, now predicting an 88% increase in 2026, up from a previous estimate of 53% [2][8].
- Demand from AI giants such as OpenAI and Google is consuming a significant share of memory production, creating scarcity in the consumer market [5][10].

Group 2: Impact on Consumers and Market Dynamics
- Consumers face higher prices for new computers and smartphones as the supply of consumer-grade memory dwindles [5][10].
- PC vendors are prioritizing supply for large OEMs, reducing availability for third-party module manufacturers [10].
- The ongoing memory crisis is perceived as a "resource tax" that consumers are forced to pay for the advancement of AI technology [10].

Group 3: Technological Implications and Future Outlook
- The growth of AI models is outpacing advances in memory bandwidth, creating a bottleneck that could hinder further AI development [13][14].
- Innovations such as High Bandwidth Memory (HBM) and new architectures like CXL and PIM are being explored to overcome the memory wall [18][19].
- The trend indicates that the era of affordable, abundant memory is ending, with implications for both consumers and AI companies [19][20].
Storage Surges: How Powerful Is the AI Storage Super-Cycle, Really?
36Ke· 2026-01-06 12:19
Core Insights
- The storage industry is in a significant upcycle driven by AI demand, extending from HBM into traditional storage sectors, with Micron's gross margin reaching a historical high of 66-68% guidance for the next quarter, indicating a stronger cycle than previous ones [1][3].

Group 1: AI Demand and Storage Market Dynamics
- Price increases for storage products reflect the market's supply-demand balance, driven primarily by AI server demand [3].
- The current AI storage cycle is characterized by a shift in focus from training to inference, producing differentiated demand for "low latency, high capacity, and high bandwidth" storage [3][14].
- The three major manufacturers (Samsung, SK Hynix, Micron) are prioritizing capital expenditure on HBM and DRAM, resulting in structural supply-demand imbalances and significant price increases [3][6].

Group 2: Role of Different Storage Types in AI Servers
- HBM serves as the "performance ceiling" of an AI server, a high-bandwidth, high-power product that directly determines model scale and response speed [11].
- DRAM (DDR5) acts as the data-exchange hub connecting HBM and NAND, crucial for handling concurrent tasks in AI servers [12].
- NAND (SSD) functions as a fast persistence layer for frequently accessed data, while HDD serves as a low-cost container for large volumes of cold data [12][14].

Group 3: Addressing the "Memory Wall" Challenge
- The "memory wall" bottleneck arises from the disparity between computing speed and data-transfer speed, leading to high GPU idle rates [5][16].
- Proposed solutions include upgrading HBM to 16-Hi stacks to raise bandwidth and deploying 3D stacked SRAM to reduce latency [18][19].
- Integrating compute into storage (compute-in-memory) is anticipated to be the long-term solution to the "memory wall" problem [21].

Group 4: HBM Market Supply and Demand
- HBM demand is closely tied to AI chip shipments, with HBM supply expected to increase by over 60% by 2026 on the back of significant capital investment by the three major manufacturers [6][24].
- The three manufacturers' combined monthly HBM production capacity is projected to rise from roughly 390,000 wafers to 510,000 wafers by the end of 2026, translating to an estimated supply of 41.9 billion GB of HBM [29][34].
- The HBM market is expected to be in a "tight balance" in 2026, with demand estimated at around 42 billion GB, indicating a competitive landscape among manufacturers [39][40].
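The wafer-to-gigabyte arithmetic can be reverse-engineered from the figures above: 510,000 wafers/month and an estimated 41.9 billion GB of supply jointly imply a certain GB output per wafer. A quick check; the 3 GB (24 Gb) die size used at the end is an assumed value for illustration, not a figure from the article:

```python
# Reverse-engineering the implied GB-per-wafer behind the estimate
# "~510k HBM wafers/month -> ~41.9 billion GB of annual supply".

wafers_per_month = 510_000       # projected end-2026 capacity (article)
annual_wafers = wafers_per_month * 12
cited_supply_gb = 41.9e9         # estimated annual supply (article)

gb_per_wafer = cited_supply_gb / annual_wafers
print(f"implied output: {gb_per_wafer:,.0f} GB per wafer")

die_gb = 3  # assumed 24 Gb DRAM die capacity (illustrative)
print(f"=> ~{gb_per_wafer / die_gb:,.0f} good dies per wafer")
```

The implied ~6,800 GB per wafer is a useful sanity check: any supply forecast built from wafer starts rests on assumptions about die capacity and yield of roughly this magnitude.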