An Analysis of Supernode Technology and Market Trends
傅里叶的猫· 2025-09-28 16:00
Core Insights
- The article discusses the collaborations and solutions emerging in the supernode field, highlighting the major players and their respective market strategies [3][4]

Supernode Collaboration and Solutions
- Major CSP manufacturers are seeking customized server cabinet products from server suppliers, with a focus on NV solutions [4]
- Key supernode solutions in China include Tencent's ETH-X, NV's NVL72, Huawei's Ascend CM384, and Alibaba's Panjiu, which are either being promoted or already have customers [4]
- ByteDance is planning an Ethernet-based solution for large models, built primarily on Broadcom's Tomahawk, but it has not yet been promoted [4]
- Tencent's ETH-X is developed with Broadcom and Amphenol, using Tomahawk switches and PCIe switches for GPU traffic management [5]
- The main applications of these solutions differ: CM384 focuses on training and large-model computation, while ETH-X leans toward inference [5]

Market Share and Supplier Landscape
- Supernode solutions have not yet captured a significant market share; traditional AI servers are dominated by Inspur, H3C, and others [6]
- Starting September 16, CSPs including BAT were restricted from purchasing NV's export-compliant cards, driving a shift toward domestic cards, which are expected to reach a 30%-40% share in the coming years [6]
- The overseas market share of major internet companies such as Alibaba and Tencent remains small, and ByteDance's overseas-to-domestic ratio is projected to improve [6]

Vendor Competition and Second-Tier Landscape
- Inspur remains competitive on cost and pricing, while the competition for second and third place among suppliers is less clear [8]
- Second-tier internet companies have smaller demand, and mainstream suppliers are not actively pursuing this segment [9]
- The domestic AI ecosystem lags international developments, with significant advances expected by 2027 [9][10]

Procurement and Self-Developed Chips
- Tencent and Alibaba prefer NV cards when available, with current NV-to-domestic-card ratios of 3:7 for Alibaba and 7:3 for ByteDance [10]
- The trend toward supernodes is driven by the need for more computing power and lower latency, with large-scale demand expected in the future [10]

Economic and Technical Aspects
- AI servers carry higher gross margins for major manufacturers than general-purpose servers [11]
- The introduction of software solutions is expected to enhance profitability, with significant profit increases anticipated from supernode deployments [11]
Alibaba's Panjiu Supernode and Its Supply Chain
傅里叶的猫· 2025-09-27 10:14
Yesterday, on the last day of Alibaba's Yunqi (Apsara) Conference, I made a special trip from Shanghai to Hangzhou to see Alibaba's supernode in person and shot a video on site, though the venue was chaotic and extremely crowded. The rack layout on display, from the top of the rack downward (the listing is truncated):

| U position | Contents |
| --- | --- |
| 42 | |
| 41 | |
| 40 | |
| 39 | |
| 38 | |
| 37 | ipmi0002 |
| 36 | ipmi0001 |
| 35 | |
| 34 | 1U Power Shelf 33kW |
| 33 | 1U Power Shelf 33kW |
| 32 | 1U Compute Tray |
| 31 | 1U Compute Tray |
| 30 | 1U Compute Tray |
| 29 | 1U Compute Tray |
| 28 | 1U Compute Tray |
| 27 | 1U Compute Tray |
| 26 | 1U Compute Tray |
| 25 | 1U Compute Tray |
| 24 | 1U Compute Tray |
| 23 | 1U Compute Tray |
| 22 | 1U Non-Scalable NvSwitch5 Tray ... |
Microsoft's New Liquid Cooling Technology; Alibaba Raises Capital Expenditure
傅里叶的猫· 2025-09-24 12:37
Group 1
- Microsoft's new microfluidic liquid cooling technology is a significant topic of discussion in the market, showcasing an aggressive approach that brings cooling to the silicon level rather than stopping at the package [1][3]
- Alibaba announced an increase in capital expenditure to RMB 380 billion, indicating a strong trend toward investment in AI chips, particularly in light of NVIDIA's USD 100 billion investment in OpenAI [9][10]
- The collaboration between Alibaba and Haiguang to establish a joint venture for a large-scale cluster with 110,000 computing chips marks a shift from business collaboration to capital binding [11]

Group 2
- The penetration rate of AI chatbots is rising rapidly, with global AI investment reaching 400 billion in the past year and expected to exceed 4 trillion over the next five years, indicating strong capital inflow into the industry [12]
- Haiguang's latest BW 1000 GPU posts significant performance figures, with FP64 at 30 TFLOPS and FP32 at 60 TFLOPS, positioning it competitively against NVIDIA's H100 (a comparison sketch follows this summary) [13]
- Haiguang's HSL technology aims to enhance ecosystem compatibility and improve CPU-GPU connection efficiency, potentially easing entry into the internet sector and building influence [14][15]
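For context on the H100 comparison above, a minimal sketch: the BW 1000 numbers are the ones quoted in the summary, while the H100 figures are NVIDIA's public SXM spec-sheet vector (non-tensor) rates, added here for reference and not taken from the article; treat them as approximate.

```python
# Compare the quoted BW 1000 figures with NVIDIA H100 SXM spec-sheet numbers.
# H100 values are public vector (non-tensor) rates added for context; they are
# not from the article and should be treated as approximate.
tflops = {
    # name:        (FP64, FP32) in TFLOPS
    "BW 1000":     (30.0, 60.0),   # figures quoted above
    "H100 (SXM)":  (34.0, 67.0),   # NVIDIA public spec sheet, approximate
}

bw, h100 = tflops["BW 1000"], tflops["H100 (SXM)"]
print(f"FP64: BW 1000 reaches {bw[0] / h100[0]:.0%} of H100")  # ~88%
print(f"FP32: BW 1000 reaches {bw[1] / h100[1]:.0%} of H100")  # ~90%
```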
Analyzing the Impact of NVIDIA's USD 100 Billion Investment
傅里叶的猫· 2025-09-23 02:41
Waking up this morning, the market was already in an uproar: NVIDIA is investing USD 100 billion in OpenAI.

An increasingly competitive AI industry

The AI chip roadmap of the overseas giants shown below is even more aggressive than the domestic one; essentially every vendor releases one or two new products per year.

This USD 100 billion investment agreement between NVIDIA and OpenAI is not a simple cash injection. It will be realized through the phased deployment of 10 GW of AI data centers, with the first phase coming online in the second half of 2026 on NVIDIA's Vera Rubin platform.

NVIDIA's investment motives

1. Locking in customer demand and supply-chain dominance. With OpenAI as a leader in the AI field, NVIDIA's investment ensures that OpenAI prioritizes its chips, forming a "funding loop": NVIDIA provides the capital, and OpenAI uses it to buy NVIDIA hardware. This not only secures NVIDIA's sales demand but also keeps OpenAI from switching to competitors such as Google's TPUs or AMD's MI-series chips. There are various interpretations of this "funding loop" online; whether it runs through Oracle or through Microsoft and CoreWeave, it benefits both NVIDIA and OpenAI.

(Figure: "The Infinite Money Glitch", showing the USD 100 billion circular flow between OpenAI and NVIDIA)

2. This partnership marks NVIDIA's shift from chip supplier ...
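For scale, a back-of-envelope conversion of the 10 GW figure above into accelerator counts; this is a minimal sketch, and the per-accelerator power budget is an assumption on my part rather than a figure from the article.

```python
# Back-of-envelope: how many accelerators fit in a 10 GW deployment?
# The per-accelerator power budget (including cooling/networking overhead) is
# an assumption for illustration, not a figure from the article.
total_power_w = 10e9                       # 10 GW of AI data centers
for per_accel_w in (1500, 1750, 2000):     # assumed all-in watts per accelerator
    count = total_power_w / per_accel_w
    print(f"{per_accel_w} W/accel -> ~{count / 1e6:.1f}M accelerators")
```

Under these assumptions, 10 GW corresponds to on the order of five to seven million accelerators deployed over the life of the agreement.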
The Upward Trend in the Memory Market
傅里叶的猫· 2025-09-22 15:35
Core Viewpoint
- The article discusses the recent price increases in the memory market, particularly in storage devices, driven by changes in supply and demand dynamics, with a notable focus on the impact of AI applications on demand growth [4][9][10]

Price Expectations
- Recent price forecasts for the storage market have been revised upwards, with LPDDR5 contract prices expected to rise by 6-8%, LPDDR4 by 40-50%, and NAND Flash by 15% (a worked sketch follows this summary). Surprisingly, prices are expected to remain high even in the traditionally weak first quarter of 2026, indicating a significant shift in the market's supply-demand structure [4]

Supply Side Analysis
- Manufacturers are strategically shifting focus away from DDR4/LPDDR4 production toward higher-end products such as DDR5 and HBM, reducing DDR4/LPDDR4 capacity. High-end capacity is fully utilized, while NAND utilization remains below 80% with no large-scale expansion plans, leaving supply elasticity severely limited [8]

Demand Side Analysis
- Demand for storage devices is primarily driven by mobile phones, PCs, and servers, with servers accounting for about 30% of demand. The shift in AI workloads from training to inference is driving explosive growth in demand for LPDDR5X, DDR5, HBM, and enterprise SSDs [9][10]

Comparison with Previous Market Cycles
- The current memory cycle resembles the 2016-2018 cycle, with both featuring sharp price surges and production cuts by major manufacturers. However, the underlying drivers differ: the current cycle is fueled by structural demand from AI applications rather than cyclical demand from smartphones and cloud computing [11][12]

Differences in Demand Drivers
- The previous cycle was characterized by a broad increase in demand from smartphone upgrades and cloud computing, while the current cycle is driven by structural, explosive demand from AI applications that require higher-performance storage [13]

Differences in Supply Adjustment Logic
- Previous supply adjustments were reactive and aimed at clearing inventory, whereas current adjustments are proactive, with manufacturers permanently reallocating capacity to higher-margin products, creating a long-term supply gap in traditional memory products [14]

Sustainability of the Current Market Cycle
- The previous cycle's demand was closely tied to macroeconomic conditions and consumer electronics and declined as the smartphone market saturated. In contrast, current demand is driven by the AI technology wave, providing a more stable, longer-term demand base [15]

Bernstein's Perspective
- Bernstein notes that the short-term NAND price increases are driven by rising AI inference demand and HDD shortages, but it is cautious on NAND's long-term outlook given potential supply increases or demand declines. It remains more optimistic on DRAM and HBM [17]

NAND Market Dynamics
- Short-term NAND price increases stem from heightened AI inference demand and HDD shortages, with suppliers raising prices by 10%-30%. Bernstein anticipates a slight ASP decline in 2025, followed by a 13% increase in 2026, but expects prices to fall in late 2026 as new supply comes online [18]

HBM and DRAM Market Outlook
- Bernstein remains optimistic about the HBM and DRAM markets, predicting a 53% year-on-year increase in HBM shipments in 2026, with costs decreasing more than expected. Major suppliers are expected to benefit from market expansion despite competitive pressures [19]
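To make the contract-price forecasts above concrete, here is a minimal sketch: the percentage ranges are the ones quoted in the summary, while the base contract prices are hypothetical placeholders chosen only for illustration.

```python
# Apply the quoted contract-price increase ranges to hypothetical base prices.
# Base prices are placeholders for illustration; only the percentage ranges
# are taken from the forecasts above.
forecasts = {                 # (low, high) expected contract-price increase
    "LPDDR5":     (0.06, 0.08),
    "LPDDR4":     (0.40, 0.50),
    "NAND Flash": (0.15, 0.15),
}
hypothetical_base_price = {   # USD, placeholder contract prices (not from the article)
    "LPDDR5": 100.0,
    "LPDDR4": 60.0,
    "NAND Flash": 80.0,
}
for part, (lo, hi) in forecasts.items():
    base = hypothetical_base_price[part]
    print(f"{part:10s}: {base:.0f} -> {base * (1 + lo):.0f}-{base * (1 + hi):.0f} USD")
```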
Over the Weekend, the Google OCS Story Continued to Build
傅里叶的猫· 2025-09-21 12:05
Core Viewpoint
- OCS (Optical Circuit Switch) technology is still in its early stages in the data center sector, with Google being the only company to achieve large-scale procurement so far. Other major companies are exploring the technology, indicating growing interest and potential market expansion [5][7][10]

Summary by Sections

OCS Development and Adoption
- Google began exploring OCS technology in 2017-2018 and has now entered large-scale deployment, using a 3D Torus network architecture to connect thousands of TPUs [6][7]
- Other major companies such as Microsoft and NVIDIA are also testing OCS applications, although they have not yet reached Google's scale [7][9]

Market Potential
- The current OCS market is estimated at around USD 6 billion with approximately 15,000 units in use. Projections suggest that by 2030 the market could exceed USD 20 billion with at least 50,000 units deployed (a sanity-check sketch follows this summary) [11][12]
- Demand for OCS is expected to grow significantly, particularly in AI supernode networks, which currently account for over 50% of OCS applications [18][19]

Technical Routes and Challenges
- There are three main technical routes for OCS: MEMS, liquid crystal on silicon (LCoS), and piezoelectric ceramic, each with its own advantages and disadvantages [12][13][14]
- The MEMS route is currently used by Google but has reliability concerns due to its moving parts. The LCoS route is favored by NVIDIA and Microsoft for its high reliability and low cost [12][13]

Competitive Advantages
- OCS offers high bandwidth, low latency, and low power consumption, making it suitable for specific applications such as emergency network connections and DCI (Data Center Interconnect) [8][9][10]
- Its ability to create stable optical switching channels aligns well with the predictable traffic patterns in data centers, allowing it to replace traditional electrical switches in certain scenarios [10][11]

Future Outlook
- The growth of OCS will depend on overcoming current limitations, such as increasing port counts and reducing switching latency from milliseconds to microseconds or nanoseconds [18][19]
- The maturity of OCS vendors and their ability to provide reliable solutions will also play a crucial role in adoption and market growth [19]
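A quick sanity check on the market-sizing figures above, using only the numbers quoted in the summary: dividing market value by unit count suggests the implied value per OCS unit stays roughly flat, so the projected growth to 2030 is driven almost entirely by volume rather than by per-unit price.

```python
# Implied per-unit value of OCS, using only the figures quoted above.
current_market_usd = 6e9        # ~USD 6 billion market today
current_units = 15_000          # ~15,000 units in use
future_market_usd = 20e9        # > USD 20 billion projected by 2030
future_units = 50_000           # >= 50,000 units projected by 2030

print(f"today: ~${current_market_usd / current_units / 1e3:.0f}K per unit")  # ~$400K
print(f"2030 : ~${future_market_usd / future_units / 1e3:.0f}K per unit")    # ~$400K
print(f"unit growth : {future_units / current_units:.1f}x")                  # ~3.3x
print(f"value growth: {future_market_usd / current_market_usd:.1f}x")        # ~3.3x
```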
A Chat About Hollow-Core Fiber
傅里叶的猫· 2025-09-20 11:26
The following article is from More Than Semi (semiconductor industry research), by 猫叔.

Hollow-core fiber has been getting a lot of attention lately, driven mainly by two catalysts. One is Broadcom's Ethernet white paper, which mentions testing of hollow-core AOCs and scale-up scenarios. The other is NVIDIA's Spectrum-XGS Ethernet technology, which proposes a scale-across approach emphasizing multi-data-center interconnection, where the core bottlenecks are bandwidth and latency. Hollow-core fiber happens to be a high-bandwidth, low-latency tool that fits these needs perfectly. In this article, we dig deeper into hollow-core fiber.

The optical cable market

R&D on hollow-core fiber began at the University of Southampton in the UK in 2016 and entered pilot deployments by 2019. In 2022, Microsoft acquired Lumenisity and applied its technology to data-security scenarios such as medical centers and key protection. Domestically, YOFC (长飞光纤光缆) is at the global forefront. In June 2024, YOFC partnered with China Telecom to build a 620 km transmission line, and together with China Mobile built an 800G experimental network from Shenzhen to Dongguan, validating hollow-core fiber's excellent performance in speed, loss, and attenuation. Currently, domestic hollow-core fiber deployment is roughly 1,000 fiber-km, and the market is still expanding rapidly.

Hollow-core fiber is especially suitable for latency- and bandwidth-critical scenarios such as AI data centers and dedicated financial trading lines. For example, in supercomputing centers for AI training, hollow- ...
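The low-latency claim above can be made concrete with a propagation-delay calculation; this is a minimal sketch, and the refractive indices are textbook approximations rather than figures from the article: light in conventional solid-core silica fiber travels at roughly c/1.46, while in hollow-core fiber it travels close to the vacuum speed of light.

```python
# Back-of-envelope propagation-delay comparison: solid-core vs hollow-core fiber.
# Assumed refractive indices (textbook approximations, not from the article):
#   solid-core silica fiber  n ~= 1.46
#   hollow-core fiber        n ~= 1.003 (light travels essentially in air)
C = 299_792.458  # speed of light in vacuum, km/s

def delay_us_per_km(n: float) -> float:
    """One-way propagation delay per kilometer, in microseconds."""
    return n / C * 1e6

solid = delay_us_per_km(1.46)    # ~4.87 us/km
hollow = delay_us_per_km(1.003)  # ~3.35 us/km
print(f"solid-core : {solid:.2f} us/km")
print(f"hollow-core: {hollow:.2f} us/km")
print(f"latency reduction: {(1 - hollow / solid) * 100:.0f}%")  # ~31%
```

Over an illustrative 50 km metro-scale DCI link, that difference is on the order of 75 microseconds one way, which matters for AI training synchronization and financial trading.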
Optical Module Demand and Shipment Volumes
傅里叶的猫· 2025-09-18 11:15
Core Viewpoint
- Huawei has launched new supernode products, significantly enhancing computing power and interconnect bandwidth and positioning itself as a leader in the AI chip industry [6][7][8]

Group 1: Huawei's New Products
- The Atlas 950 supernode, based on the Ascend 950DT chip, supports 8192 Ascend 950DT chips, achieving total computing power of 8 EFLOPS at FP8 and 16 EFLOPS at FP4, with an interconnect bandwidth of 16 PB/s [7]
- The Atlas 960 supernode, based on the Ascend 960 chip, can support up to 15488 cards, with total computing power of 30 EFLOPS at FP8 and 60 EFLOPS at FP4 and an interconnect bandwidth of 34 PB/s [8]
- The Atlas 950 supernode is set to launch in Q4 2026 and the Atlas 960 in Q4 2027, both significantly outperforming competing products on NVIDIA's roadmap [7][8]

Group 2: Market Demand for Optical Modules
- Demand for optical modules is projected to increase, with 2026 estimates indicating a need for roughly 30-32 million units, driven by major buyers such as Microsoft and NVIDIA [12]
- The 800G optical module market is expected to exceed expectations, particularly due to Microsoft's procurement strategy [12]
- The ratio of GPUs to optical modules varies by company, with NVIDIA at about 1:3-1:4.5 and Google at approximately 1:14, indicating a growing need for optical modules across the industry (a demand sketch follows this summary) [17]

Group 3: Key Suppliers and Market Dynamics
- Major optical module suppliers include 旭创 (Innolight), 菲尼萨 (Finisar), and 新易盛 (Eoptolink), with varying market shares across clients [18]
- For 2026, optimistic demand for 800G and 1.6T optical modules could reach nearly 50 million units, highlighting a potential supply gap [16]
- In the competitive landscape, 旭创 is a dominant supplier for Google, while 新易盛 holds significant share with AWS [18]
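As a rough illustration of how the GPU-to-module ratios above translate into aggregate module demand, here is a minimal sketch; only the ratios come from the summary, while the GPU shipment volumes are hypothetical placeholders, not forecasts.

```python
# Rough optical-module demand from GPU shipments and GPU:module ratios.
# The ratios below are the ones quoted above; the shipment numbers are
# hypothetical placeholders for illustration, not forecasts from the article.
ratios = {
    "NVIDIA-style clusters":  (3.0, 4.5),    # 1 GPU : 3-4.5 modules
    "Google-style (TPU/OCS)": (14.0, 14.0),  # ~1 : 14
}
hypothetical_gpu_shipments = {
    "NVIDIA-style clusters":  6_000_000,     # placeholder
    "Google-style (TPU/OCS)": 1_000_000,     # placeholder
}

total_low = total_high = 0.0
for name, (lo, hi) in ratios.items():
    gpus = hypothetical_gpu_shipments[name]
    total_low += gpus * lo
    total_high += gpus * hi
    rng = f"{gpus * lo / 1e6:.0f}M" if lo == hi else f"{gpus * lo / 1e6:.0f}-{gpus * hi / 1e6:.0f}M"
    print(f"{name}: {rng} modules")

print(f"total: {total_low / 1e6:.0f}-{total_high / 1e6:.0f}M modules")
```

With these placeholder volumes the total lands in the same tens-of-millions range as the 2026 estimates quoted above.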
Google's OCS (Optical Circuit Switch): Technology, Development, Partners, and Value Breakdown
傅里叶的猫· 2025-09-17 14:58
Core Insights
- The article provides an in-depth analysis of Google's Optical Circuit Switch (OCS) technology, its components, and its implications for the industry, highlighting the potential for improved efficiency and reduced latency in data transmission [1]

Group 1: Google's AI Momentum
- Google's AI performance has been impressive, with the launch of Gemini 2.5 Flash Image leading to 23 million new users and over 500 million images generated within a month [2]
- The company has released several multimodal model updates, showcasing its leadership in AI research and development [2]

Group 2: OCS Technology Overview
- OCS technology aims to eliminate multiple optical-electrical conversions in traditional networks, significantly enhancing efficiency and reducing latency [5][6]
- The article discusses the differences between OCS and traditional electrical switches, emphasizing OCS's advantages in low latency and power consumption [14][16]

Group 3: OCS Technical Solutions
- The main OCS technologies include MEMS, DRC, and piezoelectric ceramic solutions, with MEMS being the dominant technology at over 70% of the market [10][12]
- MEMS uses micro-mirrors to dynamically steer light-signal paths, while DRC offers lower power requirements and a longer lifespan but slower switching speeds [10][12]

Group 4: Performance and Application Differences
- OCS is better suited to stable traffic patterns where data paths do not need frequent adjustment, while traditional electrical switches excel in dynamic environments [14][30]
- OCS can achieve approximately 30% cost savings over time thanks to its longevity and lower energy consumption, despite higher initial costs (a TCO sketch follows this summary) [16]

Group 5: Key Components of OCS
- Critical OCS components include laser injection modules and camera modules for real-time calibration, ensuring long-term stability [19][20]
- Micro-lens arrays (MLA) are essential for stabilizing light signals, with demand expected to increase as OCS deployment grows [26][27]

Group 6: CPO vs. OCS
- CPO integrates switching chips and optical modules to reduce latency and power consumption, making it suitable for rapidly changing data flows [29][30]
- OCS, in contrast, is ideal for predictable data flows such as deep-learning model training, where low latency and power efficiency are critical [30]

Group 7: Google's OCS Implementation
- Google employs a "self-design + outsourcing" model for its MEMS chips, ensuring compatibility with its OCS systems and optimizing performance parameters [31]
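The roughly 30% lifetime-cost advantage cited above can be illustrated with a simple total-cost-of-ownership comparison; every number in this sketch is a hypothetical placeholder chosen only to show the shape of the calculation (higher OCS capex offset by lower power draw and a longer service life), and none of them come from the article.

```python
# Hypothetical TCO comparison: OCS vs. electrical switching over a deployment window.
# All numbers are illustrative placeholders, not figures from the article.
YEARS = 9
ENERGY_USD_PER_KWH = 0.10
HOURS_PER_YEAR = 8760

def tco(capex_usd: float, power_w: float, lifetime_years: int) -> float:
    """Total cost over YEARS: hardware replacements plus energy."""
    replacements = -(-YEARS // lifetime_years)  # ceiling division
    energy = power_w / 1000 * HOURS_PER_YEAR * YEARS * ENERGY_USD_PER_KWH
    return capex_usd * replacements + energy

ocs = tco(capex_usd=300_000, power_w=100,  lifetime_years=9)  # pricier, low power, long-lived
eps = tco(capex_usd=150_000, power_w=2000, lifetime_years=3)  # cheaper, power-hungry, replaced sooner

print(f"OCS TCO       : ${ocs:,.0f}")
print(f"Electrical TCO: ${eps:,.0f}")
print(f"savings       : {(1 - ocs / eps) * 100:.0f}%")  # lands in the ~30% ballpark
```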
A New Liquid Cooling Solution for NVIDIA's Rubin?
傅里叶的猫· 2025-09-16 15:57
Core Viewpoint
- The article discusses the recent high interest in NVIDIA's new liquid cooling solution, specifically the microchannel lid, and its implications for the semiconductor industry [2][4]

Group 1: Investment Bank Perspectives
- JP Morgan and Morgan Stanley provided detailed analyses of the microchannel lid, highlighting its heat-dissipation efficiency compared with traditional cooling methods [5]
- The microchannel lid integrates the heat spreader and cold plate, allowing efficient heat transfer and cooling, which becomes crucial as chip power requirements increase [8][11]
- Adopting the microchannel lid could increase the number of quick disconnects (QD) in VR-series compute trays to at least 12, compared with 8 in the existing GB300 compute trays [12]
- In the short term, the impact on liquid cooling suppliers is limited, as a significant portion of NVIDIA's GPU shipments will still use traditional cold plates [13]
- ODMs are currently testing the microchannel lid, with a decision expected within one to two months [14]

Group 2: Industry Perspectives
- The microchannel lid concept was discussed in the industry as early as late August, with market speculation about its potential use in NVIDIA's Rubin GPU [15]
- Jentech, a key supplier of lid products to NVIDIA, is closely tied to NVIDIA's technology iterations and order fluctuations, which can influence its stock performance [16]
- In terms of maturity, single-phase cold plates are significantly ahead, followed by dual-phase cold plates and immersion cooling, with microchannel lids lagging behind [18]
- Cold plate suppliers such as AVC indicated that the microchannel lid may not be adopted until the Rubin Ultra generation, as current production timelines do not support earlier implementation [18]
- Companies are currently sending samples of the microchannel lid, but sample approval does not guarantee immediate procurement [19]
- Key players in the lid and cold plate sectors, such as Jentech and AVC, are conducting advanced research on microchannel lids, but it remains uncertain which company will dominate the market [21]
- Beyond microchannel lids, 3D printing is emerging as a cutting-edge research direction in cooling, offering high precision and customization capabilities [21]