傅里叶的猫
A Look at Today's Mainstream AI Networking Solutions
傅里叶的猫· 2025-06-16 13:04
Core Viewpoint
- The article surveys the evolving landscape of AI networking, highlighting the challenges and opportunities presented by AI workloads, which require fundamentally different networking architectures than traditional applications [2][3][6].

Group 1: AI Networking Challenges
- AI workloads place unique demands on the network, requiring more resources and a different architecture than traditional data center networks, which were not designed for AI's collective communication patterns [2][3].
- Performance requirements for AI training are extreme, with latency budgets measured in microseconds rather than milliseconds, making traditional networking solutions inadequate [5][6].
- AI bandwidth requirements are growing exponentially, creating a mismatch between AI demands and traditional network capabilities and an opening for companies that can adapt [6].

Group 2: Key Players in AI Networking
- NVIDIA's $7 billion acquisition of Mellanox Technologies was a strategic move to strengthen its AI workload infrastructure by integrating high-performance networking capabilities [7][9].
- NVIDIA's AI networking solutions rest on three key innovations: NVLink for GPU-to-GPU communication, InfiniBand for low-latency cluster communication, and SHARP for reducing communication rounds in AI operations [11][12].
- Broadcom's dominance in the Ethernet switch market is challenged by AI workloads' need for lower latency, prompting the development of Jericho3-AI, a solution designed specifically for AI [13][14].

Group 3: Competitive Dynamics
- The competition among NVIDIA, Broadcom, and Arista highlights the tension between performance optimization and operational familiarity, with traditional network solutions struggling to meet the demands of AI workloads [16][24].
- Marvell and Credo Technology play crucial supporting roles in AI networking, Marvell with its DPU designs and Credo with optical signal-processing technologies that could transform the economics of AI networking [17][19].
- Cisco's traditional networking solutions face challenges adapting to AI workloads because of architectural mismatches: their designs prioritize flexibility and security over the low latency AI requires [21][22].

Group 4: Future Disruptions
- Potential disruptions include the transition to optical interconnects, which could relieve the limitations of copper interconnects, and the emergence of alternative AI architectures that may favor different networking solutions [30][31].
- The success of open standards such as UCIe and CXL could enable interoperability among components from different vendors, potentially reshaping the competitive landscape [31].
- Companies must anticipate shifts in AI networking demands to remain competitive, as today's optimizations may become tomorrow's constraints [35][36].
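The microsecond-scale latency requirement in this section comes from iterative collective operations (such as all-reduce), which synchronize gradients on every training step and pay the fabric's per-hop latency many times per step. A minimal cost-model sketch in Python, using the standard ring all-reduce formula; the node count, bandwidths, and latencies below are illustrative assumptions, not measurements or vendor specs:

```python
def ring_allreduce_time(p, nbytes, bw_bytes_per_s, latency_s):
    """Standard ring all-reduce cost model: 2*(p-1) communication steps,
    each moving nbytes/p per link and paying one hop of latency."""
    steps = 2 * (p - 1)
    return steps * latency_s + steps * (nbytes / p) / bw_bytes_per_s

GIB = 1024 ** 3
# Illustrative, assumed numbers: an accelerator-class fabric vs. a
# commodity Ethernet fabric, each syncing 1 GiB of gradients across 8 nodes.
t_fast = ring_allreduce_time(p=8, nbytes=1 * GIB, bw_bytes_per_s=400e9, latency_s=2e-6)
t_slow = ring_allreduce_time(p=8, nbytes=1 * GIB, bw_bytes_per_s=12.5e9, latency_s=50e-6)
print(f"fast fabric: {t_fast * 1e3:.1f} ms/step, slow fabric: {t_slow * 1e3:.1f} ms/step")
```

Because the latency term is multiplied by 2(p-1) on every step, millisecond-class fabrics fall hopelessly behind as cluster size grows, which is the mismatch the article describes.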
HBM Demand Analysis for Major Overseas Players
傅里叶的猫· 2025-06-15 15:50
Core Viewpoint
- The article projects strong growth in HBM (High Bandwidth Memory) consumption, driven chiefly by major players such as NVIDIA, AMD, Google, and AWS, and highlights rising demand from AI applications and an evolving product landscape.

Group 1: HBM Consumption Projections
- In 2024, overall HBM consumption is expected to reach 6.47 billion Gb, a year-on-year increase of 237.2%, with NVIDIA's and AMD's GPUs accounting for 62% and 9% of consumption, respectively [1]
- By 2025, total HBM consumption is projected to rise to 16.97 billion Gb, year-on-year growth of 162.2%, with NVIDIA, AMD, Google, AWS, and others contributing 70%, 7%, 10%, 8%, and 5%, respectively [1]

Group 2: NVIDIA's HBM Demand
- NVIDIA's HBM demand for 2024 is estimated at 6.47 billion Gb, with a recent adjustment bringing the total capacity to 6.55 billion Gb [2]
- In 2025, NVIDIA's HBM demand is expected to decrease to 2.53 billion Gb, with HBM3e 8hi and 12hi versions making up 36% and 64% of the demand, respectively [2]
- Key suppliers for NVIDIA include Samsung and SK hynix, which play crucial roles in the HBM supply chain [2]

Group 3: AMD's HBM Demand
- AMD's HBM demand for 2025 is projected at 0.20 billion Gb for the MI300 series and 0.37 billion Gb for the higher-end MI350 series [3]
- Specific models such as the MI300X and MI325 are designed for higher memory density, with capacities of 192GB and 288GB, respectively [3]
- AMD relies on SK hynix and Samsung for the HBM3e 8hi and 12hi versions that are vital to its production plans [3]

Group 4: Google and AWS HBM Demand
- Google's HBM demand for 2025 is expected to be 0.41 billion Gb, primarily driven by TPU v5 and v6 training needs [4]
- AWS's HBM demand is estimated at 0.28 billion Gb, with the Trainium v2 and v3 versions accounting for 0.20 billion Gb and 0.08 billion Gb, respectively [6]
- Both companies use HBM configurations that strengthen their AI training and inference capabilities, with a focus on reducing reliance on external suppliers [5][6]

Group 5: Intel's HBM Demand
- Intel's HBM demand is relatively small, accounting for about 10% of total demand in 2025, and focuses primarily on HBM3e versions [7]
- Key suppliers for Intel include SK hynix and Micron, with Intel exploring in-house chip development to reduce supply chain dependencies [7]
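As a quick consistency check on the consumption figures in this section (an arithmetic sketch over the reported numbers only, introducing no new data):

```python
# The reported 2024 HBM total and the reported 2025 growth rate should
# reproduce the reported 2025 total (all figures from the summary above).
base_2024 = 6.47            # billion Gb, 2024 consumption
growth_2025 = 1.622         # +162.2% year-on-year
total_2025 = base_2024 * (1 + growth_2025)

# The reported 2025 vendor shares should sum to 100%.
shares_2025 = {"NVIDIA": 0.70, "AMD": 0.07, "Google": 0.10, "AWS": 0.08, "Others": 0.05}

print(f"implied 2025 total: {total_2025:.2f} billion Gb")  # close to the reported 16.97
print(f"vendor share sum: {sum(shares_2025.values()):.2f}")
```

Both checks hold, so the headline total, growth rate, and share breakdown are internally consistent, though the separate NVIDIA-specific figures quoted later do not reconcile as cleanly.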
A Look at the 910D and 920
傅里叶的猫· 2025-06-14 13:11
Core Insights
- The article discusses the anticipated performance and market release timelines for the 910D and 920 AI chips, highlighting their architectural improvements and expected advantages over previous models [1].

Group 1: 910D Overview
- The 910D is designed with four dies, up from two in the 910C, lifting its performance above that of the H100 [1].
- The expected market release for the 910D is optimistic, with potential availability by Q2 2024 and no later than Q2 2026 [1].
- The 910D is expected to support certain training workloads, although its cost-effectiveness may decline for models above 400 billion parameters because of its weaker software ecosystem [1].

Group 2: 920 Overview
- The 920 will also ship in multiple versions; the first adopts a dual-die design and optimized processes, transitioning to a GPGPU architecture for better compatibility with NVIDIA's ecosystem [1].
- The first batch of 920 chips is expected by the end of 2025, with larger-scale production anticipated around mid-2027 [1].
Spec Analysis of Nvidia's China-Specific B20/B40
傅里叶的猫· 2025-06-14 13:11
Core Viewpoint
- Nvidia CEO Jensen Huang has indicated that future forecasts will exclude the Chinese market, yet China remains critical to Nvidia, as evidenced by his emphasis on Huawei as a competitive threat [3]

Group 1: Nvidia's Strategy in China
- Nvidia is developing a new generation of chips for the Chinese market based on the GB202 GPU architecture, with plans to launch the new processors as early as July 2025 [3]
- The lineup will include two models, referred to as the B20 and B40/B30, which may be marketed as variants of the RTX 6000 series to obscure their Blackwell lineage [4]
- Recent U.S. export controls restrict memory bandwidth and interconnect speed, leading the new chips to use GDDR memory instead of HBM [4]

Group 2: Chip Specifications
- The B20 will use Nvidia's ConnectX-8 for interconnect, optimized for small clusters of 8 to 16 cards and aimed primarily at inference [6]
- The B30/B40 models will support NVLink interconnect, but at reduced speeds compared with standard specifications, with expected bandwidth similar to the H20's 900Gbps [7]
- Memory configurations are anticipated to be 24GB, 36GB, or 48GB, with the 48GB option considered the most likely [8]

Group 3: Market Demand and Pricing
- The new chips are expected to be priced between $6,500 and $8,000, well below the H20's range of $10,000 to $12,000, which may sustain customer demand [9]
- Full server configurations built on the new chips are estimated at $80,000 to $100,000, depending on connectivity options [9]

Group 4: Customer Interest and Market Dynamics
- Major Chinese tech companies have shown varying interest in the new models: Tencent favors the B20 for its cost-effectiveness in inference, while ByteDance is more interested in the B30 and B40 to fill the demand left by the H20's discontinuation [10][11]
- Alibaba has not specified a preference for particular models but indicates strong overall demand for the chips [11]

Group 5: Current Situation and Challenges
- The true test for Nvidia will come once major Chinese customers receive test cards, as evaluation typically takes about a month before large orders are placed [12]
- Despite Huang's comments, the Chinese market remains a vital revenue source for Nvidia, and competitors such as Huawei continue to advance their own R&D [12]
The Rare Earth Card: How Much Longer Can China Play It?
傅里叶的猫· 2025-06-12 16:10
Core Viewpoint
- The article discusses the strategic importance of rare earth elements in US-China relations, emphasizing that China's control over rare earth resources gives it significant leverage in negotiations with the US and Europe [1][2][10].

Historical Background and Policy Evolution
- Rare earth elements, despite the "rare" label, are not particularly scarce in the Earth's crust; some, such as cerium, are roughly as abundant as copper [2].
- China came to dominate the global rare earth market in the late 20th century, capturing about 97% market share through low-cost exports, at the price of serious environmental damage [2][3].
- Since 1998, China has gradually tightened control over rare earth exports through quota systems to stabilize prices and reduce environmental harm [2][3].

Current Control Mechanism
- China's rare earth industry is strictly regulated through a quota system, adjusted annually against global demand to maintain supply-demand balance and price stability [5].
- The state-owned enterprises China Rare Earth (Southern) and Northern Rare Earth control the entire production chain from mining to finished product [5].

Export Restrictions and Strategic Impact
- Starting in April 2025, China implemented a licensing system for exports of seven heavy rare earth elements, which are crucial for both civilian and military applications [6].
- The approval process for export licenses takes about 45 working days, risking production disruptions for overseas manufacturers [6][10].
- China still allows exports of finished products containing rare earth magnets, giving foreign manufacturers an indirect route to rare earth materials [6].

Challenges in Replacing Chinese Supply
- Despite the abundance of rare earth elements, extraction and processing are environmentally demanding, leading many countries to avoid developing their own resources [8][9].
- New mining projects typically take 3 to 5 years to develop, with significant delays in regions with strict environmental regulations [8].
- China currently holds about 90% market share in rare earth magnets, making it difficult for other countries to compete given their technological and scale disadvantages [9].

Long-term Strategic Implications
- China's control over rare earth resources provides strong geopolitical leverage, but that leverage may erode over time as other countries invest in alternative supply chains [10][11].
- Ongoing supply disruptions have significantly affected industries reliant on rare earth elements, underscoring their critical role in modern technology [11].
- The article contrasts China's strategic management of rare earths with the US's difficulties in regulating high-end chip exports, suggesting China's approach may offer more effective control [11].
Full-Year Ascend 910 Series Shipment Forecast Revised Down
傅里叶的猫· 2025-06-11 11:31
As U.S. restrictions on GPUs keep tightening, there is intense interest in the performance and shipment volumes of domestic GPUs, especially Huawei's Ascend series. Jensen Huang recently claimed that Huawei's chip performance has already surpassed the H200 and that Huawei's CloudMatrix cloud architecture has overtaken Nvidia's. There is surely some exaggeration in that: Huang has long been unhappy with the restrictions imposed by the U.S. government, which have cost Nvidia many large Chinese customers while giving domestic GPUs a window for rapid progress.

Because Huawei has never published shipment figures for the Ascend series, we can only rely on third-party research; the Ascend 910 shipment data in this article is drawn from a recent research memo from 本营.

Domestic GPU procurement

Readers of this article have likely seen the capital-expenditure figures of the major CSPs: ByteDance expects to invest 160 billion yuan in 2025, Alibaba has announced 380 billion yuan over the next three years for cloud and AI hardware infrastructure, and Tencent plans about 90 billion yuan for 2025. According to a survey by the public account "AI半导体专研", after the H20 ban the major domestic players' attitudes toward, and procurement plans for, domestic GPU cards are as follows:

ByteDance is the most active, mainly adopting Cambricon and Ascend while broadly testing other domestic brands, and plans to increase purchases of these cards in 2025.

Alibaba is more cautious: besides its in-house T-Head (平头哥) chips, it has brought some Ascend cards online, though their performance has been underwhelming; in 2025 it expects to purchase Hygon and Cambricon cards for the first time while continuing ...
Top Foreign Investment Bank Research Reports
傅里叶的猫· 2025-06-11 11:31
For readers who want access to foreign research reports, we recommend a Knowledge Planet (知识星球) community that uploads several hundred original reports from top foreign investment banks every day: Morgan Stanley, J.P. Morgan, UBS, Goldman Sachs, Jefferies, HSBC, Citi, Barclays, and others. It also carries the complete set of reports from SemiAnalysis, which specializes in semiconductor industry analysis, along with daily selections of paid articles from Seeking Alpha, Substack, and Stratechery. After applying a coupon, membership costs just 390 yuan for access to hundreds of top foreign bank technology-sector reports plus curated daily picks, worthwhile whether for your own investing or for deeper industry research. ...
Morgan Stanley: Nvidia NVL72 Shipments
傅里叶的猫· 2025-06-10 14:13
Core Viewpoint
- Morgan Stanley's report highlights a significant increase in global production of GB200 NVL72 racks, driven by surging demand for AI computing, particularly in the cloud computing and data center sectors [1][2].

Group 1: Production Forecast
- Global production of GB200 NVL72 racks is estimated to reach 2,000 to 2,500 units in May 2025, a notable increase from the April estimate of 1,000 to 1,500 units [1].
- Overall production for the second quarter is expected to reach 5,000 to 6,000 units, indicating a robust supply chain response to market demand [1].

Group 2: Company Performance
- Quanta shipped approximately 400 GB200 racks in May, up slightly from 300 to 400 units in April, with monthly revenue of about NT$160 billion, a year-on-year increase of 58% [2].
- Wistron showed a strong growth trajectory, shipping around 900 to 1,000 GB200 compute trays in May, nearly a sixfold increase from 150 units in April, with revenue up 162% to NT$208.406 billion [2].
- Hon Hai shipped nearly 1,000 GB200 racks in May and is forecast to deliver 3,000 to 4,000 racks in the second quarter, despite some weakness in its cloud and networking business from slowing traditional server shipments [2].

Group 3: Market Dynamics
- Actual delivery volumes of GB200 racks may run below reported shipments, because Wistron's L10 compute trays still need to be assembled into complete L11 racks, which adds testing and integration time [3].
- Morgan Stanley ranks its preference among downstream AI server makers as Giga-Byte, Hon Hai, Quanta, Wistron, and Wiwynn, favoring Giga-Byte for its potential in GPU demand and the server market [3].
- A report from Tianfeng Securities indicates that major hyperscale cloud providers are deploying nearly 1,000 NVL72 cabinets weekly, with the shipment pace continuing to accelerate [3].
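To put the rack counts above in GPU terms: each GB200 NVL72 rack integrates 72 Blackwell GPUs, so the production estimates convert directly (a simple arithmetic sketch over the figures quoted above):

```python
GPUS_PER_RACK = 72  # a GB200 NVL72 rack contains 72 Blackwell GPUs

may_racks = (2000, 2500)   # May 2025 production estimate (low, high)
q2_racks = (5000, 6000)    # second-quarter estimate (low, high)

may_gpus = tuple(r * GPUS_PER_RACK for r in may_racks)
q2_gpus = tuple(r * GPUS_PER_RACK for r in q2_racks)
print(f"May: {may_gpus[0]:,} to {may_gpus[1]:,} GPUs")
print(f"Q2:  {q2_gpus[0]:,} to {q2_gpus[1]:,} GPUs")
```

That works out to 144,000 to 180,000 GPUs for May alone, which illustrates why rack-level shipment estimates have become the headline metric for AI supply-chain tracking.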
The Compute Leasing Industry from CoreWeave's Perspective
傅里叶的猫· 2025-06-09 13:40
Core Viewpoints
- The article discusses the rapid growth and potential of the computing power leasing industry through the lens of CoreWeave, a significant player in the sector [2][11].

Company Overview
- CoreWeave was established in 2017, originally as a cryptocurrency mining company, and has since pivoted to AI cloud and infrastructure services, operating 32 data centers by the end of 2024 [2][3].
- The company has deployed over 250,000 GPUs, primarily NVIDIA products, and is a key provider of high-performance infrastructure services [2][3].

Business Model
- CoreWeave offers three main services: bare-metal GPU leasing, management software services, and application services, with GPU leasing as the core offering [3][4].
- Revenue comes primarily from two models: committed contracts (96% of revenue) and on-demand payment, which gives clients flexibility [4][5].

Financial Performance
- In 2024, CoreWeave's revenue reached $1.915 billion, a year-over-year increase of more than sevenfold, with Q1 2025 revenue of $982 million, a roughly fourfold increase [8][9].
- The company has remaining performance obligations of $15.1 billion, indicating strong future revenue potential [8].

Competitive Advantages
- CoreWeave has optimized GPU utilization and efficiency, achieving significant performance gains in AI training and inference workloads [7].
- The company maintains a close relationship with NVIDIA, ensuring priority access to cutting-edge chips and technology [6][7].

Market Outlook
- The AI infrastructure market is projected to grow from $79 billion in 2023 to $399 billion by 2028, a compound annual growth rate of 38%, highlighting the industry's potential [11].
- The computing power leasing sector is expected to play a crucial role in the digital economy, driven by rising demand for AI capabilities [11][14].

Future Growth Strategies
- CoreWeave plans to expand its customer base, enter new industries, and deepen vertical integration through strategic partnerships [10].
- Management aims to leverage existing contracts and maintain a low-leverage asset structure to support growth [10].
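The market forecast cited above can be sanity-checked by backing out the implied compound annual growth rate (arithmetic only, using the quoted figures):

```python
# $79B (2023) growing to $399B (2028) spans five years.
start, end, years = 79e9, 399e9, 5
cagr = (end / start) ** (1 / years) - 1
print(f"implied CAGR: {cagr:.0%}")  # rounds to 38%, matching the cited figure
```

The implied rate comes out at roughly 38% per year, so the endpoints and the stated CAGR are mutually consistent.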