Ten-Thousand-Card Clusters
Taking Aim at NVIDIA: The Domestic Computing Industry Moves Toward a "Closed Loop"
36Ke · 2026-01-09 12:39
At the turn of the year, capital operations in China's computing-power industry accelerated sharply. On January 8, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. ("Iluvatar CoreX", 09903.HK) listed on the Hong Kong Stock Exchange; its public offering was oversubscribed more than 400 times, a clear sign of capital-market enthusiasm. Echoing the GPU vendors' secondary-market boom, domestic memory-chip makers also completed a key leap at the end of 2025. On December 30, 2025, ChangXin Technology Group Co., Ltd. ("CXMT") formally filed its STAR Market prospectus with the Shanghai Stock Exchange, disclosing revenue of RMB 32.084 billion for the first three quarters of 2025, a figure that directly illustrates the scale of domestic DRAM (dynamic random-access memory) capacity coming online. Earlier still, on September 25, 2025, Yangtze Memory Technologies Holding Co., Ltd. ("YMTC Group") completed its joint-stock reform; its valuation of RMB 160 billion briefly set a record for semiconductor unicorns and marked a new stage of development for this NAND Flash leader. From chip design to memory dies, from the STAR Market to the Hong Kong Stock Exchange, wave after wave of capital enthusiasm has indirectly signaled the rapid progress of the domestic computing-power industry. Beyond the heat in capital markets, however, domestic chips still face complex tests in actual intelligent-computing-center construction and application adaptation. Each Playing Its Part: Shortly before this, two other leading domestic GPU companies had listed on the STAR Market in quick succession: on December 5, 2025, Moore Threads ...
Taking Aim at NVIDIA! The Domestic Computing Industry Moves Toward a "Closed Loop"
Jing Ji Guan Cha Bao· 2026-01-09 10:28
From chip design to memory dies, from the STAR Market to the Hong Kong Stock Exchange, wave after wave of capital enthusiasm has indirectly signaled the rapid progress of the domestic computing-power industry. Beyond the heat in capital markets, however, domestic chips still face complex tests in actual intelligent-computing-center construction and application adaptation. By: Zheng Chenye. Cover image: Tuchong Creative. At the turn of the year, capital operations in China's computing-power industry accelerated sharply. On January 8, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. ("Iluvatar CoreX", 09903.HK) listed on the Hong Kong Stock Exchange; its public offering was oversubscribed more than 400 times, a clear sign of capital-market enthusiasm. Shortly before this, two other leading domestic GPU companies had listed on the STAR Market in quick succession: on December 5, 2025, Moore Threads (688795.SH) rose as much as 468.78% on its first trading day, with total market capitalization exceeding RMB 305.5 billion; on December 17, 2025, MetaX (688802.SH) gained 692.95% on debut, with market capitalization topping RMB 330 billion. Echoing the GPU vendors' secondary-market boom, domestic memory-chip makers also completed a key leap at the end of 2025. On December 30, 2025, ChangXin Technology Group Co., Ltd. ("CXMT") formally filed its STAR Market prospectus with the Shanghai Stock Exchange, disclosing revenue of RMB 32.084 bil ...
The Domestic Computing Industry Moves Toward a "Closed Loop"
Jing Ji Guan Cha Wang· 2026-01-09 08:41
Economic Observer reporter Zheng Chenye. At the turn of the year, capital operations in China's computing-power industry accelerated sharply. On January 8, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. ("Iluvatar CoreX", 09903.HK) listed on the Hong Kong Stock Exchange; its public offering was oversubscribed more than 400 times, a clear sign of capital-market enthusiasm. Shortly before this, two other leading domestic GPU companies had listed on the STAR Market in quick succession: on December 5, 2025, Moore Threads (688795.SH) rose as much as 468.78% on its first trading day, with total market capitalization exceeding RMB 305.5 billion; on December 17, 2025, MetaX (688802.SH) gained 692.95% on debut, with market capitalization topping RMB 330 billion. Echoing the GPU vendors' secondary-market boom, domestic memory-chip makers also completed a key leap at the end of 2025. On December 30, 2025, ChangXin Technology Group Co., Ltd. ("CXMT") formally filed its STAR Market prospectus with the Shanghai Stock Exchange, disclosing revenue of RMB 32.084 billion for the first three quarters of 2025, a figure that directly illustrates the scale of domestic DRAM (dynamic random-access memory) capacity coming online. Earlier still, on September 25, 2025, Yangtze Memory Technologies Holding Co., Ltd. ("YMTC Group") completed its joint-stock reform; its RMB 160 billion valuation briefly set a record for semiconductor unicorns and marked a new stage for this NAND Flash ...
Domestic Computing Enters the "Ten-Thousand-Card" Era: Moore Threads Releases Next-Generation GPU Architecture, Sugon Releases Ten-Thousand-Card Supercluster
Jing Ji Guan Cha Wang· 2025-12-20 06:47
Core Insights - The article discusses advances in the domestic GPU industry, highlighting the launch of the "Huagang" architecture by Moore Threads and the "scaleX" supercluster system by Sugon, indicating a shift in focus from individual GPU performance to building scalable systems capable of handling massive computational tasks [2][6]. Group 1: Moore Threads Developments - Moore Threads unveiled its latest "Huagang" architecture, which delivers a 50% increase in computing density and a 10-fold improvement in efficiency over the previous generation [3]. - The "Huagang" architecture supports full-precision computation from FP4 to FP64 and adds support for MTFP6, MTFP4, and mixed low precision [3]. - Future chip plans include "Huashan," aimed at AI training and inference, and "Lushan," focused on high-performance graphics rendering, with "Lushan" showing a 64-fold increase in AI computing performance and a 50% improvement in ray-tracing performance [4]. Group 2: Sugon Developments - Sugon's "scaleX" supercluster system, making its public debut, consists of 16 scaleX640 supernodes interconnected via the scaleFabric high-speed network and can deploy 10,240 AI accelerator cards [10]. - The scaleX system employs immersion phase-change liquid cooling to address heat dissipation, achieving a 20-fold increase in computing density per rack and a PUE (Power Usage Effectiveness) of 1.04 [11][12]. - The system supports multi-brand accelerator cards and has optimized compatibility with over 400 mainstream large models, reflecting a strategy of providing a versatile platform for diverse domestic computing resources [14]. Group 3: Industry Challenges and Solutions - The industry faces challenges in scaling up computational power, particularly in managing heat, power supply, and physical space when deploying thousands of high-power chips in data centers [8][9].
- Both companies are addressing communication delays in distributed computing: Moore Threads is integrating a new asynchronous programming model and self-developed MTLink technology to support clusters exceeding 100,000 cards, while Sugon's scaleFabric network achieves 400 Gb/s bandwidth and sub-microsecond communication latency [12][13]. Group 4: Software Ecosystem and Compatibility - As hardware specifications approach international standards, the focus is shifting to optimizing the software stack, with Moore Threads announcing an upgrade to its MUSA unified architecture and achieving over 98% efficiency in core computing libraries [13]. - Sugon emphasizes its systems' compatibility with accelerator cards from multiple brands, promoting an open-architecture strategy that allows multiple chips to coexist [14].
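The interconnect figures quoted above (400 Gb/s links, sub-microsecond latency, clusters beyond 100,000 cards) can be put in perspective with the standard ring all-reduce cost model. This is a back-of-envelope sketch, not vendor code: the 640-card group size matches the supernode, but the 10 GB gradient payload and 1-microsecond per-hop latency are illustrative assumptions.

```python
def ring_allreduce_time(n_gpus: int, payload_bytes: float,
                        link_gbps: float, latency_s: float) -> float:
    """Classic ring all-reduce: 2*(n-1) steps, each moving payload/n bytes."""
    steps = 2 * (n_gpus - 1)
    bytes_per_step = payload_bytes / n_gpus
    bytes_per_sec = link_gbps * 1e9 / 8   # link bandwidth in bytes/second
    return steps * (latency_s + bytes_per_step / bytes_per_sec)

# Illustrative: syncing 10 GB of gradients across one 640-card supernode
# over 400 Gb/s links with ~1 microsecond per-hop latency (assumed values).
t = ring_allreduce_time(n_gpus=640, payload_bytes=10e9,
                        link_gbps=400, latency_s=1e-6)
print(f"estimated all-reduce time: {t * 1e3:.0f} ms")
```

At this scale the transfer term (roughly 2 × payload / bandwidth) dominates; per-hop latency matters mainly for small messages, which is why sub-microsecond hops chiefly benefit small collectives and inference-style traffic.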
Supernode Interconnect Technology Lands: Domestic Ten-Thousand-Card Supercluster Makes Its First Physical Debut
21 Shi Ji Jing Ji Bao Dao· 2025-12-19 13:32
Core Insights - The article discusses the emergence of high-performance computing clusters, specifically the scaleX ultra-cluster developed by Sugon, which integrates 16 scaleX640 supernodes to deliver over 5 EFLOPS of computing power, marking a significant advance in domestic AI computing infrastructure [4][5]. Group 1: Ultra-Cluster Development - At the heart of the cluster, the scaleX640 is the world's first single-cabinet 640-card supernode, using technologies such as high-density blade servers and immersion cooling to achieve a 20-fold increase in computing density and a PUE as low as 1.04 [1][4]. - The scaleX ultra-cluster represents a shift from traditional scattered server deployments to a more integrated, efficient computing unit, showing domestic computing infrastructure progressing from conceptual designs to tangible products [1][5]. Group 2: Demand for Computing Power - As mainstream AI models move from hundreds of billions to trillions of parameters, demand for computing power has surged, making EFLOPS-level, ten-thousand-card high-performance clusters the standard configuration for large models [2][3]. - The supernode architecture is becoming the preferred choice for new ten-thousand-card clusters thanks to its density and performance advantages, allowing significant optimization of computing capabilities [3]. Group 3: Networking and Scalability - The scaleX ultra-cluster employs the scaleFabric high-speed network, built on the first domestic 400G-class InfiniBand RDMA network cards, achieving 400 Gb/s bandwidth and sub-1-microsecond communication latency while scaling to over 100,000 cards [7]. - The architecture supports both scale-up (vertical expansion) and scale-out (horizontal expansion), addressing traditional communication bottlenecks and enabling the construction of large-scale intelligent computing clusters [6].
Group 4: Challenges and Considerations - The deployment of supernodes introduces systemic challenges, including heat dissipation from numerous chips, stability issues from mixed optical and copper interconnects, and reliability concerns from long-term operation of multiple components [8]. - As the scale of intelligent computing clusters expands, key challenges include ensuring scalability, reliability, and energy efficiency, necessitating breakthroughs in power supply technology and advanced software management for sustainable operation [8].
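As a quick consistency check on the figures above (16 × 640 cards, more than 5 EFLOPS aggregate, PUE of 1.04), the arithmetic can be sketched as follows. The per-card throughput derived here is an inference from the quoted totals, not a published per-card specification, and the facility/IT power values in the PUE example are purely illustrative.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT power (>= 1.0)."""
    return total_facility_kw / it_load_kw

cards = 16 * 640                 # 16 scaleX640 supernodes
cluster_pflops = 5_000           # 5 EFLOPS, expressed in PFLOPS
per_card_tflops = cluster_pflops * 1e3 / cards
print(f"{cards} cards -> ~{per_card_tflops:.0f} TFLOPS per card implied")

# A PUE of 1.04 means roughly 4% of facility power goes to everything
# other than the IT load (cooling, power conversion, distribution):
print(f"PUE at 10,400 kW facility / 10,000 kW IT load: {pue(10_400, 10_000):.2f}")
```

Note the implied per-card figure assumes the 5 EFLOPS is quoted at a single, uniform precision; mixed-precision accounting would change the number.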
Supernode Interconnect Technology Lands: Domestic Ten-Thousand-Card Supercluster Makes Its First Physical Debut
21 Shi Ji Jing Ji Bao Dao· 2025-12-19 13:24
Core Insights - The launch of the scaleX ten-thousand-card supercluster marks the first physical appearance of a domestic ten-thousand-card-class AI cluster system in China, showcasing significant advances in AI computing capability [1][3] - The scaleX640 supernode, the building block of the scaleX supercluster, is deployed in groups of 16 to deliver total computing power exceeding 5 EFLOPS, highlighting the growing demand for high-performance computing in AI applications [3][5] - The industry is transitioning from traditional server architectures to supernode designs, which offer higher density and performance and are becoming the preferred architecture for new ten-thousand-card-class clusters [2][5] Company Developments - Sugon's scaleX640 supernode is recognized as the world's first single-cabinet-level 640-card supernode, underscoring the company's leadership in high-density computing solutions [2][3] - The scaleX supercluster uses the scaleFabric high-speed network, which achieves 400 Gb/s bandwidth and less than 1 microsecond of communication latency, significantly improving inter-node communication efficiency [7][8] - The company is addressing challenges in system cooling, stability, and reliability as it scales its supernode architecture to meet growing AI workloads [6][8] Industry Trends - Demand for computing power is rising rapidly as AI models evolve from hundreds of billions to trillions of parameters, driving development of ten-thousand-card-class and larger computing clusters [1][5] - Major international players such as Meta, Microsoft, and OpenAI are also investing in 100,000-card clusters, indicating a global trend toward larger-scale AI computing infrastructure [6] - The industry faces critical challenges in scalability, reliability, and energy efficiency as computing centers grow from megawatt to gigawatt scale, requiring innovative power-supply technologies and advanced management software [8]
Google Materials from a TPU Foundry Perspective
2025-12-01 00:49
Summary of Google Materials Conference Call Company and Industry Overview - The conference call focuses on Google and its developments in self-developed AI chip (TPU) manufacturing, particularly in relation to its data centers [1][2][3]. Key Points and Arguments Google's Data Center Efficiency - Google reduced Power Usage Effectiveness (PUE) by approximately 25% from 2020 to 2024 by optimizing power and thermal management through special IP [1][2]. - The company plans to introduce High Voltage Direct Current (HVDC) as a secondary power source in its data centers starting in 2026 [8]. Chip Development and Supply Chain - Google has partnered with MediaTek to design its self-developed chips, with the GPT-8 billion chip expected to launch in November 2026 [1][2]. - Major suppliers for Google's chips include Broadcom and MediaTek, with the potential for additional suppliers in the future [1][2]. - From January 2024, Flex will join Google's manufacturing chain, with market share split 80% for Google and 20% for Flex [3]. Changes in Supplier Dynamics - In the PCB supply chain, Google switched back to Huadian from its previous largest supplier; current shares are 70% for Huadian, 20% for Fangzheng, and 10% for TTM [4]. - The optical module supply chain remains dominated by Xuchuang, while New Yisheng holds less than 10% [4]. Cost Reduction Strategies - Google plans to switch to a combination of Active Optical Cables (AOC) and LPO in its switching components starting in 2026 to reduce costs, which will alter the existing supplier structure [4]. - The company is moving from traditional AEC cables to AOC cables, with major suppliers being Changxing Bochuang domestically and Finisar internationally [4]. Liquid Cooling Solutions - Liquid cooling solutions are becoming increasingly important in AI accelerator manufacturing, especially given leakage issues in NVIDIA's ecosystem [5][6].
- Google is implementing stricter standards for new suppliers to ensure reliability in liquid cooling systems [6]. Performance and Cost Comparison with NVIDIA - Google’s current performance is approximately 90% to 93% of NVIDIA's, allowing for a Total Cost of Ownership (TCO) reduction of about 44% [10]. - Investment costs for Google are estimated to be 40% to 45% lower than NVIDIA's, attributed to different product design philosophies [10]. Future Plans and Market Positioning - Google plans to commercialize its TPU hardware by 2026, with a gradual transition to a leasing model for its ecosystem [11]. - The company emphasizes a distributed, cloud-based, and virtualized design for its data centers, contrasting with NVIDIA's focus on centralized computing [11]. Supply Chain Management - Google employs a direct procurement model, minimizing costs by eliminating intermediaries, which allows for competitive pricing [16]. - The company’s strategy focuses on long-term revenue through cloud services rather than short-term profits from new product launches [16]. Challenges and Competitor Landscape - NVIDIA faces challenges in adapting to distributed deployments across multiple data centers, which may limit its market share in the cloud computing sector [22]. - Google’s self-developed chips are not significantly hindered by competitors using its hardware, as performance optimization requires software alignment with Google’s systems [25][26]. Additional Important Insights - Google is exploring partnerships with Intel to address chip supply issues using EMIB technology [21]. - The company anticipates producing 6.5 million chips in 2026, with a 30% increase planned for 2027, although actual production may fall short due to technological constraints [23].
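The performance and cost claims above reduce to a simple cost-per-performance calculation. The sketch below uses midpoints of the quoted ranges as assumed inputs (about 91.5% of NVIDIA's performance, about 42.5% lower investment cost); note that capital cost alone yields roughly a 37% saving per unit of performance, so the quoted 44% TCO reduction presumably also reflects operating costs such as power.

```python
def cost_per_perf(relative_cost: float, relative_perf: float) -> float:
    """Cost normalized by delivered performance (baseline = 1.0 / 1.0 = 1.0)."""
    return relative_cost / relative_perf

baseline = cost_per_perf(1.000, 1.000)   # NVIDIA as the reference point
google = cost_per_perf(0.575, 0.915)     # assumed midpoints of quoted ranges
saving = 1 - google / baseline
print(f"implied capex-per-performance saving: {saving:.0%}")
```

The gap between this capex-only figure and the quoted 44% is exactly the kind of term a full TCO model would fill in with energy, cooling, and depreciation assumptions.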
Surpassing NVIDIA in Scale: Huawei Officially Announces "World's Strongest Supernode" and New Ten-Thousand-Card Compute Products
Xuan Gu Bao· 2025-09-18 23:18
Group 1 - Huawei announced the launch of the Atlas 950 SuperPoD with a computing scale of 8,192 cards, expected to be released in Q4 2025, and the Atlas 960 SuperPoD with 15,488 cards, expected in Q4 2027 [1] - The Atlas 950 SuperPoD's scale is 56.8 times that of NVIDIA's NVL144, with 6.7 times the total computing power, 15 times the memory capacity at 1,152 TB, and 62 times the interconnect bandwidth at 16.3 PB/s [1] - Huawei has planned multiple Ascend chips for the next three years, including the 950PR and 950DT, with the 950PR set to launch in Q1 2026 and the 950DT in Q4 2026 [1] Group 2 - The "ten-thousand-card cluster" is seen as a key battleground in the current large-model competition, with companies such as Baidu, Alibaba, and Tencent already developing solutions for managing such clusters [2] - Guotai Junan Securities believes that domestic computing supernodes based on open architectures can unify the domestic chip ecosystem and enhance cluster performance [2] - The HBM market is projected to reach $46 billion by 2026 and $98 billion by 2030, a compound annual growth rate of 33% from 2024 to 2030 [2] Group 3 - Advanced Communication showcased its Ascend-based Atlas 800I A2 large-model all-in-one machine at the World Artificial Intelligence Conference in Shanghai, designed specifically for generative large-model scenarios [3] - Saiteng Co. has already delivered HBM equipment in bulk [4]
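The HBM market figures quoted above can be cross-checked with the standard compound-growth formula. The dollar values and the 33% CAGR are the article's claims; the code below only applies the definition, and the implied 2024 base it derives is an inference, not a reported figure.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1

# If the market compounds at 33%/yr from 2024 and hits $98B in 2030,
# the implied 2024 base is:
implied_2024 = 98 / (1.33 ** 6)
print(f"implied 2024 HBM market: ${implied_2024:.1f}B")

# Growth rate implied by the two quoted data points ($46B 2026, $98B 2030):
print(f"2026->2030 CAGR: {cagr(46, 98, 4):.1%}")
```

The 2026-to-2030 rate (about 21%) is lower than the headline 33% from 2024, which is internally consistent only if most of the growth is front-loaded into 2024-2026, as the jump from the implied ~$18B base to $46B suggests.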