Sugon (Shuguang) scaleX Ten-Thousand-Card Supercluster System
Ten-Thousand-Card Cluster: Zhengzhou Races Ahead! National Supercomputing Internet Core Node Enters Trial Operation
Sou Hu Cai Jing· 2026-02-06 12:06
Reposted from Zhitong Finance. On February 5, the core node of the National Supercomputing Internet entered trial operation. Its computing resources are supplied by the Sugon scaleX ten-thousand-card supercluster system, which can provide more than 30,000 cards of domestic AI computing power externally. It is the largest single domestic AI computing resource pool connected to the National Supercomputing Internet platform since the platform went live, and can serve large-scale AI computing scenarios such as trillion-parameter model training, high-throughput inference, and AI for Science.

"The Supercomputing Internet is becoming a computing application marketplace in the mold of JD.com or Taobao," said Cao Zhennan, deputy director of the National Engineering Research Center for High-Performance Computers, describing the platform's application value. By the end of 2025, the platform had served more than 1 million users, listed over 7,200 application products, processed a single-day peak of 1.03 million jobs, and supported a cumulative total of 196 million jobs.

The Zhengzhou core node's Sugon scaleX ten-thousand-card supercluster is designed on an open AI computing architecture: it is compatible with mainstream software ecosystems such as CUDA, supports mixed deployment of domestic accelerator cards from multiple brands, and can scale flexibly toward 100,000- and even 1,000,000-card deployments.

"Zhengzhou sits in the Central Plains, which gives it a clear locational advantage as a core node: the distances to all the major central cities are relatively short. The region also concentrates talent and is a key strategic pivot for the rise of central China. In recent years in particular, Zhengzhou has accumulated a large talent pool in the digital economy and smart-city construction, and with the core node now in trial operation, it ...
National Supercomputing Internet Core Node Enters Trial Operation, Powering a Key Leap in China's AI Computing Applications
Sou Hu Cai Jing· 2026-02-05 15:44
Core Insights
- The National Supercomputing Internet Application Technology Conference marked the launch of the core node's trial operation, supported by various government agencies and experts in the field [1][3]
- The newly launched computing resource pool, powered by the Sugon scaleX supercluster system, offers more than 30,000 cards of domestic AI computing power, making it the largest single domestic AI computing resource pool in the country [1][3]

Group 1: National Supercomputing Internet Node
- The trial operation of the national supercomputing internet core node addresses the critical bottleneck of insufficient computing resources, which has hindered industrial upgrades [3]
- The Sugon scaleX supercluster is based on an open architecture for AI computing, compatible with mainstream software ecosystems, and supports mixed deployment of various domestic acceleration cards [3]
- The national supercomputing internet aims to provide integrated computing resource scheduling and access to thousands of applications, enhancing the usability of Chinese AI computing for global users [3]

Group 2: Regional Development and Innovation
- The launch of the supercomputing internet core node signifies the emergence of a computing application hub in Central China, facilitating the integration of computing resources and application demands both nationally and globally [4]
- The core node is expected to attract talent, data, and application scenarios, contributing to high-quality regional development [4]

Group 3: Infrastructure and Application Development
- The national supercomputing internet platform is transitioning to a phase of "building and using in parallel," promoting efficient and accessible computing services for various cutting-edge application scenarios [5]
- By the end of 2025 the platform had served more than 1 million users, listed more than 7,300 application products, and handled a peak of 1.03 million jobs in a single day [5]
- The supercomputing internet is becoming a core engine for activating industrial innovation, providing robust computing support for the "AI+" initiatives across various sectors [7]
More Good News for Domestic Computing Power! The Supercomputing Internet Makes Its Move to Break the Compute Blockade
Xin Lang Cai Jing· 2026-02-05 12:18
(Source: 科技旋涡) The National Supercomputing Internet enters a new stage of scale. The National Supercomputing Internet Application Technology Conference, together with the launch ceremony for the core node's trial operation, was recently held in Zhengzhou.

The node's computing resources are reportedly supplied by the Sugon scaleX ten-thousand-card supercluster system, which can deliver more than 30,000 cards of domestic AI computing power externally, the largest single domestic AI computing resource pool connected to the National Supercomputing Internet platform since its launch.

Beyond the breakthrough in raw compute scale, the Supercomputing Internet operating model the node brings benefits AI users even more directly. Cao Zhennan, deputy director of the National Engineering Research Center for High-Performance Computers, explained that the platform aims to deliver efficient, affordable computing services for all kinds of cutting-edge application scenarios, letting computing power flow as freely, and be used as easily, as water and electricity.

Industry media have pointed out that in recent years the domestic AI computing industry has faced two extremes:
· On one side, explosive growth in demand from AI models ...

Against this backdrop, the node's Sugon scaleX ten-thousand-card supercluster is designed entirely on an open AI computing architecture, fully compatible with mainstream software ecosystems such as CUDA, and supports mixed deployment of domestic accelerator cards from multiple brands. This gives it not only the ability to scale flexibly toward 100,000- and 1,000,000-card deployments, but also broad interconnection with upstream and downstream systems for unified, integrated scheduling of computing power.

In addition, through integrated "compute + application" services, the National Supercomputing Internet platform is moving toward a JD.com/Taobao-style computing application marketplace. Data show that, as of the end of 2025: ... it has significantly lowered compute costs and the barrier to use. This model ...
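The "mixed multi-vendor deployment with unified scheduling" idea described above can be pictured with a toy sketch. Everything here is hypothetical and purely illustrative (class names, vendor names, and the largest-pool placement policy are my assumptions); the article does not describe the scaleX scheduler's actual design:

```python
# Toy illustration of unified scheduling over a mixed pool of accelerator
# cards from several vendors. All names and the policy are hypothetical.

class CardPool:
    def __init__(self, vendor, free_cards):
        self.vendor = vendor
        self.free_cards = free_cards

def schedule(job_cards, pools):
    """Place a job on the vendor pool with the most free cards that can
    hold it whole (no splitting a job across vendors)."""
    candidates = [p for p in pools if p.free_cards >= job_cards]
    if not candidates:
        return None  # no pool can fit the job; it waits in the queue
    best = max(candidates, key=lambda p: p.free_cards)
    best.free_cards -= job_cards
    return best.vendor

pools = [CardPool("vendorA", 1024), CardPool("vendorB", 512)]
print(schedule(256, pools))  # vendorA: the largest free pool
print(schedule(400, pools))  # vendorA again: 768 free still beats 512
```

A real scheduler would also weigh interconnect topology and per-vendor software stacks, but the sketch shows why a unified pool raises utilization: a job lands wherever capacity exists, regardless of card brand.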
Big News Breaks from Moore Threads!
Zhong Guo Ji Jin Bao· 2025-12-20 13:32
Core Insights
- Moore Threads unveiled its next-generation GPU architecture "Huagang" at the first MUSA Developer Conference, showcasing a full-stack technology system centered around its self-developed MUSA unified architecture [2][3]

Group 1: New GPU Architecture
- The "Huagang" architecture achieves significant breakthroughs in computing density, energy efficiency, precision support, interconnect capabilities, and graphics technology [3]
- Key features include a 50% increase in computing density, substantial energy-efficiency optimization, and support for full-precision calculation from FP4 to FP64, along with new MTFP6/MTFP4 and mixed low-precision support [3]
- It integrates a new asynchronous programming model and self-developed MTLink high-speed interconnect technology, supporting the expansion of intelligent computing clusters to over 100,000 cards [3]

Group 2: Future Chip Releases
- Moore Threads announced two upcoming chips based on the "Huagang" architecture: "Huashan" focuses on integrated AI training and inference for large-scale intelligent computing, serving as a robust foundation for the next-generation "AI factory" [4]
- The "Lushan" chip specializes in high-performance graphics rendering, boasting a 64-fold increase in AI computing performance, a 16-fold increase in geometry processing performance, and a 50-fold increase in ray-tracing performance [4]

Group 3: Launch of Intelligent Computing Cluster
- The company officially launched the "Kua'e" intelligent computing cluster, which offers full-precision and general computing capabilities, achieving efficient and stable AI training and inference at a scale of 10,000 cards [5]
- Core breakthroughs include a floating-point computing capability of 10 ExaFLOPS, with training utilization rates of 60% on dense models and 40% on MoE models, and a linear training scaling efficiency of 95% [5]

Group 4: Competitive Landscape
- Moore Threads did not showcase the products live at the event; separately, Sugon unveiled the "Shuguang scaleX" ultra-cluster system, marking the first public appearance of a domestic ten-thousand-card computing cluster [6]
- The industry is witnessing significant innovations in super-node architecture, high-speed interconnect networks, and storage-performance optimization, with some technologies surpassing NVIDIA's 2027 roadmap milestones [6]
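The cluster figures quoted above (10 ExaFLOPS peak, 60% training utilization on dense models, 40% on MoE models, 95% linear scaling efficiency) let us back out effective sustained throughput with simple arithmetic. The sketch below uses only the published numbers; the 2x-scale projection is my own illustrative extrapolation, not a vendor claim:

```python
# Back-of-the-envelope sustained throughput from the quoted cluster figures.
peak_eflops = 10.0   # quoted peak floating-point capability, ExaFLOPS
dense_util = 0.60    # quoted training utilization, dense models
moe_util = 0.40      # quoted training utilization, MoE models
linear_eff = 0.95    # quoted linear scaling efficiency

dense_effective = peak_eflops * dense_util  # sustained EFLOPS, dense training
moe_effective = peak_eflops * moe_util      # sustained EFLOPS, MoE training

# Illustrative: doubling the card count at 95% linear efficiency would
# yield roughly 1.9x throughput, not a full 2x.
scaled = dense_effective * 2 * linear_eff

print(f"dense {dense_effective:.1f} EFLOPS, MoE {moe_effective:.1f} EFLOPS, "
      f"~{scaled:.1f} EFLOPS at 2x scale")
```

The gap between the 60% dense and 40% MoE figures reflects how sparse expert routing complicates keeping every card busy, which is why MoE utilization is the harder metric to raise.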
Big News Breaks from Moore Threads!
Zhong Guo Ji Jin Bao · 2025-12-20 08:54
Core Viewpoint
- Moore Threads has unveiled its new GPU architecture "Huagang" at the first MUSA Developer Conference, showcasing a comprehensive stack of technological achievements centered around its self-developed MUSA unified architecture [2][4].

Group 1: New GPU Architecture "Huagang"
- The "Huagang" architecture features significant improvements in computing performance, with a 50% increase in computing density and optimized energy efficiency, supporting full-precision calculation from FP4 to FP64 [4].
- It integrates a new asynchronous programming model and supports large-scale interconnection, enabling the expansion of computing clusters to over 100,000 cards through the self-developed MTLink high-speed interconnect technology [4].
- The architecture also includes an AI generative rendering framework and enhanced hardware ray-tracing acceleration, fully supporting DirectX 12 Ultimate and enabling close synergy between graphics rendering and intelligent computing [4].

Group 2: Future Chip Releases
- Based on the "Huagang" architecture, Moore Threads announced two upcoming chips: "Huashan," which focuses on integrated AI training and inference for large-scale intelligent computing, and "Lushan," which specializes in high-performance graphics rendering [5].
- The "Lushan" chip is expected to improve AI computing performance 64-fold, geometry processing performance 16-fold, and ray-tracing performance 50-fold, while significantly improving texture fill, atomic memory access, and video memory capacity [5].

Group 3: Launch of the Kua'e Computing Cluster
- Moore Threads officially launched the Kua'e computing cluster, which boasts full-precision and general computing capabilities, achieving efficient and stable AI training and inference at a scale of ten thousand cards [7].
- The cluster's core breakthroughs include a floating-point computing capability of 10 ExaFLOPS, with training utilization rates of 60% for dense models and 40% for MoE models, and a linear scaling efficiency of 95% [7].
Big News Breaks from Moore Threads!
Zhong Guo Ji Jin Bao· 2025-12-20 08:50
Core Insights
- Moore Threads unveiled its next-generation GPU architecture "Huagang" at the MUSA Developer Conference, showcasing a full-stack technology system centered around its self-developed MUSA unified architecture [1][2].

Group 1: New GPU Architecture
- The "Huagang" architecture features significant improvements in computing performance, with a 50% increase in computing density and enhanced energy efficiency, supporting full precision from FP4 to FP64 [2].
- It integrates a new asynchronous programming model and MTLink high-speed interconnect technology, enabling scalability to intelligent computing clusters of over 100,000 cards [2].
- The architecture includes an AI generative rendering framework and supports DirectX 12 Ultimate, enabling close synergy between graphics rendering and intelligent computing [2].

Group 2: Upcoming Chip Technologies
- Moore Threads announced two upcoming chips based on the "Huagang" architecture: "Huashan," which focuses on AI training and inference for large-scale intelligent computing, and "Lushan," which specializes in high-performance graphics rendering [3].
- The "Lushan" chip is expected to improve AI computing performance 64-fold, geometry processing 16-fold, and ray tracing 50-fold, along with improvements in texture fill and memory capacity [3].

Group 3: Intelligent Computing Cluster
- The company launched the "Kua'e" intelligent computing cluster, capable of full-precision and general-purpose computing, achieving a floating-point operation capability of 10 ExaFLOPS [4].
- Training-efficiency metrics include a 60% utilization rate for dense large models and a 40% rate for MoE large models, with effective training time exceeding 90% and linear scaling efficiency reaching 95% [4].

Group 4: Competitive Landscape
- Moore Threads did not showcase the products live at the event; separately, Sugon presented its "scaleX" super-cluster system, marking the first public appearance of a domestic ten-thousand-card computing cluster [5].
- The competitive landscape indicates that Moore Threads is proactively positioning itself for future computing scenarios, including the launch of the MT Lambda intelligent simulation training platform [5].
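The "full precision from FP4 to FP64" range mentioned above spans an enormous difference in representable magnitude. The sketch below computes the largest finite value of a few IEEE-754-style formats; the FP4 layout (1 sign, 2 exponent, 1 mantissa bit, no infinities reserved, as in the OCP microscaling convention) is an assumption on my part, since the articles do not specify Moore Threads' encoding:

```python
def max_finite(exp_bits, man_bits, reserve_inf=True):
    """Largest finite value of an IEEE-754-style binary format.

    reserve_inf=True sets aside the top exponent code for Inf/NaN,
    as FP16/FP32/FP64 do; tiny formats like FP4 typically do not.
    """
    bias = 2 ** (exp_bits - 1) - 1
    top_exp = (2 ** exp_bits - 1) - bias - (1 if reserve_inf else 0)
    mantissa = 2 - 2 ** (-man_bits)  # 1.111...b, the largest significand
    return mantissa * 2 ** top_exp

# (exponent bits, mantissa bits, Inf/NaN reserved); FP4 E2M1 is assumed.
formats = {
    "FP4 (E2M1)": (2, 1, False),   # max = 6.0
    "FP16":       (5, 10, True),   # max = 65504
    "FP32":       (8, 23, True),
    "FP64":       (11, 52, True),
}
for name, (e, m, r) in formats.items():
    print(f"{name}: max ~ {max_finite(e, m, r):.3g}")
```

The spread from 6.0 (FP4) to roughly 1.8e308 (FP64) is why low-precision training relies on per-block scaling factors rather than raw values, and why mixed-precision support across the whole range matters for one architecture.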
In the Era of Compute "Involution," Why Has the Open-Architecture Ten-Thousand-Card Supercluster Become a Necessity?
Xi Niu Cai Jing· 2025-12-20 04:47
Core Insights
- The development of AI large models requires significant resources, including a large number of technical experts and substantial financial investment, with a critical need for powerful computing capabilities [1]
- The demand for computing power is expected to grow exponentially across various industries, with IDC predicting that China's intelligent computing power demand will reach 2,781 EFLOPS by 2028, reflecting an annual growth rate of 46.2% [1]
- Traditional computing clusters face bottlenecks when scaling beyond thousands of cards, necessitating innovative solutions like the "ten-thousand-card super cluster" [2]

Group 1: ScaleX Ten-Thousand-Card Super Cluster
- The ScaleX ten-thousand-card super cluster system was unveiled by Sugon at the HAIC2025 conference, designed to meet the extreme demands of AI infrastructure [3]
- The system connects 16 super nodes over a proprietary high-speed network and can support 10,240 AI accelerator cards, marking a significant advance in domestic large-scale computing cluster technology [5]
- The ScaleX system achieves total computing power exceeding 5 EFLOPS, with a power usage effectiveness (PUE) value as low as 1.04, and raises computing density 20-fold [5][9]

Group 2: Technical Advantages
- The ScaleX system utilizes a self-developed RDMA high-speed network, achieving 400 Gb/s of bandwidth and under 1 microsecond of communication latency, significantly improving communication performance [9]
- The system incorporates deep optimization across storage, computing, and transmission, improving resource utilization by 55% during large-model training [9]
- It features digital-twin-based intelligent scheduling and management, ensuring 99.99% availability and supporting the management of tens of thousands of nodes [9]

Group 3: Open Architecture and Ecosystem Development
- The ScaleX super cluster supports accelerator cards from multiple brands and mainstream computing ecosystems, promoting an open architecture for AI computing [10]
- This initiative aims to lower the barriers for AI companies to build intelligent computing clusters and foster a collaborative industrial ecosystem [10][12]
- The open model allows users greater choice and compatibility with mainstream AI development frameworks, facilitating broader participation in the ecosystem [12][13]
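The ScaleX figures quoted above can be cross-checked with simple arithmetic: per-card compute, facility power overhead, and the bandwidth-delay product of the interconnect all follow from the published numbers. Note the precision basis of the 5 EFLOPS figure is not stated, so the per-card value is only indicative:

```python
# Cross-checking the quoted ScaleX figures with back-of-the-envelope math.
total_eflops = 5.0   # quoted aggregate compute, EFLOPS
cards = 10_240       # quoted accelerator card count
pue = 1.04           # quoted power usage effectiveness

per_card_tflops = total_eflops * 1e6 / cards  # EFLOPS -> TFLOPS per card
overhead_pct = (pue - 1.0) * 100              # non-IT share of facility power

# Bandwidth-delay product of the RDMA fabric (400 Gb/s, ~1 us latency):
# the amount of data "in flight" on a link at any instant.
bw_bits_per_s = 400e9
latency_s = 1e-6
bdp_bytes = bw_bits_per_s * latency_s / 8

print(f"~{per_card_tflops:.0f} TFLOPS per card, "
      f"{overhead_pct:.0f}% facility overhead, "
      f"BDP ~{bdp_bytes / 1024:.0f} KiB per link")
```

A PUE of 1.04 means only about 4% of facility power goes to cooling and distribution, and a ~49 KiB bandwidth-delay product means sub-microsecond latency keeps per-link buffering small even at 400 Gb/s, both consistent with the article's efficiency claims.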