Sugon(603019)
In an Era of Computing-Power Involution, Why Has the "Open-Architecture" Ten-Thousand-Card Supercluster Become a Necessity?
Xi Niu Cai Jing· 2025-12-20 04:47
Core Insights
- Developing large AI models requires significant resources, including many technical experts and substantial financial investment, and above all powerful computing capability [1]
- Demand for computing power is expected to grow exponentially across industries; IDC predicts that China's intelligent computing power demand will reach 2781 EFLOPS by 2028, an annual growth rate of 46.2% [1]
- Traditional computing clusters hit bottlenecks when scaling beyond thousands of cards, which calls for new approaches such as the "ten-thousand card super cluster" [2]
Group 1: ScaleX Ten-Thousand Card Super Cluster
- Sugon unveiled the ScaleX ten-thousand card super cluster system at the HAIC2025 conference; it is designed to meet the extreme demands of AI infrastructure [3]
- The system consists of 16 super nodes connected by a proprietary high-speed network and supports 10,240 AI accelerator cards, a significant advance in domestic large-scale computing cluster technology [5]
- The ScaleX system delivers total computing power exceeding 5 EFLOPS with a power usage effectiveness (PUE) as low as 1.04, and increases computing density by 20 times [5][9]
Group 2: Technical Advantages
- The ScaleX system uses a self-developed RDMA high-speed network that achieves 400 Gb/s bandwidth and communication latency under 1 microsecond, significantly improving communication performance [9]
- The system applies deep optimization across storage, computing, and transmission, raising resource utilization by 55% during large model training [9]
- It uses a digital twin for intelligent scheduling and management, ensuring 99.99% availability and supporting the management of tens of thousands of nodes [9]
Group 3: Open Architecture and Ecosystem Development
- The ScaleX super cluster supports multiple brands of accelerator cards and mainstream computing ecosystems, promoting an open architecture for AI computing [10]
- The initiative aims to lower the barriers for AI companies building intelligent computing clusters and to foster a collaborative industrial ecosystem [10][12]
- The open model gives users greater choice and compatibility with mainstream AI development frameworks, enabling broader participation in the ecosystem [12][13]
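The headline figures in this summary can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above (16 super nodes, 10,240 cards, more than 5 EFLOPS). The function names are illustrative. The per-card value is an implied average floor, not a published specification, since the summary does not state the numeric precision behind the EFLOPS figure.

```python
# Figures quoted in the summary above; function names are illustrative.
SUPERNODES = 16
TOTAL_CARDS = 10_240
TOTAL_EFLOPS_FLOOR = 5.0  # "exceeding 5 EFLOPS", so treat this as a lower bound

TFLOPS_PER_EFLOPS = 1_000_000  # 1 EFLOPS = 10^18 FLOPS, 1 TFLOPS = 10^12 FLOPS


def cards_per_supernode(total_cards: int, supernodes: int) -> int:
    """Return the card count per super node, assuming an even split."""
    cards, remainder = divmod(total_cards, supernodes)
    if remainder:
        raise ValueError("cards do not divide evenly across super nodes")
    return cards


def implied_tflops_per_card(total_eflops: float, total_cards: int) -> float:
    """Average TFLOPS per card implied by a cluster-level EFLOPS figure."""
    return total_eflops * TFLOPS_PER_EFLOPS / total_cards


print(cards_per_supernode(TOTAL_CARDS, SUPERNODES))  # 640
print(round(implied_tflops_per_card(TOTAL_EFLOPS_FLOOR, TOTAL_CARDS), 1))  # 488.3
```

The even split of 640 cards per super node follows directly from the two quoted totals. The roughly 488 TFLOPS per card is only what the cluster-level lower bound implies on average.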
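2025 Assessment: Analysis of Policy, the Full Industry Chain, Current Status, Company Positioning, and Future Trends in China's Storage Server Industry: Computing Infrastructure Is Expanding Faster, and the Storage Server Market Has Broad Prospects [Figures]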
Chan Ye Xin Xi Wang· 2025-12-20 03:31
Core Insights
- The storage server industry is driven by several favorable factors, including supportive policies, technological breakthroughs in storage chips, and rising demand across sectors such as public services, internet, and finance [1][6][9]
Industry Overview
- Storage servers are specialized servers focused on data storage management; they integrate hardware and software to provide high reliability and scalability for massive structured and unstructured data [2][6]
- Compared with general-purpose servers, storage servers prioritize storage functionality, with more hard drive slots and support for large-capacity storage media [3][4]
Market Size and Growth
- China's overall server market is projected to reach 249.21 billion yuan in 2024, with the storage server market expected to reach 43.87 billion yuan and to keep growing steadily into 2025 [1][9]
- The storage server market is anticipated to grow to 52.19 billion yuan by 2025, driven by AI demand and digital transformation [10]
Policy Support
- A series of significant policies has been introduced to support the storage server industry, focusing on technology standards, infrastructure development, and green transformation [6][9]
Industry Chain
- China's storage server industry chain is tightly integrated, running from core components to end applications, with the upstream segment supplying key components and software [7][9]
- Domestic companies are gradually breaking into high-end fields, supported by competitive pricing and customization services [7]
Competitive Landscape
- Leading companies dominate the market, ODM manufacturers provide customized solutions, and niche players focus on specific segments [10][12]
- Major players such as Huawei and Inspur hold over 60% of the market, while companies such as Yihualu and Jiangbolong are carving out niches with differentiated technologies [10][12]
Development Trends
- The storage server industry is expected to evolve toward high performance and green technology, with distributed storage and the NVMe-oF protocol becoming mainstream [12][13]
- Competition will shift from single products to integrated solutions, with a focus on customized products for specific scenarios [14]
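The market-size figures quoted in this summary imply a growth rate and a market share that the text does not spell out. A minimal sketch of that arithmetic follows, using only the three yuan figures above. The function names are illustrative, and the results inherit whatever uncertainty the underlying projections carry.

```python
# Figures quoted in the summary above, in billions of yuan.
SERVER_MARKET_2024 = 249.21
STORAGE_SERVER_2024 = 43.87
STORAGE_SERVER_2025 = 52.19


def growth_rate(previous: float, current: float) -> float:
    """Fractional change from previous to current."""
    return (current - previous) / previous


def share(part: float, whole: float) -> float:
    """Fraction of the whole represented by part."""
    return part / whole


# Implied year-on-year growth of the storage server market, 2024 to 2025.
print(f"{growth_rate(STORAGE_SERVER_2024, STORAGE_SERVER_2025):.1%}")  # 19.0%

# Implied storage share of the overall server market in 2024.
print(f"{share(STORAGE_SERVER_2024, SERVER_MARKET_2024):.1%}")  # 17.6%
```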
The First Shot Against Involution in the Computing Industry Has Been Fired!
国芯网· 2025-12-19 14:12
Core Viewpoint
- The article discusses Zhongke Shuguang's strategic decision to exit terminal markets, including servers and personal computers, by 2026 in order to focus on core technology and strengthen the competitiveness of the overall ecosystem, thereby addressing excessive internal competition ("involution") in the Chinese computing industry [2][3].
Group 1: Strategic Decisions
- Zhongke Shuguang announced its exit from the server, personal computer, and industrial control markets, emphasizing a shift toward core technology development and product innovation [2][3].
- The decision is seen as a bold move that may cost the company significant revenue, but it is intended to relieve the burdens of a highly competitive and inefficient market [2][3].
- The chairman, Li Jun, advocates collaboration among members of the Guanghe Organization to raise the industry's value rather than engaging in destructive competition [2][3].
Group 2: Technological Advancements
- Zhongke Shuguang showcased its scaleX ten-thousand-card supercluster AI system, which offers a 20-fold increase in computing density and a PUE of 1.04, and can deploy 10,240 AI acceleration cards with total computing power exceeding 5 EFLOPS [4].
- The system uses proprietary technologies, including a 400G InfiniBand network and advanced data transmission designs, which improve performance and resource utilization [4][5].
- The company aims to transform cluster management through digital twin technology, achieving 99.99% availability for large-scale clusters and moving toward automated system maintenance [5].
Group 3: Industry Collaboration and Ecosystem
- The Guanghe Organization has grown to over 6,000 partners and established numerous ecosystem adaptation centers, becoming a pivotal force in promoting the domestic computing industry [6].
- The organization emphasizes rational division of labor and collaboration to curb low-quality competition, which has become a common challenge among its members [6].
- Major companies, including SenseTime and Huada Jiutian, have formed strategic partnerships and released more than 50 AI innovation achievements, indicating a strong collaborative spirit within the industry [7].
Group 4: Open Development and Future Vision
- "Openness" has shifted from an optional strategy to an industry consensus, with major players such as Alibaba and ByteDance adopting open development routes [8].
- The article argues that open technology routes are essential to industry security and national strategic safety, particularly for China's intelligent computing infrastructure [8].
- Reducing internal competition goes hand in hand with orderly openness, as articulated by Li Jun, who believes that collective effort will strengthen China's AI industry [8].
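To put the 99.99% availability figure quoted above in concrete terms, it can be converted into a downtime budget. The sketch below assumes availability means the fraction of time the cluster is up over a non-leap year. The article does not state the measurement window, so both the window and the function name are assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime budget implied by an availability percentage over a period."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_percent) / 100.0 * period_minutes


# 99.99% availability over a year leaves a budget of under an hour.
print(round(allowed_downtime_minutes(99.99), 2))  # 52.56
```

Under these assumptions, each additional "nine" cuts the budget by a factor of ten: 99.9% would allow about 525.6 minutes per year.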
Supernode Interconnect Technology Lands; Domestic Ten-Thousand-Card Supercluster Makes Its First Physical Debut
21 Shi Ji Jing Ji Bao Dao· 2025-12-19 13:32
Core Insights
- The article discusses the emergence of high-performance computing clusters, specifically the scaleX ultra-cluster developed by Sugon, which integrates 16 scaleX640 supernodes to deliver over 5 EFLOPS of computing power, a significant advance in domestic AI computing infrastructure [4][5].
Group 1: Ultra-Cluster Development
- The scaleX640 supernode at the heart of the ultra-cluster is the world's first single-cabinet 640-card supernode; it uses technologies such as high-density blade servers and immersion cooling, yielding a 20-fold increase in computing density and a PUE as low as 1.04 [1][4].
- The scaleX ultra-cluster represents a shift from traditional scattered server deployments to a more integrated and efficient computing unit, showing that domestic computing infrastructure has moved from conceptual designs to tangible products [1][5].
Group 2: Demand for Computing Power
- As mainstream AI models grow from hundreds of billions to trillions of parameters, demand for computing power has surged, making EFLOPS-level, ten-thousand-card high-performance clusters a standard configuration for large models [2][3].
- The supernode architecture is becoming a preferred choice for new ten-thousand-card clusters because of its density and performance advantages [3].
Group 3: Networking and Scalability
- The scaleX ultra-cluster uses the scaleFabric high-speed network, built on the first domestic 400G-class InfiniBand RDMA network cards, achieving 400 Gb/s bandwidth and communication latency under 1 microsecond and scaling to over 100,000 cards [7].
- The architecture supports both scale-up (vertical expansion) and scale-out (horizontal expansion), addressing traditional communication bottlenecks and enabling the construction of large-scale intelligent computing clusters [6].
Group 4: Challenges and Considerations
- Deploying supernodes introduces systemic challenges, including heat dissipation from densely packed chips, stability issues from mixed optical and copper interconnects, and reliability concerns from the long-term operation of many components [8].
- As intelligent computing clusters grow, the key challenges are scalability, reliability, and energy efficiency, which call for breakthroughs in power supply technology and advanced management software for sustainable operation [8].
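The bandwidth and latency figures quoted above can be combined in the standard latency-bandwidth (alpha-beta) cost model to see which term dominates for a given message size. This is a simplified textbook model, not a description of how scaleFabric actually behaves. It assumes one point-to-point link, treats the "under 1 microsecond" latency as exactly 1 microsecond (an upper bound), and ignores protocol overhead and congestion. All names are illustrative.

```python
# Figures quoted in the summary above; the model itself is an assumption.
LINK_BANDWIDTH_BITS_PER_S = 400e9  # 400 Gb/s
LATENCY_S = 1e-6                   # "under 1 microsecond", used as an upper bound

BYTES_PER_S = LINK_BANDWIDTH_BITS_PER_S / 8  # 50 GB/s


def transfer_time_s(message_bytes: float,
                    latency_s: float = LATENCY_S,
                    bytes_per_s: float = BYTES_PER_S) -> float:
    """Alpha-beta estimate: fixed latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bytes_per_s


def crossover_bytes(latency_s: float = LATENCY_S,
                    bytes_per_s: float = BYTES_PER_S) -> float:
    """Message size at which the latency and bandwidth terms are equal."""
    return latency_s * bytes_per_s


for size in (4 * 1024, 1024 ** 2, 1024 ** 3):
    print(f"{size:>12} bytes: {transfer_time_s(size) * 1e6:.2f} us")

print(f"crossover near {crossover_bytes():.0f} bytes")
```

Under these assumptions the crossover sits near 50 kB. Smaller messages are dominated by latency and larger ones by bandwidth, which is why both figures matter for collective communication in large model training.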
Supernode Interconnect Technology Lands, Domestic Ten-Thousand-Card Supercluster Makes Its First Physical Debut
21 Shi Ji Jing Ji Bao Dao· 2025-12-19 13:24
Core Insights
- The launch of the scaleX ten-thousand-card supercluster marks the first physical appearance of a domestic ten-thousand-card-class AI cluster system in China, showcasing significant advances in AI computing capability [1][3]
- The scaleX ten-thousand-card supercluster integrates 16 scaleX640 super nodes and achieves total computing power exceeding 5 EFLOPS, reflecting the growing demand for high-performance computing in AI applications [3][5]
- The industry is moving from traditional server architectures to super node designs, which offer higher density and performance and are becoming the preferred architecture for new ten-thousand-card-class clusters [2][5]
Company Developments
- Zhongke Shuguang (Sugon)'s scaleX640 super node is recognized as the world's first single-cabinet 640-card super node, underscoring the company's leadership in high-density computing [2][3]
- The scaleX ten-thousand-card supercluster uses the scaleFabric high-speed network, which achieves 400 Gb/s bandwidth and communication latency under 1 microsecond, significantly improving inter-node communication efficiency [7][8]
- The company is addressing challenges in system cooling, stability, and reliability as it scales up its super node architecture to meet the growing demands of AI workloads [6][8]
Industry Trends
- Demand for computing power is rising rapidly as AI models grow from hundreds of billions to trillions of parameters, requiring computing clusters at the ten-thousand-card class and beyond [1][5]
- Major international players such as Meta, Microsoft, and OpenAI are also investing in 100,000-card clusters, indicating a global trend toward larger-scale AI computing infrastructure [6]
- The industry faces critical challenges in scalability, reliability, and energy efficiency as computing centers grow from megawatt to gigawatt levels, which calls for innovative power supply technologies and advanced management software [8]
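The move from megawatt to gigawatt facilities mentioned above is where power usage effectiveness (PUE) starts to matter in absolute terms. PUE is the ratio of total facility power to IT equipment power. The sketch below applies the PUE of 1.04 reported for the scaleX system elsewhere in this coverage, alongside a baseline of 1.5 that is purely a hypothetical comparison point and does not come from the articles. The function names are illustrative.

```python
REPORTED_PUE = 1.04               # figure reported for the scaleX system
HYPOTHETICAL_BASELINE_PUE = 1.5   # assumption for comparison only, not from the source


def facility_power_mw(it_power_mw: float, pue: float) -> float:
    """Total facility power given IT load and PUE (PUE = facility / IT)."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_power_mw * pue


def overhead_power_mw(it_power_mw: float, pue: float) -> float:
    """Power spent on cooling, power delivery, and other non-IT overhead."""
    return facility_power_mw(it_power_mw, pue) - it_power_mw


for it_load_mw in (1.0, 1000.0):  # a 1 MW and a 1 GW IT load
    reported = overhead_power_mw(it_load_mw, REPORTED_PUE)
    baseline = overhead_power_mw(it_load_mw, HYPOTHETICAL_BASELINE_PUE)
    print(f"IT load {it_load_mw:g} MW: overhead {reported:.2f} MW at PUE 1.04, "
          f"{baseline:.2f} MW at hypothetical PUE 1.5")
```

At a PUE of 1.04 the overhead is 4% of the IT load, so a 1 GW IT load would carry about 40 MW of overhead. The same load at the hypothetical 1.5 baseline would carry 500 MW.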
Zhongke Shuguang Signs Cooperation Agreement with SenseTime and Daxiao Robotics
Bei Jing Shang Bao· 2025-12-19 12:21
Core Viewpoint
- Zhongke Shuguang has announced a strategic partnership with SenseTime and Daxiao Robotics to advance domestic artificial intelligence infrastructure and embodied intelligence technologies [1]
Group 1: Strategic Collaboration
- The three companies will draw on their respective technological and industrial strengths to build a "computing power infrastructure + world model + embodied intelligence" ecosystem [1]
- The collaboration aims to accelerate the extension of AI capabilities into the physical world [1]
The Open Moment for Domestic Computing Power: Supernodes Enter the Ten-Thousand-Card Era
傅里叶的猫· 2025-12-19 10:11
Core Viewpoint
- The launch of the scaleX 10,000-card AI supercluster by Zhongke Shuguang marks a significant milestone in the history of China's AI computing power, as supernodes enter the 10,000-card era [1][3].
Group 1: Development of AI Computing Power
- The scaleX 10,000-card supercluster offers a new answer to the question of how China's AI computing infrastructure should develop [3].
- Three years ago, China's AI computing power system relied heavily on NVIDIA for GPU acceleration, NVLink technology, and CUDA software, creating a dependency on a single supplier [4].
- The turning point came with export restrictions on NVIDIA chips, which prompted domestic manufacturers to explore alternative computing power systems [4].
Group 2: Competitive Landscape
- Major players such as Huawei, Inspur, and Alibaba are entering the AI supernode market, each taking a different technological route [5].
- Huawei has taken a "fully self-developed" approach, while Inspur and Alibaba focus on "open architecture" to build a domestic AI computing foundation [6].
- The scaleX 10,000-card supercluster consists of 16 scaleX640 supernodes, totaling 10,240 AI accelerator cards and exceeding 5 EFLOPS in computing power [7].
Group 3: Technological Innovations
- The scaleX640 supernode features a self-developed scaleFabric high-speed network with 400 Gb/s bandwidth and end-to-end latency under 1 microsecond [7].
- The system supports multiple brands of accelerator cards, indicating a shift toward a diversified computing power ecosystem in China [7].
Group 4: Industry Trends
- The "de-NVIDIA" trend is driven by China's need for computing power security and independent innovation, especially after U.S. export restrictions on high-performance GPUs [8].
- The domestic AI industry is not merely replicating NVIDIA but aims to establish a complete, replaceable computing power ecosystem [8].
- Two development paths are emerging as significant industry trends: closed-stack integration, represented by Huawei, and open collaboration, represented by Shuguang, Inspur, and Alibaba [8].
Group 5: Application and Impact
- Various products have already been deployed, with Huawei's CM384 and Inspur's SD200 in use in operational data centers [9].
- The open architecture approach has enabled the large-scale application of domestic chips, reducing reliance on NVIDIA's ecosystem [9].
- The year 2025 is seen as a turning point for China's AI computing power system, with both performance and collaborative ecosystems in focus [11].
AI's Next Leg: From "Single-Point Breakthrough" to "Ecosystem Co-Advancement"
Huan Qiu Shi Bao· 2025-12-19 06:13
Core Insights
- The core theme of HAIC2025 is the importance of "open computing" in driving collaboration across the AI industry and overcoming existing challenges in the Chinese AI sector [3].
Group 1: AI Development and Challenges
- AI is becoming a core engine of new productivity, but traditional scaling methods are insufficient to sustain rapid AI iteration [2].
- The Chinese AI industry faces two main challenges: breaking through the computing power bottleneck and making computing power broadly accessible [3].
- Current closed systems are expensive, and supply and demand for computing resources are mismatched, which hinders the development of small and medium enterprises [3].
Group 2: Technological Innovations
- The scaleX supercluster, designed for trillion-parameter models and complex tasks, was showcased at HAIC2025 with innovations in architecture and performance [4].
- The scaleX supercluster achieves a 20-fold increase in computing density per cabinet and supports multiple brands of AI acceleration cards, significantly lowering total cost of ownership [4].
Group 3: Strategic Collaborations
- Strategic partnerships were formed during HAIC2025, including collaborations between SenseTime, Daxiao Robotics, and Inspur to advance world models on domestic computing platforms [5].
- The new Kairos 3.0 world model supports complex scene modeling and interaction generation, and is deeply compatible with the scaleX supercluster [5].
Group 4: Future Directions in AI
- Future AI development is characterized by "two supers," "one openness," and "two integrations," emphasizing the need for high-density computing and open ecosystems [5].
- The AI supercluster is a promising direction, overcoming traditional communication bottlenecks and improving computational efficiency [5].
Group 5: AI+ Applications
- HAIC2025 highlighted numerous successful AI+ applications, including the launch of the world's first multimodal language model focused on geographic sciences [6].
- Other examples include the use of AI in the rapid iteration of domestic electric vehicles and in smart road systems for traffic management in Gansu province [6].
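E Fund Robot ETF (159530), with a Higher Humanoid Weighting, Opens High and Rises Nearly 2%; Daxiao Robotics Releases Three Major Technical Results in Succession and Joins SenseTime and Zhongke Shuguang to Build an Embodied Intelligence Ecosystem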
Xin Lang Cai Jing· 2025-12-19 03:11
Group 1
- The news highlights significant growth in the robotics industry, particularly the rise of the E Fund robot ETF (159530), which has gained 1.64% on a total transaction volume of 1.66 billion yuan [1]
- As of December 18, the E Fund robot ETF has grown by 294 million yuan in scale this month, including an increase of 38 million shares in the past week, indicating strong investor interest [1]
- The latest net inflow into the E Fund robot ETF is 21.26 million yuan, with a total net inflow of 173 million yuan over the last five trading days, reflecting a positive investment trend [1]
Group 2
- Industrial robot production is expected to exceed 700,000 units in 2025, with November output of approximately 70,200 units, a year-on-year increase of 20.60% [2]
- The growth in industrial robot production is driven by large-scale equipment upgrade policies and the ongoing digital and intelligent transformation of manufacturing, which together create strong demand for equipment purchases [2]
- Favorable domestic market conditions, along with the "14th Five-Year Plan" promoting effective investment and major engineering projects, point to a continued recovery in the industry's prosperity [2]
Group 3
- The E Fund robot ETF (159530) tracks the National Robot Industry Index, which selects listed companies in the robotics sector and reflects the market performance of the robotics industry [3]
- The index has a humanoid robot weighting of 77%, above the 64% of similar indices, so it stands to benefit from future trends in the humanoid robot sector [3]
Zhongke Shuguang Exhibits Ten-Thousand-Card Supercluster, with Some Capabilities Exceeding NVIDIA's NVL576
Guan Cha Zhe Wang· 2025-12-19 03:04
Core Viewpoint
- The launch of the scaleX ten-thousand-card supercluster by Zhongke Shuguang at HAIC2025 marks a significant advance in large-scale intelligent computing systems, showcasing China's capabilities in AI infrastructure [1][3].
Group 1: Product Features
- The scaleX ten-thousand-card supercluster is designed for trillion-parameter models and complex tasks, with innovations in super-node architecture, high-speed interconnect networks, storage performance optimization, and system management [3].
- It includes the world's first single-cabinet 640-card super-node, achieving total computing power exceeding 5 EFLOPS and a PUE as low as 1.04, and increasing computing density by 20 times [3][4].
- The proprietary RDMA high-speed network, scaleFabric, offers 400 Gb/s bandwidth and communication latency under 1 microsecond, improving communication performance by 2.33 times compared with traditional IB networks while reducing overall network costs by 30% [4].
- The system tightly couples the optimization of storage, computing, and transmission, improving data transfer efficiency and raising AI accelerator resource utilization by 55% [4].
- A digital twin of the supercluster enables intelligent management and fault recovery, ensuring 99.99% availability and supporting the management of tens of thousands of nodes and users [4].
Group 2: Market Applications and Collaborations
- The scaleX ten-thousand-card supercluster supports over 400 mainstream large models and applies to fields such as large model training, financial risk control, geological energy exploration, and scientific intelligence [6].
- Strategic collaborations were established at the conference among companies including SenseTime and Zhongke Shuguang, focusing on optimizing AI computing hardware and software systems and on innovative applications in embodied intelligence [6].