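Moore Threads Reveals Its Full-Stack Technology Hand: New "Huagang" Architecture and Ten-Thousand-Card Cluster Take Aim at the High-End GPU Market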

Core Insights
- The article highlights Moore Threads' recent advances in the GPU sector, chiefly the new "Huagang" architecture and the "Kua'e" ten-thousand-card intelligent computing cluster, which supports training of trillion-parameter models [2][3].

Architecture Innovations
- The "Huagang" architecture delivers a 50% increase in computing density and up to a 10-fold improvement in efficiency, and supports the full range of numeric precisions from FP4 to FP64. It integrates the company's self-developed MTLink high-speed interconnect, which supports scaling clusters beyond 100,000 cards [3][5].
- Two chips are planned on the "Huagang" architecture: "Huashan", for integrated AI training and inference, and "Lushan", aimed at high-performance graphics rendering, with stated performance improvements of 64 times for AI computation, 16 times for geometry processing, and 50 times for ray tracing [5].

Cluster Capabilities
- Moore Threads publicly disclosed key engineering efficiency metrics for the "Kua'e" ten-thousand-card intelligent computing cluster: a model FLOPs utilization (MFU) of 60% for dense models and 40% for mixture-of-experts (MoE) models, a linear scaling efficiency of 95%, and an effective training time above 90% [6].

Ecosystem Development
- Moore Threads announced that its unified software architecture, MUSA, has been iterated to version 5.0, with plans to gradually open-source core components, including computation acceleration libraries and system management frameworks [8].
- The "Moore Academy" platform has attracted nearly 200,000 learners and works with more than 200 universities nationwide, reflecting an ecosystem strategy built on open-sourcing technology, providing developer tools, and cultivating talent early [9].

Technological Integration and Exploration
- The release indicates a trend towards the deep integration of graphics, AI, and high-performance computing, with hardware-level ray tracing acceleration and the introduction of the AI generative rendering technology MTAGR 1.0 [10].
- The company is also exploring cutting-edge fields such as embodied intelligence and AI for science, showcasing its ambition to redefine the value of GPUs as a general computing platform [10].

Industry Context
- The comprehensive technology showcase reflects the current stage of domestic high-end computing power development, transitioning from single-chip innovations to tackling large-scale system engineering and building a thriving application ecosystem [11].
- The efficiency disclosure of the ten-thousand-card cluster signifies that domestic computing infrastructure is beginning to undergo rigorous testing in large-scale, high-load scenarios, while the architecture iteration and integration of graphics and AI demonstrate the company's intent to define the next generation of computing architecture [11].
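
The cluster figures quoted under Cluster Capabilities (MFU, linear scaling efficiency, effective training time) can be read as simple ratios. The Python sketch below is only an illustration of how such ratios are commonly computed; the article does not give Moore Threads' exact definitions or measurement method, so the function names, the formulas chosen for scaling efficiency and effective training time, and all input numbers are assumptions made for this example.

```python
# Illustrative sketch only: common definitions of the three cluster metrics.
# All function names and sample inputs are hypothetical, not from the article.


def mfu(achieved_flops_per_s: float, peak_flops_per_s: float) -> float:
    """Model FLOPs utilization: model FLOP/s actually sustained during
    training divided by the hardware's theoretical peak FLOP/s."""
    if peak_flops_per_s <= 0:
        raise ValueError("peak_flops_per_s must be positive")
    return achieved_flops_per_s / peak_flops_per_s


def linear_scaling_efficiency(
    base_cards: int, base_throughput: float, cards: int, throughput: float
) -> float:
    """Measured speedup relative to a baseline run, divided by the ideal
    (perfectly linear) speedup for the same increase in card count."""
    if min(base_cards, base_throughput, cards, throughput) <= 0:
        raise ValueError("all inputs must be positive")
    measured_speedup = throughput / base_throughput
    ideal_speedup = cards / base_cards
    return measured_speedup / ideal_speedup


def effective_training_ratio(productive_hours: float, wall_clock_hours: float) -> float:
    """Share of wall-clock time spent on useful training steps, as opposed
    to failures, restarts, and checkpoint recovery (assumed definition)."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return productive_hours / wall_clock_hours


if __name__ == "__main__":
    # Hypothetical inputs chosen so the ratios match the quoted percentages.
    print(f"MFU: {mfu(6.0e14, 1.0e15):.0%}")
    print(f"Scaling efficiency: {linear_scaling_efficiency(1000, 1.0e6, 10000, 9.5e6):.0%}")
    print(f"Effective training time: {effective_training_ratio(91.0, 100.0):.0%}")
```

Read this way, a 95% linear scaling efficiency would mean that a tenfold increase in cards yields roughly 9.5 times the throughput of the baseline run, under the assumed definition.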
