MT Lambda Embodied Intelligence Simulation Training Platform
MDC 2025: Full-Function GPU Roadmap Is Clear; MUSA Ecosystem Enters the Large-Scale Validation Phase
Haitong International · 2025-12-23 05:14
Investment Rating
- The report does not explicitly state an investment rating for the industry or the specific companies involved.

Core Insights
- MUSA 5.0 has established a comprehensive full-stack system spanning instruction sets, programming models, compilers, and communication libraries, achieving engineering performance close to international mainstream standards across key metrics [2][10]
- The Huagang architecture, introduced at MDC 2025, represents a significant upgrade in compute density, energy efficiency, precision coverage, and interconnect capabilities, supporting full-precision computing from FP4 to FP64 and introducing mixed low precision [2][10]
- Moore Threads is one of the few domestic GPU vendors committed to a "full-function GPU" strategy rather than focusing solely on AI accelerators, indicating a long-term vision for broader ecosystem development [2][10]

Summary by Sections

Event Overview
- The inaugural MUSA Developer Conference (MDC 2025) was held on December 20-21, 2025, in Beijing, focusing on sovereign computing and the developer ecosystem, unveiling the next-generation full-function GPU architecture Huagang and the Kua'e ten-thousand-card AI compute cluster [1][9]

Technical Developments
- The Huagang architecture emphasizes asynchronous programming and ultra-large-scale interconnect (MTLink), laying the groundwork for scaling to ten-thousand-card and hundred-thousand-card clusters [2][10]
- The Kua'e ten-thousand-card AI compute cluster achieved approximately 60% MFU on dense models and 40% on MoE models, with linear scaling efficiency of about 95% and effective training time exceeding 90% [3][11]

Ecosystem and Strategy
- The report outlines a clear roadmap for progressively open-sourcing core components, including compute libraries and communication libraries, to strengthen the ecosystem [2][14]
- The MT Lambda platform was launched, integrating physics engines, graphics rendering engines, and AI compute engines into a full-stack framework for development, simulation, and training [3][12]

Future Directions
- The company has articulated a clear product segmentation path focused on unified AI training and inference, positioning itself as a foundation for next-generation AI factories [2][14]
- The Huashan and Lushan architectures target AI training and high-performance graphics rendering, respectively, with significant improvements across performance metrics [3][14]
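The MFU and linear-scaling figures cited above are simple ratios: MFU is achieved model FLOPs over hardware peak, and linear scaling efficiency is cluster throughput relative to N times single-card throughput. A minimal sketch of both calculations; all numeric inputs below are hypothetical placeholders chosen only to reproduce the reported percentages, not figures from the report:

```python
def mfu(achieved_flops_per_s: float, peak_flops_per_s: float) -> float:
    """Model FLOPs Utilization: achieved model FLOPs over hardware peak FLOPs."""
    return achieved_flops_per_s / peak_flops_per_s

def linear_scaling_efficiency(cluster_throughput: float,
                              single_card_throughput: float,
                              num_cards: int) -> float:
    """Cluster throughput relative to ideal linear scaling of one card."""
    return cluster_throughput / (num_cards * single_card_throughput)

# Hypothetical inputs for illustration only.
print(f"MFU: {mfu(6.0e14, 1.0e15):.0%}")                                    # 60%
print(f"Scaling: {linear_scaling_efficiency(9.5e18, 1.0e15, 10_000):.0%}")  # 95%
```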
Breaking News from Moore Threads!
China Fund News (Zhong Guo Ji Jin Bao) · 2025-12-20 13:32
Core Insights
- Moore Threads unveiled its next-generation GPU architecture "Huagang" at the first MUSA Developer Conference, showcasing a full-stack technology system centered on its self-developed MUSA unified architecture [2][3]

Group 1: New GPU Architecture
- The "Huagang" architecture achieves significant breakthroughs in computing density, energy efficiency, precision support, interconnect capabilities, and graphics technology [3]
- Key features include a 50% increase in computing density, substantial energy-efficiency optimization, and support for full-precision calculation from FP4 to FP64, along with new MTFP6/MTFP4 formats and mixed low-precision support [3]
- It integrates a new asynchronous programming model and self-developed MTLink high-speed interconnect technology, supporting the expansion of intelligent computing clusters beyond 100,000 cards [3]

Group 2: Future Chip Releases
- Moore Threads announced two upcoming chips based on the "Huagang" architecture: "Huashan" focuses on integrated AI training and inference for large-scale intelligent computing, serving as a robust foundation for the next-generation "AI factory" [4]
- The "Lushan" chip specializes in high-performance graphics rendering, with a 64-fold increase in AI computing performance, a 16-fold increase in geometry processing performance, and a 50-fold increase in ray-tracing performance [4]

Group 3: Launch of Intelligent Computing Cluster
- The company officially launched the "Kua'e" intelligent computing cluster, which offers full-precision and general computing capabilities, achieving efficient and stable AI training and inference at a 10,000-card scale [5]
- Core breakthroughs include floating-point computing capability of 10 ExaFLOPS, training utilization of 60% on dense models and 40% on MoE models, and linear training-scaling efficiency of 95% [5]

Group 4: Competitive Landscape
- Moore Threads did not showcase the products at the event, while Inspur unveiled the "Shuguang scaleX" ultra-cluster system, marking the first public appearance of a domestic 10,000-card computing cluster [6]
- The industry is witnessing significant innovations in super-node architecture, high-speed interconnect networks, and storage-performance optimization, with some technologies surpassing milestones on NVIDIA's 2027 roadmap [6]
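The cluster figure above can be sanity-checked with simple arithmetic: dividing the quoted aggregate compute by the card count gives the implied per-card rate. A back-of-envelope sketch (the precision at which the 10 ExaFLOPS figure is quoted is not stated in the article, so the per-card number is only indicative):

```python
TOTAL_FLOPS = 10e18   # 10 ExaFLOPS aggregate, as quoted
NUM_CARDS = 10_000    # "10,000-card" cluster scale

# Implied average compute per card, precision unspecified in the source.
per_card = TOTAL_FLOPS / NUM_CARDS
print(f"Implied per-card compute: {per_card / 1e15:.0f} PFLOPS")
```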
Weekend Blockbuster! Moore Threads' First Public Reveal
Core Insights
- The first MUSA Developer Conference (MDC 2025) was held by Moore Threads in Beijing, where the company unveiled its new GPU architecture "Huagang" and a series of technological advancements [2]
- Moore Threads has established a complete technology stack based on its self-developed unified architecture, covering "chip-edge-end-cloud" integration, and plans to increase R&D investment [2]

Group 1: New Architecture and Chip Roadmap
- MUSA (Meta-computing Unified System Architecture) has been upgraded to version 5.0, achieving key breakthroughs in full-stack unification, performance, and ecosystem openness [3]
- The "Huagang" architecture supports full-precision calculation from FP4 to FP64, with a 50% increase in computing density and a 10-fold improvement in energy efficiency, and can support intelligent computing clusters beyond the 100,000-card scale [3]
- Two upcoming chips based on the "Huagang" architecture were announced: "Huashan," focused on AI training and ultra-large-scale intelligent computing, and "Lushan," specialized in high-performance graphics rendering [3][5]

Group 2: AI Training and Computing Clusters
- The newly launched "Kua'e" intelligent computing cluster delivers full-precision and general computing capabilities, with floating-point computing capacity of 10 ExaFLOPS and training utilization of 60% on dense models and 40% on MoE models [7]
- The MTT S5000 single card has achieved breakthroughs in inference performance, with throughput exceeding 4000 tokens/s for Prefill and 1000 tokens/s for Decode [7]
- Future architecture planning for the MTT C256 super node aims to enhance training efficiency and inference capabilities for large-scale intelligent computing centers [7]

Group 3: Graphics Computing and AI Technologies
- Moore Threads' products support major graphics and computing APIs, including DirectX 12 and Vulkan 1.3, and are compatible with mainstream domestic CPUs and operating systems [8]
- Key breakthroughs in rendering technology include hardware-level ray-tracing acceleration and self-developed AI generative rendering, enabling realistic lighting effects on domestic GPUs [8]
- The MT Lambda embodied intelligence simulation training platform integrates physics, rendering, and AI engines to provide efficient development and training environments [8]

Group 4: Ecosystem Development and Education
- The company emphasized the concept of an "ecosystem," focusing on building a self-reliant domestic computing industry ecosystem through collaboration and innovation [11]
- Through the Moore Academy, the company has established a developer growth system, gathering nearly 200,000 developers and learners and engaging over 100,000 students across more than 200 universities [11]
- The company plans to open-source key simulation-acceleration components to improve R&D efficiency in the robotics industry [9]
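The MTT S5000 Prefill and Decode throughput figures above translate directly into a rough per-request latency: prefill processes the prompt, then decode emits output tokens sequentially. A back-of-envelope sketch using the reported single-card rates; the 512-token prompt and 256-token completion are hypothetical request sizes chosen for illustration, and the model ignores batching and scheduling overhead:

```python
PREFILL_TOK_S = 4000   # reported Prefill throughput, tokens/s
DECODE_TOK_S = 1000    # reported Decode throughput, tokens/s

def request_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Rough single-request latency: prefill the full prompt, then decode output tokens one by one."""
    return prompt_tokens / PREFILL_TOK_S + output_tokens / DECODE_TOK_S

# Hypothetical 512-in / 256-out request.
print(f"Estimated latency: ~{request_latency_s(512, 256):.2f} s")
```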