Full-Function GPU
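Breaking the Compute Bottleneck for Large Models? A Domestic GPU Maker Answers with an "AI Factory"
Semiconductor Industry Observation (半导体行业观察) · 2025-07-28 01:32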
Core Viewpoint
- The rapid development of artificial intelligence (AI) has made AI chips a topic of global discussion. NVIDIA dominates the market on the strength of its GPUs, which has driven its financial results and market capitalization to record highs. AMD's CEO predicts that the market for accelerators for AI and large computing systems will exceed $500 billion within a few years [1]
Group 1: Full-Function GPU Development
- The evolution of computing power is closely tied to the development of full-function GPUs, which have grown from single-purpose graphics cards into versatile processors that support a wide range of applications, including AI [2]
- A full-function GPU has four core engines: AI computing acceleration, modern 3D graphics rendering, physical simulation and scientific computing, and ultra-high-definition video encoding and decoding [3]
Group 2: Moore Threads' Innovations
- Moore Threads, established in 2020, has built a complete computing acceleration stack. It has launched four generations of GPU architectures and intelligent SoC products covering AI computing, professional graphics acceleration, and desktop graphics acceleration [5]
- The company aims to build an "AI factory" that makes the production of advanced models more efficient, addressing the bottlenecks in large-model training for the AGI era [6]
Group 3: AI Factory Efficiency
- The efficiency of the "AI factory" is determined by five core elements: generality of accelerated computing, effective chip computing power, single-node efficiency, cluster efficiency, and cluster stability [7]
- Moore Threads emphasizes that full-function GPUs and support for the full range of numerical precisions are essential to efficient AI model training [9]
Group 4: Technical Breakthroughs
- The self-developed MUSA architecture significantly improves resource utilization and reduces the development cost of new chips, and delivers a 30% performance increase in Transformer computing [11]
- Innovations in the memory system and in communication save 50% of bandwidth and cut latency by 60%, raising the effective computing power of a single chip [12]
Group 5: Cluster Solutions
- The KUAE cluster, built on full-function GPUs, is designed as a complete system-level solution for large-scale GPU computing. It supports more than 1,000 computing nodes with ultra-low communication latency [17]
- The KUAE cluster incorporates technologies that improve training efficiency and stability, achieving over 99% effective training time [19]
Group 6: Industry Applications
- Moore Threads' full-function GPUs are driving innovation in sectors including physical simulation, AIGC, scientific computing, and intelligent manufacturing, with the stated vision of empowering developers and serving multiple industries [21][25]
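The "over 99% effective training time" figure can be made concrete with simple arithmetic. The sketch below assumes that effective training time means the share of wall-clock time not lost to interruptions and recovery; that definition, both function names, and the 30-day run with 12 interruptions are illustrative assumptions, not figures from the article.

```python
def effective_training_ratio(total_hours: float,
                             interruptions: int,
                             hours_lost_per_interruption: float) -> float:
    """Share of wall-clock time spent on useful training (assumed definition)."""
    lost = interruptions * hours_lost_per_interruption
    if total_hours <= 0 or lost > total_hours:
        raise ValueError("lost time must fit inside a positive total time")
    return (total_hours - lost) / total_hours


def max_loss_per_interruption(total_hours: float,
                              interruptions: int,
                              target_ratio: float) -> float:
    """Largest average time each interruption may cost while meeting the target."""
    if interruptions <= 0:
        raise ValueError("interruptions must be positive")
    return total_hours * (1.0 - target_ratio) / interruptions


# Hypothetical 30-day run (720 hours) with 12 interruptions of 30 minutes each.
ratio = effective_training_ratio(720.0, 12, 0.5)

# Recovery budget per interruption if the run must stay at or above 99%.
budget = max_loss_per_interruption(720.0, 12, 0.99)

print(f"effective training time: {ratio:.2%}")
print(f"recovery budget per interruption: {budget * 60:.0f} minutes")
```

Under these assumed numbers, each interruption may cost a little over half an hour before the run drops below 99%, which is why fast recovery matters more as cluster size and failure counts grow.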
A Domestic GPU Now Runs Full-Scale DeepSeek at 100 tokens/s!
QbitAI (量子位) · 2025-07-26 09:01
Core Viewpoint
- The fastest chip for running the full-scale DeepSeek model is a domestic GPU from Moore Threads, which reaches 100 tokens/s, compared with 50 tokens/s for foreign GPUs and 15 tokens/s for other domestic GPUs [1][4].
Group 1: Moore Threads' Achievements
- Moore Threads has built an "AI super factory" that goes beyond making faster chips and reworks the entire technology stack [6][10].
- The AI super factory is not a physical chip fabrication facility. It is a system-wide overhaul spanning chip architecture, cluster design, and software algorithms [9][10].
Group 2: Key Components of the AI Super Factory
- The AI super factory's production efficiency is defined by five core elements: generality of accelerated computing, effective chip performance, node efficiency, cluster efficiency, and cluster stability [13].
- A full-function GPU is the foundation of the AI super factory. It has evolved from basic graphics acceleration into a versatile computing platform that handles a range of AI tasks [14][16].
Group 3: MUSA Architecture
- The MUSA architecture acts as the "chief designer" of the super factory, allowing scalable and configurable chip designs that optimize resource allocation [25][26].
- MUSA's design enables global resource sharing, which reduces bottlenecks and improves efficiency when multiple tasks run at once [27][29].
Group 4: Full-Stack Software System
- Moore Threads has created a full-stack software system that integrates deeply with the MUSA hardware architecture, improving developer experience and operational efficiency [35][36].
- The software stack includes optimized drivers, core operator libraries, and performance analysis tools, which significantly improve task handling and resource utilization [41][42].
Group 5: KUAE Computing Cluster
- The KUAE computing cluster is an integrated hardware-software system that extends the performance advantages of individual GPUs to large-scale deployments, enabling efficient training of very large AI models [43][44].
- The cluster supports multiple parallel training strategies and provides end-to-end training optimization for high performance and stability [45][46].
Group 6: Zero-Interrupt Fault Tolerance Technology
- Moore Threads has developed a zero-interrupt fault tolerance technology that keeps the AI super factory running continuously, minimizing downtime and recovery costs [47][49].
- This technology improves the stability and reliability of the overall system, keeping effective training time high and limiting the impact of failures [51][52].
Group 7: Future of AI and Computing Needs
- Demand for computing power is expected to grow exponentially, driven by advances in generative AI and the need to execute complex tasks [54][56].
- Moore Threads aims to provide a complete solution to the challenges of AI model training, emphasizing stability, reliability, and efficiency in future computing [58][61].
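The summary lists five core elements that define the AI super factory's production efficiency but does not say how they combine. The sketch below assumes a simple multiplicative model in which each element is a factor between 0 and 1. The model, the function name `factory_efficiency`, and every numeric value are hypothetical and serve only to show how a weakness in one layer would limit the whole system.

```python
from math import prod
from typing import Mapping


def factory_efficiency(factors: Mapping[str, float]) -> float:
    """Multiply per-layer efficiency factors (assumed model, each in [0, 1])."""
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return prod(factors.values())


# Hypothetical values for the five elements named in the summary.
factors = {
    "generality_of_accelerated_computing": 1.0,
    "effective_chip_performance": 0.9,
    "node_efficiency": 0.9,
    "cluster_efficiency": 0.9,
    "cluster_stability": 0.99,
}

overall = factory_efficiency(factors)
print(f"overall efficiency under the assumed model: {overall:.3f}")
```

Under a product model, three layers at 90% each already pull the total down to about 72%, so improvements have to land at every layer rather than at the chip alone.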
A Domestic GPU "All-Rounder" Makes a Run at the Sci-Tech Innovation Board (STAR Market): Moore Threads' Technology Marathon
Core Insights
- The article highlights the rise of domestic GPU companies, particularly Moore Threads, as they enter a phase in which technology and capital converge. The company is preparing for an IPO on the Sci-Tech Innovation Board [1][10]
- Moore Threads has become a significant player in the semiconductor industry by focusing on self-developed "full-function GPUs," which are gradually reaching international standards in computing power, versatility, and ecosystem compatibility [1][3]
Industry Overview
- The global GPU market is projected to reach 3.612 trillion yuan by 2029. China's GPU market is expected to grow to 1.364 trillion yuan, raising its share of the global market from 30.8% in 2024 to 37.8% in 2029 [1]
- China's AI chip market is expected to surge from 142.537 billion yuan in 2024 to 1.337 trillion yuan by 2029, a compound annual growth rate of 53.7% [8]
Company Development
- Moore Threads has developed the MUSA architecture, the first domestic architecture to support AI computing, graphics acceleration, and physical simulation on a single chip, which marks a significant technological breakthrough [4][6]
- The company has launched several products, including the MTT S80 consumer graphics card and the MTT S5000 AI computing card, which have shown competitive performance against international counterparts [4][5]
Financial Performance
- Moore Threads' revenue grew from 45.84 million yuan in 2022 to 432 million yuan in 2024, a compound annual growth rate of 208% [9]
- The AI computing business accounts for 77.6% of the company's revenue, driven by demand for large-model training and GPU cloud services [9]
Strategic Positioning
- The company focuses heavily on R&D. It invested 1.359 billion yuan in 2024, which puts R&D expenses at 309.88% of revenue [6]
- Moore Threads has secured 450 patents, including 442 domestic patents, which gives it a strong position in the GPU intellectual property landscape [6]
Market Opportunities
- The company is well positioned to benefit from growing demand for GPUs in sectors including AI, digital twins, autonomous driving, and virtual reality [8][11]
- The ongoing IPO is expected to give Moore Threads additional capital and resources to support its technological development and market expansion [10][11]
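The revenue growth rate quoted under Financial Performance can be checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below applies it to the two revenue figures given in the summary. The function name `cagr` is illustrative, and the choice of two compounding periods (2022 to 2024) is an assumption about how the figure was derived.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` periods apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Revenue figures as stated in the summary, in million yuan.
revenue_2022 = 45.84
revenue_2024 = 432.0

# Assumption: two compounding periods, 2022 -> 2023 -> 2024.
growth = cagr(revenue_2022, revenue_2024, years=2)
print(f"revenue CAGR 2022-2024: {growth:.1%}")
```

With these rounded inputs the result comes out at roughly 207%, close to the 208% stated in the summary. The small gap is most likely due to rounding in the quoted revenue figures.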