Cracking the Compute Bottleneck for Large Models? A Domestic GPU Maker Answers with the "AI Factory"
半导体行业观察 (Semiconductor Industry Observation) · 2025-07-28 01:32
Core Viewpoint
- The rapid development of artificial intelligence (AI) has made AI chips a global talking point. NVIDIA dominates the market on the strength of its GPUs, which has pushed its financial results and market capitalization to record highs. AMD's CEO predicts that the market for AI and large computing system accelerators will exceed $500 billion within a few years [1]

Group 1: Full-Function GPU Development
- The evolution of computing power is closely tied to the development of full-function GPUs, which have grown from single-purpose graphics cards into versatile processors that support a wide range of applications, including AI [2]
- A full-function GPU has four core engines: AI computing acceleration, modern 3D graphics rendering, physical simulation and scientific computing, and ultra-high-definition video encoding and decoding [3]

Group 2: Moore Threads' Innovations
- Moore Threads, established in 2020, has built a complete computing acceleration stack. It has launched four generations of GPU architectures and intelligent SoC products covering AI computing, professional graphics acceleration, and desktop graphics acceleration [5]
- The company aims to build an "AI factory" that makes the production of advanced models more efficient, addressing the bottlenecks in large-model training for the AGI era [6]

Group 3: AI Factory Efficiency
- The efficiency of the "AI factory" is determined by five core elements: generality of accelerated computing, effective chip computing power, single-node efficiency, cluster efficiency, and cluster stability [7]
- Moore Threads emphasizes that full-function GPUs and support for the full range of numerical precisions are both needed for highly efficient AI model training [9]

Group 4: Technical Breakthroughs
- The self-developed MUSA architecture significantly improves resource utilization and lowers the development cost of new chips, and delivers a 30% performance increase in Transformer computing [11]
- Innovations in the memory system and in communication save 50% of bandwidth and cut latency by 60%, raising the effective computing power of a single chip [12]

Group 5: Cluster Solutions
- The KUAE cluster, built on full-function GPUs, is intended as a complete system-level solution for large-scale GPU computing, supporting more than 1,000 computing nodes with ultra-low communication latency [17]
- The KUAE cluster incorporates technologies that improve training efficiency and stability, keeping effective training time above 99% [19]

Group 6: Industry Applications
- Moore Threads' full-function GPUs are driving innovation in sectors including physical simulation, AIGC, scientific computing, and intelligent manufacturing, with a vision of empowering developers and serving multiple industries [21][25]
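
The five efficiency elements listed under Group 3 can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each element is scored as a fraction between 0 and 1 and that the elements combine multiplicatively, so a weakness in any one of them drags down the whole. The summary above does not state how the elements are combined, and the class name, field names, and sample values below are hypothetical.

```python
from dataclasses import dataclass, astuple
from math import prod


@dataclass(frozen=True)
class FactoryFactors:
    """Five efficiency elements, each scored in the range (0, 1].

    Assumption: the scores multiply to give an overall efficiency.
    This is an illustrative model, not a formula taken from the summary.
    """
    generality: float              # generality of accelerated computing
    chip_effective_compute: float  # effective chip computing power
    node_efficiency: float         # single-node efficiency
    cluster_efficiency: float      # cluster efficiency
    cluster_stability: float       # cluster stability (e.g. effective training time)

    def overall(self) -> float:
        values = astuple(self)
        for v in values:
            if not 0.0 < v <= 1.0:
                raise ValueError(f"factor out of range (0, 1]: {v}")
        return prod(values)


# Hypothetical scores: every element at 90%.
baseline = FactoryFactors(0.9, 0.9, 0.9, 0.9, 0.9)
# Same system, but with stability raised to 99% effective training time.
improved = FactoryFactors(0.9, 0.9, 0.9, 0.9, 0.99)

print(round(baseline.overall(), 4))  # 0.5905
print(round(improved.overall(), 4))  # 0.6495
```

Under this assumed model, five elements at 90% each yield only about 59% overall, which is one way to read the article's emphasis on improving every layer from the chip to the cluster rather than a single component.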