MindSpore

Huawei's version of CUDA is now fully open source
猿大侠· 2025-08-07 04:11
Core Viewpoint
- Huawei has announced the open-sourcing of CANN, its software toolkit for Ascend AI GPUs, aiming to strengthen its competitiveness against NVIDIA's CUDA ecosystem [1][3][7]

Group 1: Huawei's AI Strategy
- Huawei's rotating chairman, Xu Zhijun, emphasized that the core of Huawei's AI strategy is computing power, with a focus on monetizing Ascend hardware [3]
- The open-sourcing of CANN and the Mind series application enablement suite is intended to accelerate innovation among developers and make Ascend easier to use [3][12]

Group 2: CANN Overview
- CANN, a neural network computing architecture, provides multiple programming interfaces to help users build AI applications for Huawei's Ascend [4]
- CANN is described as Huawei's version of CUDA, offering a similar interface for GPU support [5]
- The latest version of CANN has been upgraded to 8.0, with both community and commercial editions available, supporting 12 operating systems [7]

Group 3: Competitive Landscape
- The announcement coincided with the emergence of a new startup, Oxmiq Labs, founded by a legendary GPU architect, which aims to create a software ecosystem similar to CUDA [6][13]
- Oxmiq Labs focuses on developing GPU hardware and software IP, providing a vertically integrated platform for AI and graphics workloads [20][22]
- Oxmiq's software stack is designed to be hardware-agnostic, allowing CUDA-based applications to run on non-NVIDIA hardware without modification [29][31]

Group 4: Developer Support
- CANN currently supports various deep learning frameworks, including PyTorch, MindSpore, and TensorFlow, enhancing its utility for developers [15]
- The open-source initiative is expected to benefit developers by providing more options and reducing dependency on NVIDIA's ecosystem [32]
Huawei's version of CUDA is now fully open source
36Kr· 2025-08-06 08:29
Core Insights
- Huawei has announced the open-sourcing of its CANN software toolkit for Ascend AI GPUs, emphasizing an AI strategy centered on computing power and the monetization of Ascend hardware [3][6][11]
- The CANN architecture, akin to NVIDIA's CUDA, provides a multi-layer programming interface to help users build AI applications specifically for Huawei's Ascend GPUs [4][6]

Group 1: Huawei's CANN Open-Sourcing
- The CANN toolkit has been upgraded to version 8.0, offering both a community edition for early feature access and a commercial edition tailored for enterprise users [6]
- CANN now supports various deep learning frameworks, including PyTorch, MindSpore, and TensorFlow, enhancing its compatibility and usability for developers [9]
- Huawei has initiated the "CANN Open Source Ecosystem Co-Building Initiative," indicating a strong commitment to developing an open ecosystem around Ascend technology [11]

Group 2: Competitive Landscape
- A new startup, Oxmiq Labs, founded by a legendary GPU architect, aims to create a software ecosystem similar to CUDA, focusing on GPU hardware and software IP [12][14]
- Oxmiq's software stack includes OXCapsule for workload management and OXPython, which allows CUDA-based applications to run on non-NVIDIA hardware without modification [21][23]
- The competitive environment is intensifying, with multiple players challenging NVIDIA's dominance in the GPU market, ultimately benefiting developers through increased options and innovation [7][23]
Benchmarking against NVIDIA's CUDA, Huawei announces the open-sourcing of CANN
Xin Lang Cai Jing· 2025-08-05 14:29
Core Insights
- Huawei announced the full open-sourcing of CANN, its Ascend hardware enablement layer, together with the Mind series application toolkits, aiming to accelerate innovation among developers and improve usability [1]
- The core of Huawei's AI strategy is computing power, with a focus on monetizing Ascend hardware [1]
- CANN serves as a bridge between AI training frameworks and Ascend chips, similar to NVIDIA's CUDA, which is a critical component of NVIDIA's competitive advantage [1][3]

Group 1: CANN and Its Ecosystem
- CANN has been upgraded to version 8.0, introducing over 200 optimized basic operators, 80 fusion operators, and 100 Ascend C APIs, cutting the typical operator development cycle from 2 person-months to 1.5 person-weeks [4]
- CANN is becoming compatible with more AI frameworks, currently supporting PyTorch, MindSpore, TensorFlow, PaddlePaddle, ONNX, and others, making it easier to migrate models and applications (a hedged migration sketch follows this summary) [5]
- Huawei is committed to layered, deep openness for CANN, providing more SDKs for application development to improve deployment convenience and efficiency in model training and inference [5]

Group 2: Competitive Landscape
- Despite these advances, CANN's usability and ecosystem richness still lag behind NVIDIA's CUDA ecosystem, built up over 18 years, indicating a long road ahead for Huawei [7]
- Huawei has adopted strategies similar to NVIDIA's early promotion of CUDA, sending engineering teams to help major clients adapt to the CANN environment [7]
- Open-sourcing CANN is a strategic move to rapidly expand the ecosystem, with industry leaders and institutions collaborating to build an open-source Ascend ecosystem [7]

Group 3: Market Position
- Huawei's self-developed AI framework MindSpore achieved a 30.26% share of China's AI framework market, ranking first in 2024 and reflecting the company's commitment to open-source initiatives [8]
- The company has previously open-sourced other foundational software, countering claims of a closed and monopolistic approach to technology development [8]
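To make the framework-compatibility point concrete, below is a minimal sketch of how an ordinary PyTorch workload is typically pointed at an Ascend device through the torch_npu adapter, which dispatches operators to CANN underneath. The package and device names follow the publicly documented Ascend PyTorch adapter but may differ across CANN/torch_npu versions; treat them as assumptions rather than a verified recipe.

```python
# Minimal sketch (assumes torch_npu, Huawei's Ascend Extension for PyTorch,
# is installed on a machine with CANN and an Ascend NPU; names may vary by version).
import torch
import torch_npu  # registers the "npu" device type with PyTorch

device = torch.device("npu:0" if torch.npu.is_available() else "cpu")

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)

with torch.no_grad():
    y = model(x)  # the matmul is executed by CANN operators on the Ascend device
print(y.shape, y.device)
```

The design point of such an adapter layer is that existing model code changes only in device selection, which is what "easier migration of models and applications" amounts to in practice.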
H20 ban lifted: the US-China closed-loop AI race begins
Hu Xiu· 2025-07-16 01:51
Group 1
- The H20 chip, previously banned by the US government, is crucial for AI model training in China and is now set to return to the market, indicating a shift in US-China tech relations [3][5][14]
- Nvidia's revenue from the H20 chip in 2024 is projected to be between $12 billion and $15 billion, accounting for approximately 85% of its revenue from China [7]
- After the ban, Nvidia suffered a loss of about $2.5 billion in sales in the first quarter, with an estimated total loss of $13.5 billion over two quarters [9][10]

Group 2
- The return of the H20 chip signifies a tactical compromise in US-China relations, with both sides adjusting their strategies rather than fully decoupling [16][17][25]
- Chinese companies have accelerated development of domestic chips, with firms like Huawei and Alibaba investing in their own technologies to reduce reliance on foreign products [11][22][34]
- The Chinese AI market has not stalled because of the H20 ban; instead, the ban prompted faster domestic alternatives, potentially threatening Nvidia's market dominance in the future [14][19][51]

Group 3
- The H20 chip's return is expected to restore supply chains and reduce costs for companies reliant on Nvidia, allowing AI projects to progress more rapidly [29][30]
- The Chinese government is encouraging the use of domestic chips in new data centers, further supporting local technology development [34]
- Despite the H20's return, some companies may still prefer Nvidia products due to their established reputation and compatibility, indicating a potential divide in corporate strategies [36][37]

Group 4
- Nvidia is likely to focus on enhancing partnerships with leading Chinese AI companies and adapting its offerings to meet local regulatory requirements [43][46]
- The competition between US and Chinese tech ecosystems is evolving, with both sides potentially developing parallel AI worlds [52][55]
- The establishment of a self-sufficient Chinese AI ecosystem could lead to a significant shift in global tech dynamics, reducing dependence on Western technologies [60][61]
US-China AI competition report: can China's AI industrial policy break through the US blockade?
36Kr· 2025-07-01 07:53
Group 1
- The core objective of China's AI policy is to establish a $100 billion AI industry by 2030, generating over $1 trillion in added value across various sectors [2]
- China's AI policies focus on enhancing economic development and national strength, contrasting with the more abstract "general AI race" narrative in the U.S. [2]
- The Chinese government is deploying a comprehensive set of policy tools, including an $8.2 billion fund for AI startups and the establishment of national AI laboratories and experimental zones [3]

Group 2
- Geopolitical tensions, particularly with the U.S., have shifted China's AI policy toward self-reliance and strategic competition, emphasizing the need for an independent AI ecosystem [6]
- U.S. export controls have restricted China's access to advanced computing chips, which are crucial for AI development, prompting Chinese companies to seek alternative strategies [7]
- Despite these challenges, the Chinese AI industry is likely to continue progressing, potentially fostering the development of its own semiconductor and software solutions [8]

Group 3
- The effectiveness of China's AI policies remains uncertain, but government support is crucial in addressing key bottlenecks such as domestic chip development and talent shortages [9]
- Data center energy demand is projected to roughly triple by 2030, a demand China is likely to meet given its faster pace of new power plant construction compared to the U.S. [9]
- The private sector, particularly innovative tech companies, is expected to drive advancements in AI, with government policies needing to align with private sector needs to be deemed effective [11]
Exclusive: how Huawei turns tens of thousands of AI servers into a "super brain" in seconds
第一财经· 2025-06-09 09:01
Core Viewpoint
- The article discusses advances in AI computing power clusters, highlighting how they enable the training and inference of large AI models through innovative technologies and fault tolerance mechanisms [1][24]

Group 1: Supernode High Availability
- AI training and inference require continuous operation; each computer in the cluster has a backup so tasks continue seamlessly during failures [3][4]
- Huawei's CloudMatrix 384 supernode employs a fault tolerance strategy spanning system-level, business-level, and operational-level fault tolerance to maintain high efficiency [3][4]

Group 2: Cluster Linearity
- The ideal for computing power clusters is linear scalability, where 100 computers provide 100 times the power of one [6]
- Huawei's task distribution algorithms keep each computer operating efficiently, akin to an orchestra, preventing chaos during large-scale model training [6][8]

Group 3: Rapid Recovery for Large-Scale Training
- The system automatically records training progress, allowing quick recovery from faults without starting over and significantly reducing downtime (a generic checkpoint/resume sketch follows this summary) [10][11]
- Innovations such as process-level rescheduling and online recovery techniques reduce recovery times to under 3 minutes [11][15]

Group 4: Fault Management and Diagnostic Capabilities
- A real-time monitoring system continuously checks the health of each computer in the cluster, enabling quick identification and resolution of issues [17]
- Huawei's comprehensive fault management solution includes error detection, isolation, and recovery capabilities, enhancing overall reliability [17][18]

Group 5: Simulation and Modeling
- Before actual training, the computing cluster can simulate scenarios in a "digital wind tunnel" to identify potential bottlenecks and optimize performance [19][20]
- The Markov modeling simulation platform allows multi-dimensional analysis and performance tuning, ensuring efficient resource allocation [19][20]

Group 6: Framework Migration
- Huawei's MindSpore framework supports seamless migration from other frameworks, covering over 90% of PyTorch interfaces and improving developer accessibility [22]
- The framework also enables quick deployment of large models, improving inference performance through integration with mainstream ecosystems [22]

Group 7: Summary and Outlook
- Huawei's innovations address high availability, linearity, rapid recovery, fault tolerance, diagnostics, simulation, and framework migration across computing power clusters [24]
- The future of computing infrastructure is expected to evolve through a collaborative cycle of application demand, hardware innovation, and engineering feedback, leading to specialized computing solutions [24]
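The "rapid recovery" idea in Group 3 rests on periodically persisting training progress so a restarted worker resumes from the last snapshot rather than from scratch. The sketch below is a generic PyTorch illustration of that mechanism under assumed names (CKPT path, save interval); it is not Huawei's actual recovery implementation, which additionally does process-level rescheduling and online repair.

```python
# Generic checkpoint/resume sketch: progress is saved every N steps so a
# restarted process continues from the last snapshot. Illustrative only.
import os
import torch

CKPT = "ckpt.pt"  # hypothetical checkpoint path
model = torch.nn.Linear(512, 512)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

start_step = 0
if os.path.exists(CKPT):                       # a restarted worker resumes here
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    opt.load_state_dict(state["opt"])
    start_step = state["step"] + 1

for step in range(start_step, 10_000):
    x = torch.randn(32, 512)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:                        # periodic snapshot of training progress
        torch.save({"model": model.state_dict(),
                    "opt": opt.state_dict(),
                    "step": step}, CKPT)
```

The reported sub-3-minute recovery times come from making the detect-restart-reload path fast, not from changing this basic save/resume contract.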
Inside Huawei's Ascend 10,000-card cluster: how do you tame the AI compute "beast"?
机器之心· 2025-06-09 04:33
Core Viewpoint
- The article discusses advances in AI computing power clusters, highlighting their critical role in supporting large-scale AI models and ensuring high availability, fault tolerance, and efficient resource management [2][4][39]

Group 1: High Availability of Super Nodes
- AI training and inference require continuous operation, similar to a hospital emergency system; each computer in the cluster has a backup that takes over on failure, keeping tasks uninterrupted [6][5]
- Huawei's CloudMatrix 384 super node employs a fault tolerance scheme covering system-level, business-level, and operational-level fault tolerance, turning faults into manageable issues [7][8]

Group 2: Cluster Linearity
- The ideal for computing power clusters is linear scalability, where the total power of 100 computers is 100 times that of one, achieved through precise task allocation algorithms [10]
- Huawei's team has developed key technologies to improve training linearity for large models, reporting 96% linearity for the Pangu Ultra 135B model on 4K cards (see the metric sketch after this summary) [11][13]

Group 3: Rapid Recovery in Large-Scale Training
- When training on thousands of computing units, the system automatically saves progress, allowing quick recovery from faults without starting over and significantly reducing downtime [14][15]
- Innovations such as process-level rescheduling and online recovery reduce recovery times to under 3 minutes, and even 30 seconds for specific faults [16][20]

Group 4: Fault Management and Diagnosis
- A real-time monitoring system continuously checks the health of each computer in the cluster, enabling quick identification and resolution of issues before they escalate [24][26]
- Huawei has developed a comprehensive fault management framework with error detection, isolation, and recovery capabilities, enhancing the reliability of the computing infrastructure [24][28]

Group 5: Simulation and Modeling
- Before deploying complex AI models, the computing cluster can simulate scenarios in a virtual environment to identify potential bottlenecks and optimize resource allocation [29][30]
- A Markov modeling simulation platform enables multi-dimensional analysis and performance prediction, improving resource efficiency and system stability [30][31]

Group 6: Framework Migration
- Huawei's MindSpore framework has evolved rapidly since its open-source launch, providing tools for seamless migration from other frameworks and improving training and inference performance [37][38]
- The framework supports a wide range of applications, enabling quick deployment of large models and improving inference capabilities [38][39]
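The article quotes linearity figures (96% for Pangu Ultra 135B on 4K cards) without defining the metric. A common definition, used here as an assumption, is measured throughput divided by the ideally linear throughput extrapolated from a single-card baseline; the numbers below are hypothetical and chosen only to show where a 96% figure could come from.

```python
# Sketch of a throughput-based cluster "linearity" metric (assumed definition,
# not necessarily the one Huawei reports). All numbers are hypothetical.
def linearity(throughput_per_card_baseline: float,
              total_throughput_at_n_cards: float,
              n_cards: int) -> float:
    ideal = throughput_per_card_baseline * n_cards      # perfect linear scaling
    return total_throughput_at_n_cards / ideal

# e.g. 1 card sustains 100 samples/s; 4096 cards together sustain ~393k samples/s
print(f"{linearity(100.0, 393_216.0, 4096):.2%}")        # -> 96.00%
```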
Crossing the supply-demand gap of the intelligent computing era: how Huawei frames and solves the problem
Sou Hu Cai Jing· 2025-05-31 20:41
Core Insights
- The emergence of DeepSeek has significantly elevated the intelligent computing industry, illustrating the "Jevons Paradox": as technology makes computation cheaper and more efficient, total demand for it grows rather than shrinks [1]
- The cost of model training has dropped by 85% over the past three years, while the elasticity of computing power demand has expanded sixfold, making AI technology accessible to more enterprises [1]
- By the end of 2024, China's intelligent computing (AI) power supply is expected to reach 1450 EFlops, with projected growth of over 40% annually for the next three years [1]

Group 1: Challenges in Intelligent Computing
- The first major challenge is the exponential growth in computing power demand driven by large models, which requires over 200 times the hardware supply [5]
- The second is the rapid penetration of AI across industries, which makes it hard to integrate AI technology with specific scenarios, as many emerging applications lack best practices [6]
- The third concerns the ecosystem: developers face fragmented tools, high learning costs, and uneven access to resources, complicating collaboration between traditional enterprises and AI technology suppliers [7]

Group 2: Strategic Innovations and Solutions
- Huawei emphasizes a comprehensive transformation spanning computing architecture, resource scheduling, business models, and computing infrastructure to address the supply-demand contradiction [3]
- Huawei is committed to a long-term strategy that supports the intelligent computing industry through foundational infrastructure, AI ecosystem development, and product empowerment [15][16]
- The company has built a complete solution for the MoE architecture, raising resource utilization by 20% through dynamic balancing across experts (a toy routing sketch follows this summary) [11]

Group 3: Ecosystem Development and Collaboration
- Huawei's strategy includes hardware openness, software openness, partner enablement, and talent development, aiming to create a collaborative AI industry ecosystem [12][14]
- The company has partnered with over 2,500 industry collaborators and developed more than 5,800 certified solutions, demonstrating its commitment to ecosystem building [14]
- Huawei's focus on synergizing computing and networking technologies positions it to address the challenges of intelligent computing and improve overall performance [20]
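For readers unfamiliar with why MoE workloads need "dynamic balancing across experts": a router assigns tokens to experts, and uneven assignment leaves some devices idle while others are overloaded. The toy sketch below only shows how per-expert load and overflow against a capacity limit can be measured; it is purely illustrative and is not Huawei's MoE scheduling solution.

```python
# Toy illustration of MoE expert load measurement (not Huawei's implementation).
import torch

def route_with_capacity(logits: torch.Tensor, num_experts: int, capacity: int):
    """logits: [tokens, num_experts] router scores; returns routing, load, overflow."""
    expert_choice = logits.argmax(dim=-1)                    # top-1 routing decision
    load = torch.bincount(expert_choice, minlength=num_experts)
    overflow = (load - capacity).clamp(min=0)                # tokens above each expert's capacity
    return expert_choice, load, overflow

tokens, experts = 1024, 8
logits = torch.randn(tokens, experts)                        # stand-in for a trained router
choice, load, overflow = route_with_capacity(logits, experts, capacity=136)
print("per-expert load:", load.tolist())
print("tokens to drop or re-route per expert:", overflow.tolist())
```

A dynamic balancer would act on these statistics, for example by re-routing overflow tokens or replicating hot experts, which is where the reported utilization gains would come from.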
Jensen Huang worries about an awakening Chinese market
36Kr· 2025-05-08 03:02
Core Insights
- The Milken Institute Global Conference focuses on addressing urgent global challenges; this year's theme, "Driving a Prosperous World," emphasizes artificial intelligence and renewable resources [1][2]

Group 1: AI Industrial Revolution
- The concept of an "AI Industrial Revolution" is introduced, signaling a complete restructuring of production systems and a redefinition of human value [3]
- AI is seen as a digital workforce and a mass-manufacturable industrial product, reshaping enterprise operations and introducing a "dual factory" model [4][10]

Group 2: Dual Factory Model
- Traditional factories produce tangible goods, while AI factories rely on GPU clusters, data centers, and computational resources to produce "intelligent units," or tokens [5][7][9]
- Tokens serve as the digital fuel for future products, enabling applications such as autonomous driving and customized financial analysis [7][16]

Group 3: Investment in AI Factories
- Building an AI factory requires significant investment; Nvidia's AI factory needs 1 gigawatt of power and costs approximately $60 billion [11][12]
- The investment goes primarily into hardware, including GPUs, data centers, and energy infrastructure, requiring substantial resources and planning [13][14]

Group 4: Global Economic Impact
- The establishment of AI factories is expected to reshape the global economic landscape, with predictions of over $2 trillion in investment over the next decade [14][19]
- Countries that develop AI factories will gain control over smart pricing and standard-setting, influencing global industry upgrades [19][20]

Group 5: Market Dynamics and Competition
- Losing the Chinese market could cost American companies significant technological leadership, allowing Chinese firms to establish their own standards and frameworks [21][22]
- A bifurcated global AI ecosystem could emerge, with distinct "American" and "Chinese" technology spheres [22][23]

Group 6: Future of Global Supply Chains
- Adoption of Chinese AI standards could reconfigure global supply chains, with companies needing to comply with these standards to access the Chinese market [26][29]
- This shift may create dependencies on Chinese technology, affecting manufacturing and data management practices worldwide [30][31]

Group 7: Economic Power Shift
- The rise of a "token economy" could challenge the dominance of the US dollar in international trade, as tokens may influence transaction pricing [31][32]
- A new economic order based on AI capabilities and production capacity is possible, with countries competing for dominance in AI production [33][34]
Huawei's Guo Zhenxing: after the DeepSeek wave, AI will rapidly unleash huge manufacturing productivity dividends | Frontline
36Kr· 2025-04-30 09:48
Group 1
- Huawei hosted the AI + Manufacturing Industry Summit 2025 in Guangzhou, focused on accelerating industry intelligence, with over 900 attendees from various manufacturing sectors [1]
- Huawei introduced a "three-layer, five-step, eight-phase" methodology and shared 20 solutions across seven key manufacturing scenarios [1]
- The company emphasized its full-stack AI infrastructure, which adapts flexibly to multiple manufacturing scenarios and lowers the threshold for AI adoption [1]

Group 2
- In the automotive sector, Huawei's collaboration with GAC Group has cut the vehicle development cycle from 36 months to 18 months using AI models and development toolchains [1]
- Huawei's own software development cycle has improved from 9-18 months to one month per release by integrating over 13 million high-value documents and 850+ open-source code repositories into its data platform [2]
- By 2025, over 300 enterprises are expected to have plans for large model deployment, indicating surging demand for AI capabilities in manufacturing [2]

Group 3
- Huawei has adapted its DeepSeek solution to scenarios including pre-training and reinforcement learning, helping clients complete secondary training quickly [3]
- The company has optimized the performance of various models in the Ascend environment, with over 100 manufacturing partners already using the DeepSeek solution [3]
- Huawei aims to provide end-to-end, full-stack infrastructure to support enterprises' digital transformation by focusing on data management and intelligent connectivity [3]