Six 90 Percents! Shenzhen Enterprises' Position as the Main Force of Innovation Grows Ever More Solid
Sou Hu Cai Jing· 2025-08-07 20:51
Core Insights
- The article highlights Shenzhen's strong emphasis on enterprise-driven innovation, showcasing the city's ability to produce significant technological advancements through local companies [2][5][10]

Group 1: Innovation Achievements
- Huawei showcased its Ascend 384 super node at the World Artificial Intelligence Conference, achieving the industry's largest-scale 384-card high-speed bus interconnection and enhancing resource scheduling efficiency [1]
- Shenzhen enterprises such as Stardust Intelligent, maker of a humanoid robot, demonstrate advanced capabilities, including high-speed, high-precision performance in various applications [2]
- The city is home to over 2,600 AI companies, with notable models such as Huawei's Pangu and Tencent's comprehensive technology architecture supporting a wide range of AI applications [3]

Group 2: R&D Investment
- Shenzhen's high-tech enterprises have reached 25,000, an average density of 12 per square kilometer, the highest in the country [5][10]
- In 2024, Huawei's R&D investment was 179.7 billion yuan, accounting for 20.8% of its revenue, while Tencent invested 70.7 billion yuan, focusing on AI and cloud computing [6]
- Shenzhen has led the nation in PCT international patent applications for 20 consecutive years, with enterprises contributing 76% of the city's filings [6]

Group 3: Industrial Support
- Shenzhen's complete industrial chain enables rapid assembly and production, such as assembling a 3D printer in 2 minutes and taking a drone from concept to mass production in 3 months [8][10]
- The city has transformed its traditional electronics market into an innovation incubator, providing comprehensive supply chain services for hardware entrepreneurs [8]
- Shenzhen's collaborative environment enables efficient innovation, with market demand directly guiding R&D efforts [8]
After Huawei Pangu Denied Copying Alibaba, One of Its Large-Model Employees Came Forward About Shell-Wrapping, Continued Training, and Watermark Washing
Qi Lu Wan Bao· 2025-07-07 03:50
Core Insights
- Huawei announced the open-source release of its Pangu models, comprising a 7 billion parameter dense model and the 72 billion parameter Pangu Pro MoE mixture-of-experts model, as a key initiative for building the Ascend ecosystem [1]
- A GitHub study found a similarity of 0.927 in attention parameter distributions between Pangu Pro MoE and Alibaba's Qwen-2.5 14B model, far above the typical cross-model level of below 0.7 [1]
- Huawei's Noah's Ark Lab clarified that Pangu Pro MoE was developed and trained on its Ascend hardware platform and was not based on other vendors' models [4]
- An employee from the Pangu team disclosed instances of shell-wrapping (repackaging another vendor's model), continued training, and watermark washing, pointing to unethical practices within the team [5]

Summary by Sections

Open Source Announcement
- Huawei's open-source release of Pangu Pro MoE is seen as a significant step for the Ascend ecosystem [1]
- The release includes a 7 billion parameter dense model and the 72 billion parameter mixture-of-experts model [1]

Research Findings
- A GitHub analysis indicated a similarity of 0.927 between Pangu Pro MoE and Alibaba's Qwen-2.5 model, well above the industry norm [1]

Company Statements
- Huawei's Noah's Ark Lab stated that Pangu Pro MoE was developed on its own hardware and not through incremental training of other vendors' models [4]
- The team acknowledged that some code implementations referenced industry open-source practices [4]

Internal Disclosures
- The employee's account included using Qwen 1.5 110B for continued training and efforts to remove watermarks [5]
- The employee cited pressure from leadership and internal doubts as factors behind these practices [5]
- The employee left the company over ethical concerns and decided to expose the practices [5]

Current Status
- As of publication, Huawei had not issued a statement regarding the employee's disclosures [6]
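The GitHub study compared distributional statistics of attention weights between the two models. A minimal sketch of that style of fingerprint comparison follows; the study's exact statistic and layer set are not reproduced here, and the per-matrix standard deviation, Pearson correlation, and toy weights below are all illustrative assumptions:

```python
import numpy as np

def attn_param_signature(layers):
    """Concatenate per-matrix standard deviations of each layer's
    Q/K/V/O projection weights into one fingerprint vector."""
    return np.array([np.std(w) for layer in layers for w in layer])

def fingerprint_similarity(layers_a, layers_b):
    """Pearson correlation between two models' fingerprints."""
    a, b = attn_param_signature(layers_a), attn_param_signature(layers_b)
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
# Toy stand-in: 3 layers x (Q, K, V, O) matrices with layer-dependent scale.
model_a = [[rng.normal(0.0, 0.02 * (i + 1), size=(8, 8)) for _ in range(4)]
           for i in range(3)]
# A derivative of model_a keeps almost the same scale profile ...
derivative = [[w + rng.normal(0.0, 0.001, size=w.shape) for w in layer]
              for layer in model_a]
# ... while an independently initialized model generally does not.
independent = [[rng.normal(0.0, 0.02 * (3 - i), size=(8, 8)) for _ in range(4)]
               for i in range(3)]

print(fingerprint_similarity(model_a, derivative))    # close to 1
print(fingerprint_similarity(model_a, independent))   # markedly lower
```

A high correlation of such fingerprints is circumstantial evidence of shared lineage, not proof; that caveat is part of why the claim remained contested.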
Huawei's Pangu Model Open-Sourced for the First Time! A Single Ascend Card Outputs 1,148 Tokens per Second; 16B Activated Parameters Hold Their Own Against 32B Dense Models
量子位· 2025-07-02 09:33
Core Viewpoint
- Huawei's Pangu Pro MoE model has been open-sourced, featuring 72 billion parameters and performance competitive with 32 billion parameter dense models in both Chinese and English understanding and reasoning [1][8]

Model Performance
- Pangu Pro MoE has 72 billion total parameters, of which 16 billion (22.2%) are activated [8]
- Across benchmarks, Pangu Pro MoE performs comparably to 32 billion parameter dense models, posting notable scores on MMLU and DROP [9][11][12]
- It scored 82.6 on MMLU-PRO, surpassing other models, and 91.1 on C-Eval for Chinese tasks, outperforming Qwen3-32B [10][12]

Inference Efficiency
- With W8A8 quantization, the model averages an input throughput of 4,828 tokens per second on a single card, a 203% improvement over a 72 billion and 42% over a 32 billion parameter dense model [17]
- In the decode phase, it reaches an output throughput of 1,148 tokens per second, outperforming both 72 billion and 32 billion parameter dense models [19]

Architecture Innovations
- Pangu Pro MoE introduces a new MoE architecture optimized for Ascend chips, using a Mixture of Grouped Experts (MoGE) approach to achieve load balancing across devices [22][24]
- The training and inference infrastructure has been specifically adapted to the Ascend cluster, enhancing communication efficiency and reducing overhead [30][32]

Quantization and Optimization
- The model employs expert-aware post-training quantization and KV cache compression to optimize inference efficiency while maintaining accuracy [37][38]
- Operator fusion techniques improve memory bandwidth utilization, significantly accelerating attention operations [39][41]

Technical Reports and Resources
- Technical reports in both Chinese and English detail the model's architecture and performance metrics [4][45]
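The load-balancing idea behind MoGE can be shown in a few lines: instead of a global top-k over all experts, the router takes a fixed top-k inside each equal-sized expert group (one group per device), so every device activates the same number of experts per token. A minimal sketch under that reading; the function name, shapes, and routing details below are illustrative, and the production kernels described in the report are far more involved:

```python
import numpy as np

def moge_route(logits, n_groups, k_per_group):
    """Mixture of Grouped Experts (MoGE) routing sketch.

    Experts are split into equal groups, one group per device.  The
    router takes the top-k *within each group*, so every device hosts
    exactly k active experts per token -- balanced by construction,
    unlike a global top-k, which can pile load onto one device.
    """
    n_tokens, n_experts = logits.shape
    group_size = n_experts // n_groups
    grouped = logits.reshape(n_tokens, n_groups, group_size)

    # Mark the k largest logits inside every group.
    top_idx = np.argsort(grouped, axis=-1)[..., -k_per_group:]
    selected = np.zeros_like(grouped, dtype=bool)
    np.put_along_axis(selected, top_idx, True, axis=-1)
    selected = selected.reshape(n_tokens, n_experts)

    # Softmax over the selected experts only.
    masked = np.where(selected, logits, -np.inf)
    exp = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

# 2 tokens, 8 experts in 4 groups, 1 expert per group => 4 active each.
logits = np.random.default_rng(0).normal(size=(2, 8))
weights = moge_route(logits, n_groups=4, k_per_group=1)
print((weights > 0).sum(axis=1))  # [4 4]
```

With 16 of 72 billion parameters activated, the per-token compute matches a much smaller dense model, which is what makes the throughput figures above possible.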
MoE So Big, Yet a Few Snippets of Code Run Inference Rock-Steady | Open Source
量子位· 2025-07-02 09:33
Jin Lei, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

The mixture-of-experts (MoE) architecture has become a mainstream choice for today's large models. Take the recently open-sourced Pangu Pro MoE as an example: built on the MoGE architecture, it has 72 billion total parameters and 16 billion activated parameters, is specifically optimized for Ascend hardware, and stands out in both performance and efficiency.

Pangu also achieves inference that is both fast and stable.

Technically, the Pangu model introduces a dual "fast thinking" and "slow thinking" system that automatically switches response modes based on problem complexity, and it delivers a breakthrough in inference performance: single-card inference throughput on the Ascend 800I A2 reaches 1,148 tokens/s, which speculative-decoding acceleration can raise to 1,528 tokens/s, significantly outperforming dense models of comparable scale.

So is there a technical project that lets open-source MoE models such as Pangu, DeepSeek, and Qwen run inference on Ascend hardware in a way that is easy to maintain, high-performance, and fully open source?

Now that question seems to have a standard answer: a new Huawei project has open-sourced the architecture, techniques, and code behind inference for ultra-large-scale MoE, all at once!

The new open-source project is called Omni-Infer, and overall it is very good news for enterprise users. For example, it can provide enterprises with a prefill-decode (PD) disaggregated deployment solution, perform system-level optimization targeting QPM, and share large-scale commercial ...
Tencent Research Institute AI Express 20250701
腾讯研究院· 2025-06-30 15:51
Group 1: OpenAI Custom Services
- OpenAI has launched a custom AI consulting service starting at ten million dollars, with engineers assisting clients in model fine-tuning and application development [1]
- The U.S. Department of Defense (a contract worth $200 million) and Singapore's Grab are among the first clients, with services extending to military strategy and map automation [1]
- The move puts OpenAI in competition with consulting firms like Palantir and may threaten smaller startups focused on specific AI applications [1]

Group 2: Gemini 2.5 Pro API
- The Gemini 2.5 Pro API has returned to free usage, offering five requests per minute, 250,000 tokens per minute, and 100 requests per day [2]
- Users can obtain an API key by logging into Google AI Studio, creating the key, and saving it; the usage restrictions are more lenient than those of OpenAI's o3 model [2]
- The API can be accessed through third-party clients such as Cherry Studio or Chatbox, supporting text Q&A, image analysis, and built-in internet search [2]

Group 3: LeCun's PEVA World Model
- LeCun's team has released the PEVA world model, achieving coherent scene prediction for 16 seconds and giving embodied agents human-like predictive capabilities [3]
- The model combines 48-dimensional human joint kinematics with conditional diffusion Transformers, trained on first-person-perspective videos and full-body pose trajectories [3]
- PEVA demonstrates intelligent planning, selecting the best of multiple action options for complex tasks and outperforming baseline models by over 15% [3]

Group 4: Huawei's Open-Source Models
- Huawei has open-sourced two large models: the 72 billion parameter mixture-of-experts model "Pangu Pro MoE" and the 7 billion parameter dense model "Pangu Embedded 7B" [4][5]
- Pangu Pro MoE was trained on 4,000 Ascend NPUs, with 16 billion activated parameters, achieving performance comparable to Qwen3-32B and GLM-Z1-32B, and single-card inference throughput reaching 1,528 tokens/s [5]
- Pangu Embedded 7B employs a dual-system "fast thinking" and "slow thinking" architecture, switching automatically based on task complexity and outperforming similarly sized models such as Qwen3-8B and GLM4-9B [5]

Group 5: Baidu's Wenxin Model 4.5 Series
- Baidu has officially open-sourced the Wenxin model 4.5 series, launching ten models ranging from a 47 billion parameter mixture-of-experts model to a 0.3 billion parameter lightweight model, along with API services [6]
- The series adopts the Apache 2.0 open-source license and introduces a multimodal heterogeneous model structure, enhancing multimodal understanding while maintaining strong performance on text tasks [6]
- The models have been benchmarked against DeepSeek-V3 and are supported by the ERNIEKit development suite and FastDeploy deployment suite [6]

Group 6: Zhihu's Knowledge Base Upgrade
- Zhihu has significantly upgraded its knowledge base, allowing public subscription and link sharing and integrating deeply with community content for an immersive reading experience [7]
- Capacity has expanded to 50GB, with support for uploading various file formats and more exposure surfaces such as knowledge squares and personal homepages [7]
- Zhihu has launched an incentive program encouraging users to create and share vertical knowledge bases, with awards for "most valuable" and "prompt creativity," running until July 18 [7]

Group 7: EVE 3D AI Companion
- EVE is a 3D AI companion application with gamified elements, a favorability system, and interactive features, creating a strong sense of "human-like" presence and proactivity [8]
- The AI can perform cross-dimensional interactions, such as having milk tea delivered to users' homes and creating personalized songs, blurring the line between virtual and real experiences [8]
- EVE enhances the companionship experience through detailed expressions (emojis, trending topics) and a memory system, representing a significant step forward in AI entertainment [8]

Group 8: Apple's XR Devices
- Apple is reportedly developing at least seven head-mounted devices, including three Vision series devices and four AI glasses; the first AI glasses are expected to launch in Q2 2027, targeting annual shipments of 3 to 5 million units [10]
- The lightweight Vision Air is expected to enter mass production in Q3 2027, over 40% lighter than the Vision Pro and significantly cheaper, while XR glasses with display features are expected by late 2028 [10]
- These devices are expected to ignite the AI glasses market, with sales potentially exceeding 10 million units [10]

Group 9: Insights from Iconiq Capital's AI Report
- A survey of 300 AI companies indicates a shift from conceptual hype to practical deployment, with OpenAI and Claude leading enterprise AI selection and nearly 90% of high-growth startups deploying intelligent agents [12]
- AI spending shows data storage and processing costs far exceeding training and inference, with companies moving from traditional subscriptions to usage-based hybrid pricing [12]
- Among AI-native companies, 47% have reached critical scale versus only 13% of AI-enhanced companies; 37% of rapidly growing companies focus on AI, making coding agents the leading productivity application [12]
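For readers who prefer to skip third-party clients, the same free-tier key can be used against Google's public REST endpoint directly. A minimal sketch that only assembles the request and makes no network call; the model name and key are placeholders, and the payload shape follows the published `generateContent` REST format:

```python
import json

API_KEY = "YOUR_AI_STUDIO_KEY"   # placeholder: create one in Google AI Studio
MODEL = "gemini-2.5-pro"

def build_generate_request(prompt: str):
    """Return the endpoint URL and JSON body for a text-only
    generateContent call."""
    url = ("https://generativelanguage.googleapis.com/v1beta/"
           f"models/{MODEL}:generateContent?key={API_KEY}")
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return url, json.dumps(body)

url, body = build_generate_request("Explain MoE routing in one paragraph.")
# Send with any HTTP client, staying under the quoted free-tier limits
# (5 requests/min, 250,000 tokens/min, 100 requests/day).
print(url.split("?")[0])
```

Keeping the key in an environment variable rather than source code is the usual practice when moving beyond a sketch like this.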
Just In: Huawei Pangu Model 5.5 Debuts! A Burst of Reasoning and Agent Capabilities
机器之心· 2025-06-20 11:59
Core Viewpoint
- Huawei's Pangu model series emphasizes practical applications across industries, focusing on intelligent upgrades and achieving significant market recognition through its iterations from Pangu 1.0 to Pangu 5.0 [2][3]

Group 1: Pangu Model 5.5 Release
- Huawei officially launched Pangu Model 5.5 at HDC 2025, showcasing advanced natural language processing (NLP) capabilities and pioneering achievements in multimodal models [3][5]
- The upgraded Pangu 5.5 includes five foundational models targeting NLP, multimodal, prediction, scientific computing, and computer vision (CV), positioning itself as a core driver of industry digital transformation [4][46]

Group 2: NLP Models
- Pangu 5.5 features three main NLP models: Pangu Ultra MoE, Pangu Pro MoE, and Pangu Embedded, along with an efficient reasoning strategy and the DeepDiver product [7]
- Pangu Ultra MoE is a near-trillion parameter model with 718 billion parameters, achieving domestic leadership and international competitiveness through innovative training methods [9][10]
- Pangu Pro MoE, with 72 billion parameters, ranked first domestically among models under 100 billion parameters on the SuperCLUE leaderboard, demonstrating its effectiveness in intelligent tasks [18][20]
- Pangu Embedded, a 7 billion parameter model, excels in knowledge, coding, mathematics, and dialogue, outperforming contemporaneous models [27][32]

Group 3: Technological Innovations
- Huawei introduced adaptive fast-slow thinking technology in the Pangu models, matching the solving effort to problem complexity and improving reasoning efficiency by up to 8 times [35]
- The DeepDiver model strengthens higher-order capabilities such as autonomous planning and exploration, achieving significant efficiency on complex question-answering tasks [41][44]

Group 4: Other Model Applications
- Pangu 5.5 also includes models for scientific computing, industrial prediction, and computer vision, showcasing its versatility and potential for transformative applications across sectors [46]
- The scientific computing model collaborates with the Shenzhen Meteorological Bureau to improve weather forecasting accuracy through AI integration [47]
- The CV model, with 30 billion parameters, supports diverse visual data analysis and decision-making, significantly enhancing operational capabilities in industrial scenarios [47]
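The adaptive fast-slow thinking described above amounts to a dispatcher in front of two response modes. A minimal sketch of the idea; the `estimate_complexity` heuristic, its keyword list, and the 0.5 threshold are illustrative assumptions, not Huawei's actual switching criterion:

```python
def estimate_complexity(prompt: str) -> float:
    """Crude complexity proxy (assumption: an illustrative heuristic):
    longer prompts and reasoning-heavy keywords push the score up."""
    keywords = ("prove", "derive", "optimize", "step by step", "why")
    score = min(len(prompt) / 200, 1.0)
    score += 0.5 * sum(k in prompt.lower() for k in keywords)
    return score

def respond(prompt: str) -> str:
    """Dispatch to a fast (direct answer) or slow (deliberate
    multi-step reasoning) mode, mirroring the dual-system idea."""
    if estimate_complexity(prompt) < 0.5:
        return f"[fast] direct answer to: {prompt}"
    return f"[slow] deliberate multi-step reasoning for: {prompt}"

print(respond("What is the capital of France?"))
print(respond("Prove that the sum of two even numbers is even."))
```

The point of the design is that easy queries skip the expensive deliberation path entirely, which is where the claimed up-to-8x efficiency gain comes from.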
Ascend's Trillion-Scale Model Validates Domestic AI Infrastructure! STAR Market AI ETF (588930) Up 0.54%, Real-Time Turnover Tops 32 Million Yuan
Mei Ri Jing Ji Xin Wen· 2025-06-04 02:44
Group 1
- The core viewpoint of the news highlights the launch of a new AI model, Pangu Ultra MoE, with a parameter scale of 718 billion, showcasing advancements in China's AI infrastructure and innovation capabilities [1]
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, achieves performance comparable to models with hundreds of billions of parameters through innovative dynamic activation of expert networks [1]
- The latest SuperCLUE ranking positions the Pangu model as the top domestic model in the sub-hundred-billion parameter category, reinforcing confidence in China's AI industry development [1]

Group 2
- The STAR Market AI ETF (588930) tracks an index of 30 leading AI companies across electronics, computing, machinery, home appliances, and communications, with the top five constituents accounting for 47% of the index weight [2]
- According to China International Capital Corporation (CICC), the domestic AI sector presents significant investment value, particularly in the rapidly growing AI companionship segment, where Chinese companies hold unique advantages in product strength, technology iteration, and market expansion [2]
- China boasts a leading digital talent pool, 17% of the global total in 2023 and 1.5 times that of the United States, providing solid support for AI companionship application development [2]
AI and Robotics Pre-Market Express | Zhiyuan's Robot Wins Simultaneous Certification in China, the US, and Europe; Huawei Launches the Near-Trillion Model Pangu Ultra MoE
Mei Ri Jing Ji Xin Wen· 2025-06-03 01:09
Market Overview
- On May 30, 2025, the Sci-Tech AI ETF Huaxia (589010) fell 2.08%, with Zhonghuai Technology leading the decline at 7.44%, followed by Zhongke Xingtou at 7.31% and Youkede at 3.99% [1]
- The Robot ETF (562500) fell 2.01%, with Tianhuai Technology again leading the drop at 7.44%, followed by Kelaimechat at 6% and Dongjie Intelligent at 5.99% [1]
- The day's trading volume was 537 million yuan, the highest among ETFs in its category, with a turnover rate of 4.21% [1]

Key Developments
- Zhiyuan's Expedition A2 humanoid robot has received China CR, EU CE-MD, EU CE-RED, and US FCC certifications, making it the first humanoid robot globally certified in all three regions and the first in China to obtain CR and CE-MD certifications [1]
- Lingqiao Intelligent launched three new dexterous-hand products, including the high-dexterity DexHand021 Pro and two budget-friendly offerings, the DexHand021 S and the DexCap data acquisition system [1]
- Lingqiao Intelligent was established in January 2024, incubated by the Shanghai AI Research Institute, and is headquartered in Xinchang, Zhejiang [1]

Institutional Insights
- CITIC Securities noted that the humanoid robot sector is differentiating, with previously high-performing stocks correcting; the market is focusing on lower-valuation, safer opportunities in embodied intelligence applications [3]
- The recommendation is to explore "AI + Robotics" opportunities beyond humanoid robots, including sensors, dexterous hands, robotic dogs, and exoskeleton robots [3]

Popular ETFs
- The Robot ETF (562500) is the only ETF in the market with a scale exceeding 10 billion yuan, offering the best liquidity and comprehensive coverage of China's robotics industry [4]
- The Sci-Tech AI ETF Huaxia (589010) is described as the "brain" of robotics, with a 20% daily price limit and the ability to capture "singularity moments" in the AI industry [4]
Blockbuster! Huawei Releases a Near-Trillion Parameter Model
Mei Ri Jing Ji Xin Wen· 2025-05-30 11:41
Core Insights
- Huawei has launched a new model called Pangu Ultra MoE, with a parameter scale of 718 billion, marking a significant advance in MoE model training on the Ascend AI computing platform [1][3][6]
- The release of Pangu Ultra MoE and the Pangu Pro MoE series demonstrates Huawei's capability to keep the entire training process on domestic computing power and models fully under its own control, validating the innovation capacity of China's AI infrastructure [3][6]

Model Architecture and Training Innovations
- The Pangu team introduced innovative designs in model architecture and training methods to address the challenges of training ultra-large-scale, highly sparse MoE models, achieving stable training on the Ascend platform [1][4]
- Key innovations include the Depth-Scaled Sandwich-Norm (DSSN) architecture and the TinyInit initialization method, which enabled long-term stable training on over 18TB of data [4]
- The EP loss load-balancing optimization ensures better load balancing among experts and strengthens their specialization [4]

Performance and Efficiency Improvements
- The training methods disclosed by Huawei enable efficient integration of large sparse MoE reinforcement learning (RL) post-training frameworks on Ascend CloudMatrix 384 supernodes [5]
- Recent upgrades improved pre-training system performance, raising model FLOPs utilization (MFU) from 30% to 41% [5]
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, has demonstrated performance comparable to larger models, ranking first among domestic models under 100 billion parameters on the SuperCLUE leaderboard [5]

Industry Implications
- The successful training and optimization of ultra-large-scale sparse models on domestic AI platforms marks a closed loop of "full-stack domestication" and "fully controllable processes," from hardware to software and from research to engineering [6]
- This advancement provides a strong foundation for the development of China's AI industry, reinforcing confidence in domestic AI capabilities [3][6]
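As a rough picture of what a sandwich norm does, a minimal sketch follows. Assumptions: RMSNorm as the normalizer and the depth-dependent residual scale `alpha` as one plausible reading of "depth-scaled"; the exact DSSN formulation is given in Huawei's technical report:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    """RMS normalization along the feature dimension."""
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

def sandwich_block(x, sublayer, n_layers):
    """Sandwich norm: normalize the sublayer's input AND its output
    before the residual add, so one misbehaving sublayer cannot blow
    up the residual stream.  The depth-scaled factor alpha (an
    assumption here) shrinks each layer's update as depth grows."""
    alpha = 1.0 / np.sqrt(2.0 * n_layers)
    return x + alpha * rms_norm(sublayer(rms_norm(x)))

# Toy check: even with a sublayer that doubles activations, the
# residual stream stays bounded across 64 stacked blocks.
x = np.random.default_rng(0).normal(size=(4, 16))
for _ in range(64):
    x = sandwich_block(x, lambda h: 2.0 * h, n_layers=64)
print(bool(np.isfinite(x).all()))
```

Bounding every layer's contribution in this way is what makes month-long runs on 18TB of data stable: no single layer can push activations into a divergent range.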
Major Breakthrough! Huawei Just Announced!
券商中国· 2025-05-30 10:43
Core Viewpoint
- Huawei's launch of the Pangu Ultra MoE model, with a parameter scale of 718 billion, signifies a major advance for China's AI industry, showcasing an independent and controllable training process on domestic computing platforms [1][4]

Group 1: Breakthroughs in Domestic Computing and Models
- Training ultra-large-scale, highly sparse MoE models is challenging, but Huawei's Pangu team innovatively designed the model architecture and training methods to achieve stable training on the Ascend platform [2]
- The Pangu team introduced the Depth-Scaled Sandwich-Norm (DSSN) architecture and TinyInit initialization method, enabling long-term stable training on over 18TB of data [2]
- The EP loss optimization method ensures load balancing among experts and strengthens their specialization, while Pangu Ultra MoE employs advanced MLA and MTP architectures to balance model performance and efficiency [2][3]

Group 2: Training Method Innovations
- Huawei disclosed key technologies that enable efficient training of large sparse MoE models on Ascend CloudMatrix 384 supernodes, marking the transition of reinforcement learning (RL) post-training frameworks into the supernode-cluster era [3]
- Recent upgrades to the pre-training system raised MFU on large clusters from 30% to 41% [3]
- The Pangu Pro MoE model, with 72 billion parameters and 16 billion active parameters, rivals larger models through innovative dynamic activation of expert networks [3]

Group 3: Industry Developments
- DeepSeek's R1 model has completed a minor version upgrade, outperforming Western competitors on several standardized metrics while costing only a few million dollars [5]
- Tencent's AI model strategy has been fully unveiled, with the Hunyuan model ranking among the top eight globally on the Chatbot Arena platform, showcasing its continuous technological advancement [6]
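The role of a load-balancing term can be seen with the standard auxiliary loss from the MoE literature, used here as a stand-in; Huawei's EP loss is its own variant, whose exact form is given in their disclosure. The loss is minimized at exactly 1.0 when routing is uniform and grows as tokens pile onto a few experts:

```python
import numpy as np

def load_balance_loss(router_probs, assignment, n_experts):
    """Auxiliary load-balancing loss: n_experts times the dot product
    of (fraction of tokens routed to each expert) and (mean router
    probability per expert).  Uniform routing gives exactly 1.0."""
    frac = np.bincount(assignment, minlength=n_experts) / len(assignment)
    mean_prob = router_probs.mean(axis=0)
    return n_experts * float(frac @ mean_prob)

n_tokens, n_experts = 1024, 8
# Perfectly uniform router and an even round-robin assignment.
uniform = np.full((n_tokens, n_experts), 1.0 / n_experts)
even_assign = np.arange(n_tokens) % n_experts
print(load_balance_loss(uniform, even_assign, n_experts))    # 1.0

# Collapsed router: every token goes to expert 0.
collapsed = np.zeros((n_tokens, n_experts))
collapsed[:, 0] = 1.0
all_to_zero = np.zeros(n_tokens, dtype=int)
print(load_balance_loss(collapsed, all_to_zero, n_experts))  # 8.0
```

Adding such a term to the training objective keeps experts evenly used, which both speeds up training on a device-parallel cluster and encourages the expert specialization mentioned above.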