openPangu-R-72B Tops SuperCLUE DeepSearch as Its Deep Search Capability Takes a Leap
机器之心·2025-12-05 10:17

Core Insights

- The article highlights the rapid progress of large-model reasoning and agentic tool use, centered on the latest SuperCLUE DeepSearch evaluation report, in which the domestic model openPangu-R-72B ranked first on complex information-retrieval tasks, demonstrating the strength of domestic Ascend computing power for large-model development [1][15].

Model Performance

- In the SuperCLUE DeepSearch evaluation, openPangu-R-72B scored 73.33, ahead of models such as Gemini-3-Pro-Preview and GPT-5.1 (high), which scored 70.48 [2].
- The model performed strongly across task categories, notably humanities and social sciences (75.47) and natural sciences (83.33) [2].

Technical Architecture

- openPangu-R-72B is built on a redesigned architecture that balances efficiency and performance: a mixture-of-experts (MoE) model whose router activates 8 of 80 experts per token, keeping about 15 billion parameters active out of 74 billion in total [4] (a toy routing sketch appears after this summary).
- The model was trained on 24 trillion tokens and supports sequences of up to 128K tokens, which is crucial for deep-search tasks [4].

Optimization Techniques

- Parameterized Sink Token technology is introduced to stabilize training and improve quantization compatibility [7].
- A combination of K-Norm and Depth-Scaled Sandwich-Norm reduces computational overhead while preserving training stability and expressive flexibility [7].
- The attention architecture is optimized for both precision and efficiency, cutting the KV cache by 37.5% while strengthening the model's ability to capture fine-grained semantic relationships [7][8] (see the cache arithmetic sketched below).

DeepSearch Capabilities

- The model's success on deep-search tasks is attributed to three strategies: long-chain question-answer synthesis, non-indexed information processing, and fast-slow thinking integration [10].
- Long-chain QA synthesis raised the average question difficulty by 10% and introduced a verification agent to improve training accuracy [12].
- The model's workflow cycles through focusing on key URLs, crawling them, and running document QA, gathering deep information beyond what traditional search engines surface [12] (a skeleton of this loop is sketched below).

Domestic Computing Power

- openPangu-R-72B's result in the SuperCLUE DeepSearch evaluation underscores the effective integration of domestic computing power with large-model research and development [15].
- Its sibling model, openPangu-718B, also performed well, taking second place on the overall leaderboard, a sign of the openPangu series' comprehensive capability across task scenarios [15].
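To make the 8-of-80 expert selection concrete, here is a minimal top-k MoE routing sketch in NumPy. All dimensions, the gating formulation, and the toy experts are illustrative assumptions; the source does not disclose openPangu's actual router.

```python
import numpy as np

def moe_route(hidden, gate_w, experts, top_k=8):
    """Route one token through the top_k highest-scoring experts and
    mix their outputs with softmax weights over the selected logits."""
    logits = hidden @ gate_w                       # (n_experts,) router scores
    top = np.argsort(logits)[-top_k:]              # indices of the top_k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                                   # renormalize over selected experts
    return sum(wi * experts[i](hidden) for wi, i in zip(w, top))

# Illustrative shapes: 80 experts, 8 active per token (dimensions assumed).
rng = np.random.default_rng(0)
d_model, n_experts = 64, 80
gate_w = rng.standard_normal((d_model, n_experts))
expert_mats = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
               for _ in range(n_experts)]
experts = [lambda x, m=m: np.tanh(x @ m) for m in expert_mats]
print(moe_route(rng.standard_normal(d_model), gate_w, experts).shape)  # (64,)
```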
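The article names parameterized Sink Token technology without detailing it. A common formulation, assumed here rather than confirmed by the source, adds a learnable logit that competes in the attention softmax but contributes no value, giving heads a place to park attention mass and damping the outlier activations that hurt quantization:

```python
import numpy as np

def attn_with_sink(q, K, V, sink_logit):
    """Single-query attention with a learnable sink logit (assumption):
    the sink participates in softmax normalization but maps to a zero
    value, so heads can dump attention mass without distorting tokens."""
    scores = K @ q / np.sqrt(q.shape[-1])          # (seq,) attention logits
    z = np.concatenate([[sink_logit], scores])     # sink joins the softmax
    p = np.exp(z - z.max())
    p /= p.sum()
    return p[1:] @ V                               # p[0] (sink share) is discarded

rng = np.random.default_rng(1)
q = rng.standard_normal(16)
K, V = rng.standard_normal((10, 16)), rng.standard_normal((10, 16))
print(attn_with_sink(q, K, V, sink_logit=1.0).shape)  # (16,)
```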
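Sandwich-Norm generally means normalizing a sublayer's input and its output before the residual add; "Depth-Scaled" presumably shrinks the branch contribution as depth grows. The 1/sqrt(2·n_layers) rule below is a placeholder assumption, not the paper's formula:

```python
import numpy as np

def rms_norm(x, gain):
    return gain * x / np.sqrt((x * x).mean() + 1e-6)

def sandwich_block(x, sublayer, pre_gain, post_gain, n_layers):
    """Sandwich-Norm residual block: normalize the sublayer input AND its
    output, then scale the branch by a depth-dependent factor before the
    residual add. The scaling rule here is an assumed illustration."""
    depth_scale = 1.0 / np.sqrt(2.0 * n_layers)
    y = sublayer(rms_norm(x, pre_gain))            # pre-norm, as in pre-LN blocks
    return x + depth_scale * rms_norm(y, post_gain)

rng = np.random.default_rng(2)
d = 32
w = rng.standard_normal((d, d)) / np.sqrt(d)
out = sandwich_block(rng.standard_normal(d), lambda h: np.tanh(h @ w),
                     np.ones(d), np.ones(d), n_layers=48)
print(out.shape)  # (32,)
```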
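The source does not say which mechanism yields the 37.5% KV-cache saving; one standard lever is reducing the number of KV heads, as in grouped-query attention. The head counts below are assumptions chosen only to reproduce the stated ratio (keeping 10 of 16 heads leaves 62.5% of the original cache):

```python
# KV-cache size arithmetic for grouped KV heads (all counts assumed).
bytes_per_value = 2                    # bf16
seq_len, n_layers, head_dim = 128_000, 48, 128

def kv_cache_gb(kv_heads):
    # factor 2 = one K and one V tensor per layer
    return 2 * seq_len * n_layers * kv_heads * head_dim * bytes_per_value / 1e9

full, reduced = kv_cache_gb(16), kv_cache_gb(10)
print(f"{full:.1f} GB -> {reduced:.1f} GB ({1 - reduced / full:.1%} smaller)")
# 1 - 10/16 = 37.5% reduction
```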
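As a rough picture of long-chain QA synthesis with a verification agent, the skeleton below chains seed questions into multi-hop ones and keeps only pairs the verifier accepts. Every function name here (compose, verify) is hypothetical:

```python
def synthesize_dataset(seed_questions, compose, verify, target=1000):
    """Hypothetical long-chain QA synthesis: stitch extra hops onto seed
    questions and keep only pairs a verification agent confirms."""
    dataset = []
    for q in seed_questions:
        multi_hop_q, answer = compose(q)        # extend the seed into a longer chain
        if verify(multi_hop_q, answer):         # verification agent filters bad pairs
            dataset.append((multi_hop_q, answer))
        if len(dataset) >= target:
            break
    return dataset
```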
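"Fast-slow thinking integration" is only named in the report; one plausible reading, offered purely as an assumption, is confidence-based escalation from a cheap direct answer to a long reasoning chain:

```python
def fast_slow_answer(question, fast_model, slow_model, threshold=0.8):
    """Hypothetical fast-slow dispatch: take the fast path when the quick
    answer is confident, else escalate to slow deliberate reasoning."""
    answer, confidence = fast_model(question)   # (text, self-estimated confidence)
    if confidence >= threshold:
        return answer
    return slow_model(question)                 # long-chain reasoning fallback
```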
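Finally, the focus-URLs, crawl, and document-QA cycle maps naturally onto a simple agent loop. The skeleton below is hypothetical scaffolding around caller-supplied search, crawl, and doc_qa callables, not openPangu's actual agent:

```python
def deep_search(question, search, crawl, doc_qa, max_rounds=5):
    """Hypothetical skeleton of the focus-URLs -> crawl -> document-QA
    cycle described in the article. `search` returns candidate URLs,
    `crawl` fetches a page, and `doc_qa` extracts evidence plus links."""
    notes, frontier = [], search(question)        # initial candidate URLs
    for _ in range(max_rounds):
        if not frontier:
            break
        url = frontier.pop(0)                     # focus on the next key URL
        page = crawl(url)                         # fetch content the index may miss
        answer, new_urls = doc_qa(question, page) # evidence + follow-up links
        if answer:
            notes.append((url, answer))
        frontier.extend(u for u in new_urls if u not in frontier)
    return notes
```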