Zhejiang University and Huawei Release DeepSeek-R1-Safe, China's First Foundation Model Built on an Ascend Thousand-Card Compute Platform
AI前线·2025-09-21 05:32

Core Viewpoint - The article emphasizes the rapid evolution of large AI models and their significance as indicators of national innovation capability and comprehensive national strength. It highlights the security challenges these models pose, particularly for national security and the public interest [2][3].

Group 1: Current State of AI Models
- As of January 2025, there are approximately 197 large models in the Chinese market, covering key industries such as finance, healthcare, education, manufacturing, automotive, and energy [2].
- Large models worldwide face security issues, including the generation of false or harmful content, data bias, and information leakage, which pose significant threats to national information security [2].

Group 2: Security Challenges and Responses
- Domestic platforms face challenges in framework completeness, developer community maturity, and open-source ecosystem development. Some early versions of domestic large models failed to defend against jailbreak attacks in up to 100% of cases [3].
- Zhejiang University and Huawei have launched the DeepSeek-R1-Safe foundation model, which raises security defense capability to 83%, a 115% increase over the original model [3][5].

Group 3: Technical Innovations
- DeepSeek-R1-Safe incorporates breakthroughs in three dimensions: "secure corpus construction," "secure model training," and "hardware and software environment setup" [4][5].
- The model's training runs entirely on the domestic Ascend Kunpeng cluster, using 128 servers and a total of 1024 Ascend AI cards, marking a significant achievement in large-scale security training [9][10].

Group 4: Performance Metrics
- DeepSeek-R1-Safe achieves a defense success rate of nearly 100% against ordinary harmful questions across 14 dimensions, outperforming several contemporaneous models by 4% to 13% [10][12].
- The model's jailbreak defense capability exceeds 40% against various jailbreak modes, surpassing contemporaneous models by 16% to 23% [13][15].

Group 5: Future Directions
- The team aims to promote the development of endogenous secure AI in collaboration with Huawei and other industry partners, focusing on achieving comprehensive autonomy, security, and controllability in AI models [18].
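
The figures quoted under Group 2 and Group 3 can be sanity-checked with a little arithmetic. The sketch below assumes that the "115% increase" is a relative increase over the original model's defense score (the summary does not state the baseline) and that the 1024 cards are spread evenly across the 128 servers. The function names are illustrative and do not come from the article.

```python
def implied_baseline(new_rate: float, relative_increase: float) -> float:
    """Back out the baseline score, assuming new = baseline * (1 + relative_increase)."""
    return new_rate / (1.0 + relative_increase)


def cards_per_server(total_cards: int, servers: int) -> int:
    """Cards per server, assuming an even split across servers."""
    if servers <= 0 or total_cards % servers != 0:
        raise ValueError("cards do not divide evenly across servers")
    return total_cards // servers


if __name__ == "__main__":
    # 83% after a 115% relative increase implies a baseline of about 38.6%.
    baseline = implied_baseline(0.83, 1.15)
    print(f"Implied original defense rate: {baseline:.1%}")

    # 1024 cards over 128 servers gives 8 cards per server.
    print(f"Cards per server: {cards_per_server(1024, 128)}")
```

Under these assumptions the original model's defense rate would have been roughly 38.6%, and each server would host 8 Ascend cards. Neither number is stated in the summary itself, so both should be read as inferences rather than reported results.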