DeepKnown Risk Control Framework
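AI Security Breakthrough: DeepKnown Releases an Agent-Specific Safety Model, Achieving Nearly 100% Defense Against Dialogue Risks and Solving the Compliance Challenge for AGI Applications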
36Kr · 2025-11-24 08:21
Core Viewpoint
- Generative AI is becoming part of daily life, but it brings a hidden security crisis: dialogue risks such as malicious inducement and hidden conditions pose significant challenges to the industry [1]

Group 1: Security Testing Results
- A security test conducted by the Third Research Institute of the Ministry of Public Security found that the non-compliance rate of mainstream generative AI models across eight security dimensions ranges from 28% to 51%, with categories such as organized crime, rumors, and fraud exceeding 40% [1]
- Specific models such as Hunyuan-TurboS and Moonshot-V1-128K showed non-compliance rates of 34.93% and 37.67%, respectively, in national security and violence-related categories [2]

Group 2: Challenges in Security Measures
- Existing defense mechanisms, such as sensitive-word rules, are inadequate against new AI attack methods, leading to both missed detections and false positives [2]
- Regulatory policies like the "Basic Requirements for the Security of Generative AI Services" have set boundaries for risk control, making it harder for developers to address dialogue security risks effectively [2]

Group 3: DeepKnown's Security Framework
- DeepKnown has developed a proprietary model-based dialogue security response framework called "DeepKnown Risk Control," which it presents as a breakthrough solution that does not compromise the underlying model's capabilities [3]
- According to the company, the framework lets developers reach nearly 100% security risk defense capability within five minutes of integration [3]

Group 4: Performance Metrics
- DeepKnown demonstrated better risk identification and response accuracy than leading safety models such as Qwen3Guard-Gen-8B and TinyR1-Safety-8B [4]
- In tests against high-risk scenarios, DeepKnown achieved close to 100% high-risk protection, while comparable models scored only 74% because they rely on static knowledge [8]

Group 5: Risk Classification System
- DeepKnown has restructured its security logic around a four-category risk classification system: Safe, Unsafe, Conditionally Safe, and Focus, allowing for targeted risk management [9]
- This system enables more nuanced handling of risks and avoids the binary safe/unsafe classification that often leads to over-blocking or missed detections [9]

Group 6: Knowledge Base and Response Modes
- DeepKnown provides a comprehensive knowledge base covering laws, policies, and standards across 337 cities, so that responses are compliant and traceable [11]
- Two response modes are offered: Active for general interactions and Conservative for sensitive scenarios, preserving safety while maintaining engagement [11]

Group 7: Application Value
- DeepKnown's API allows for easy integration into existing systems, significantly lowering the cost of risk management for developers [12][16]
- The service turns complex security technology into a low-threshold, on-demand service, enabling businesses to focus on innovation rather than security concerns [16]

Group 8: Conclusion
- As generative AI becomes mainstream, security is no longer an optional feature but a necessity for successful deployment in various sectors [17]
- DeepKnown's approach to security, with its reported results of nearly 100% high-risk defense, positions it as a critical enabler for the large-scale application of AI across industries [17]
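
To make the four-category classification (Group 5) and the two response modes (Group 6) more concrete, the sketch below shows one way an application might route a classified message. The category and mode names come from the article; everything else is an assumption for illustration only. The article does not describe DeepKnown's API or how each category is handled, so `POLICY`, `decide`, and the action names are hypothetical and do not represent the product's behavior.

```python
from enum import Enum


class RiskCategory(Enum):
    """The four risk categories named in the article."""
    SAFE = "Safe"
    UNSAFE = "Unsafe"
    CONDITIONALLY_SAFE = "Conditionally Safe"
    FOCUS = "Focus"


class ResponseMode(Enum):
    """The two response modes named in the article."""
    ACTIVE = "Active"
    CONSERVATIVE = "Conservative"


# Hypothetical policy table: (category, mode) -> action name.
# The actions are assumptions; the article does not specify them.
POLICY = {
    (RiskCategory.SAFE, ResponseMode.ACTIVE): "answer",
    (RiskCategory.SAFE, ResponseMode.CONSERVATIVE): "answer",
    (RiskCategory.UNSAFE, ResponseMode.ACTIVE): "refuse",
    (RiskCategory.UNSAFE, ResponseMode.CONSERVATIVE): "refuse",
    # Assumed: answer with caveats when Active, decline when Conservative.
    (RiskCategory.CONDITIONALLY_SAFE, ResponseMode.ACTIVE): "answer_with_caveats",
    (RiskCategory.CONDITIONALLY_SAFE, ResponseMode.CONSERVATIVE): "refuse",
    # Assumed: "Focus" topics get an answer grounded in vetted sources,
    # with extra review when the mode is Conservative.
    (RiskCategory.FOCUS, ResponseMode.ACTIVE): "answer_from_knowledge_base",
    (RiskCategory.FOCUS, ResponseMode.CONSERVATIVE): "answer_from_knowledge_base_with_review",
}


def decide(category: RiskCategory, mode: ResponseMode) -> str:
    """Look up the illustrative action for a classified message."""
    return POLICY[(category, mode)]


if __name__ == "__main__":
    # Print the full routing table for inspection.
    for category in RiskCategory:
        for mode in ResponseMode:
            print(f"{category.value:18} {mode.value:12} -> {decide(category, mode)}")
```

The point of the table is that a non-binary classification leaves room for intermediate outcomes: only the Unsafe row is always blocked, while the Conditionally Safe and Focus rows vary with the chosen mode instead of being forced into allow or deny.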