Workflow
大模型安全
icon
Search documents
GPT之父Alec Radford新作:给大模型做「脑部手术」,危险知识重学成本暴增7000倍
机器之心· 2026-03-01 03:34
编辑|Panda Alex Radford, 出生于1993 年 4 月,即将 33 岁 ,但已经拥有超过 32 万的引用量。因为这位「独立研究员」不仅是 GPT、GPT-2 和 CLIP 的第一作者,同时还参与 了 GPT-3、GPT-4、PPO 算法等多个重大研究项目。 | Alec Radford | | FOLLOW | GET MY OWN PROFILE | | | | --- | --- | --- | --- | --- | --- | | Independent | | | | | | | No verified email | | | | | | | Deep Learning Machine Learning | | | Cited by | | VIEW ALL | | | | | All | | Since 2021 | | TITLE | CITED BY | YEAR | Citations 322847 | | 297276 | | | | | h-index 50 | | 50 | | | | | i10-index 61 | | 61 | | Language models ...
安恒信息:公司明鉴大模型风险评估系统重点聚焦大模型运行环境基础安全等三大核心领域
Zheng Quan Ri Bao· 2026-02-27 13:12
证券日报网讯 2月27日,安恒信息在互动平台回答投资者提问时表示,公司明鉴大模型风险评估系统重 点聚焦大模型运行环境基础安全、模型输出内容安全、训练语料安全三大核心领域,全面覆盖开发、训 练、部署、应用全流程,通过"探测-分析-加固-运营"一体化技术架构,为用户提供全链路风险检测与闭 环管理服务。 (文章来源:证券日报) ...
天融信(002212.SZ):暂未向字节跳动Seedance 2.0提供安全防护
Ge Long Hui· 2026-02-27 06:25
格隆汇2月27日丨天融信(002212.SZ)在投资者互动平台表示,公司暂未向字节跳动Seedance 2.0提供安全 防护,公司可为客户提供大模型安全网关、大模型数据安全监测、内容智能管控、大模型安全评估等系 列创新产品和解决方案。 ...
马年!天融信大模型安全网关焕新升级
Jin Rong Jie· 2026-02-26 01:27
Group 1 - The core viewpoint of the articles highlights the rapid adoption and integration of AI large models in various sectors, particularly during the Spring Festival, showcasing their ability to enhance user experience through seamless service delivery [1][3] - AI large models are being applied in critical areas such as government, finance, energy, and healthcare, providing efficient support for compliance, risk management, intelligent services, and decision-making [3] - The emergence of new attack methods targeting large models, including data poisoning, model theft, and privacy breaches, raises concerns about balancing innovation and security [3] Group 2 - TopLMG has undergone a comprehensive upgrade to address new security challenges associated with data, models, and algorithms, establishing a full lifecycle security protection system [4] - The system includes multi-modal risk identification capabilities, allowing for precise detection of various content types, including text, documents, images, audio, and video [6] - Enhanced proxy mode supports HTTP/2.0 protocol, significantly improving transmission efficiency and reducing network latency, while ensuring compliance with domestic security standards [9] Group 3 - The multi-round semantic deep protection feature enables context-based attack detection, effectively identifying covert attack behaviors [10] - Full-process dialogue management allows for complete recording of dialogue inputs and outputs, facilitating information management and communication efficiency [11] - The API asset intelligent risk control system automatically organizes API assets, implements compliance checks, and prevents malicious calls, ensuring stable business operations [12] Group 4 - The compliance audit and traceability function provides multiple protective measures, including digital watermarks and metadata identification, to mitigate deep forgery risks [13] - The company emphasizes the importance of technological innovation and industry collaboration to ensure the stable and sustainable development of AI large models [13]
以AI对抗AI,让大模型健康发展
Xin Lang Cai Jing· 2026-01-28 22:02
Core Viewpoint - The rise of AI large models has sparked concerns over security risks, including personal data leakage and potential misuse in various applications, necessitating a robust protective framework for safe deployment [1][5]. Group 1: Security Risks - AI large models pose multiple security vulnerabilities, including prompt injection, sensitive information leakage, and data poisoning, as identified by OWASP [2]. - Everyday actions, such as uploading photos for AI enhancement, can lead to sensitive information leakage, enabling identity theft and fraud [3]. - Data poisoning can severely compromise model integrity, with as few as 250 malicious documents capable of contaminating a model with billions of parameters [3]. Group 2: Business Implications - Companies face significant risks from AI model vulnerabilities, with potential impacts on core operations and data integrity [4]. - The need for thorough data cleansing and verification processes is emphasized to prevent "data pollution" and ensure reliable outputs from AI models [4]. - In professional settings, risks such as core data leakage and exploitation of open-source model vulnerabilities can lead to substantial economic losses and missed opportunities [4]. Group 3: Regulatory Challenges - Current regulations primarily focus on AI-generated content review, lacking clear definitions and penalties for emerging threats like data poisoning [5]. - The opaque nature of AI models complicates accountability, making it difficult to assign responsibility in case of errors or breaches [5]. Group 4: Protective Measures - A comprehensive security framework is being developed in Jiangsu, including policies that incentivize compliance and provide support for security assessments [7]. - Companies are implementing multi-layered security measures, such as asynchronous recognition engines and three-tier review mechanisms, to enhance data protection [7][8]. - Continuous training and monitoring of AI models are essential to mitigate risks, with suggestions for creating dedicated security models to oversee operational models [9][10]. Group 5: Collaborative Governance - A multi-stakeholder approach is recommended for effective governance, involving government, enterprises, research institutions, and third-party evaluators to enhance security and compliance [10]. - The establishment of a shared information platform and clear accountability mechanisms is crucial for fostering a collaborative environment in AI security governance [10].
天融信:目前腾讯元宝暂未应用到公司产品
Zheng Quan Ri Bao Wang· 2026-01-28 14:11
Group 1 - The company Tianrongxin (002212) is engaging in deep cooperation with Tencent across multiple areas including threat intelligence, large model security, cloud security, privacy computing, and smart cities [1] - Currently, Tencent's Yuanbao has not been applied to the company's products, but the company has initiated collaboration with Tencent's Hongyuan large model [1]
第一梯队的大模型安全吗?复旦、上海创智学院等发布前沿大模型安全报告,覆盖六大领先模型
机器之心· 2026-01-22 04:05
Core Insights - The article discusses the evolving safety assessment framework for advanced large models, particularly focusing on their security capabilities in various application scenarios and regulatory contexts [2][6]. Group 1: Safety Assessment Framework - A unified safety assessment framework has been developed for six leading models: GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5, covering language, visual language, and image generation scenarios [2]. - The assessment integrates four key dimensions: baseline safety, adversarial testing, multilingual evaluation, and compliance evaluation against global regulatory frameworks [4]. Group 2: Key Findings - GPT-5.2 achieved an average safety rate of 78.39%, demonstrating a shift towards deep semantic understanding and value alignment, significantly reducing failure risks under adversarial inputs [11]. - Gemini 3 Pro's average safety rate is 67.9%, showing strong but uneven safety characteristics, with a notable drop in adversarial robustness [11]. - Qwen3-VL scored an average safety rate of 63.7%, excelling in compliance but showing weaknesses in adversarial safety [12]. - Grok 4.1 Fast has an average safety rate of 55.2%, with significant variability in performance across different assessments [12]. Group 3: Multimodal Safety - GPT-5.2 leads with an average multimodal safety rate of 94.69%, indicating high stability in complex cross-modal scenarios [13]. - Qwen3-VL follows with an average safety rate of 81.11%, showing strong performance in visual-language interaction [13]. Group 4: Model Safety Profiles - GPT-5.2 is characterized as an all-encompassing internalized model, capable of nuanced compliance guidance in complex contexts [19]. - Qwen3-VL is identified as a rule-compliant model, excelling in clear regulatory environments but lacking flexibility in ambiguous scenarios [20]. - Gemini 3 Pro is described as an ethical interaction model, sensitive to social values but needing improvement in proactive risk prevention [21]. - Grok 4.1 Fast is noted for its efficiency-focused design, prioritizing user expression over robust defense mechanisms [22]. Group 5: Challenges in Security Governance - The report highlights the threat of multi-round adaptive attacks, which can bypass static defenses, posing a significant challenge for future model safety governance [27]. - There is a structural imbalance in security performance across languages, with a 20%-40% drop in non-English contexts, raising concerns about global deployment risks [28]. - The lack of transparency and explainability in decision-making processes remains a critical governance shortcoming, particularly in high-risk areas [29]. Conclusion - The report emphasizes the need for a collaborative approach among academia, industry, and regulatory bodies to develop a comprehensive and dynamic safety assessment system for generative AI [30].
亚信安全:大模型安全产品主要为大模型提供端到端安全防护
Zheng Quan Ri Bao Wang· 2026-01-15 10:11
Core Viewpoint - The company, AsiaInfo Security, is actively engaged in the large model sector, providing security and management solutions, as well as a platform and toolset for large models, targeting various industries including finance, telecommunications, government, healthcare, education, and energy [1] Group 1 - AsiaInfo Security and its subsidiary, AsiaInfo Technology, focus on offering end-to-end security protection for large models [1] - The subsidiary has achieved successful implementation of large model applications across multiple industries [1] - Revenue details related to these products will be disclosed in the company's regular reports [1]
云天励飞:公司与360集团签署战略合作协议
Zheng Quan Ri Bao Wang· 2026-01-06 12:13
Core Viewpoint - Yuntian Lifei has signed a strategic cooperation agreement with 360 Group to develop a collaborative AI inference ecosystem under the domestic framework, focusing on "AI + security" as the core of their multi-dimensional cooperation [1] Group 1: Strategic Cooperation - The partnership will leverage both companies' strengths in resources, scenarios, and technology to build a robust AI computing foundation, enhance large model security capabilities, and create smart living products [1] - Yuntian Lifei's DeepEdge and DeepXbot series chip capabilities will be combined with 360's smart hardware matrix to develop innovative and secure products [1] Group 2: AI Security and Efficiency - The collaboration aims to empower the 360 AI platform service capabilities with DeepVerse, improving inference efficiency and deployment flexibility within the domestic ecosystem [1] - Joint research will focus on AI security protection capabilities for intelligent agents, exploring new products, scenarios, and models that integrate large model security with smart living [1]
大模型易被恶意操控,安全治理迫在眉睫
Zhong Guo Jing Ji Wang· 2025-12-23 02:26
Group 1 - The core issue highlighted is the frequent security vulnerabilities in large AI models, indicating that technological advancement must be accompanied by security measures [4] - The article discusses the risk of "data poisoning" and indirect prompt injection attacks, which can lead to the manipulation of model outputs and the potential theft of sensitive data [4] - It emphasizes that the security of large models is not merely a technical issue but a systemic challenge related to public safety, necessitating a proactive approach in model development, data training, and deployment [4] Group 2 - The industry is urged to prioritize security in the development and application of AI models to build robust defenses against potential threats [4] - The article suggests that a dual focus on technology and security is essential to prevent AI from "losing control" during rapid advancements [4]