生成式人工智能安全
Search documents
联合实验室是湾区AI生态“黏合剂”
Nan Fang Du Shi Bao· 2025-09-15 23:09
Core Viewpoint - The Greater Bay Area is experiencing active technological innovation in the field of generative AI security, with the establishment of a joint laboratory playing a dual role as an ecosystem adhesive and a testing ground for the AI industry [2][4]. Group 1: Technological Innovation - The Greater Bay Area has attracted significant talent and investment, leading to notable advancements in model safety, trustworthiness, and quantitative assessment [4]. - A relatively complete industrial chain has begun to form, encompassing basic research, technology application, and security assurance, although there is still a need to improve the balance between application and security, as well as technology and standards [4]. Group 2: Localized Security Assessment System - The proposed localized security assessment system aims to provide targeted evaluations based on the actual business scenarios, data characteristics, and technical architecture of AI companies in the Greater Bay Area, helping to identify potential security risks [4]. - The establishment of local security standards will create unified regulations for AI industry development, clarifying safety baselines and development directions [4]. Group 3: Collaborative Governance - Effective AI governance requires coordination among government, industry, academia, and users, focusing on core security needs to build a collaborative system [5]. - The government can leverage institutional innovation advantages to set clear safety requirements and risk classification standards, while enterprises should prioritize safety indicators throughout the entire process of demand assessment, data governance, and model training [5]. Group 4: Future Expectations - The joint laboratory is expected to become a leading platform for AI security research and innovation, enhancing regional small and medium-sized enterprises' overall security capabilities through open-source sharing and ecological incubation [8]. - There is a suggestion to establish an AI security incident response and tracing center to improve the resilience and systematic response capabilities of the Greater Bay Area in the face of sudden AI risks [8]. Group 5: Unique Role of the Joint Laboratory - The joint laboratory serves as an adhesive by integrating various core advantages and promoting deep collaboration among government, industry, academia, and users, facilitating a seamless connection from algorithm optimization to risk identification and application protection [9]. - It acts as a testing ground for the industry, exploring safe innovation paths through full-chain algorithm exploration and application validation, while also advancing cross-border financial security testing and data compliance [9].
企业 GenAI 的最大风险以及早期使用者的经验教训
3 6 Ke· 2025-08-11 00:20
Overview - Generative AI is included in corporate roadmaps, but companies should not release any unsafe products. The threat model has changed due to LLMs, where untrusted natural language can become an attack surface, and outputs can be weaponized. Models should operate in a sandboxed, monitored, and strictly authorized environment [1][2] Security Challenges - Immediate injection attacks, including indirect attacks hidden in files and web pages, are now a top risk for LLMs. Attackers can compromise inputs without breaching backend systems, leading to data theft or unsafe operations [4][5] - Abuse of agents/tools and "over-proxying" create new permission boundaries. Overly permissive agents can be lured into executing powerful operations, necessitating strict RBAC and human approval for sensitive actions [4][5] - RAG (Retrieval-Augmented Generation) introduces new attack surfaces, where poisoned indexes can lead to adversarial outputs. Defensive measures are still evolving [4][5] - Privacy leaks and IP spillage are active research areas, with large models sometimes memorizing sensitive training data. Improvements in vendor settings are ongoing [4][5] - The AI supply chain is vulnerable, with risks from backdoored models and deceptive alignments. Organizations need robust provenance and behavior review measures [4][5] - Unsafe output handling can lead to various security issues, including XSS and SSRF attacks. Strict output validation and execution policies are essential [4][5] - DoS attacks and cost abuse can arise from malicious workloads, necessitating rate limits and alert systems [4][5] - Observability and compliance challenges exist, requiring structured logging and change control while adhering to privacy laws [4][5] - Governance drift and model/version risks arise from frequent updates, emphasizing the need for continuous security testing and version control [4][5] - Content authenticity and downstream misuse remain concerns, with organizations encouraged to track output provenance [4][5] Action Plan for Next 90 Days - Conduct a GenAI security and privacy audit to identify sensitive data entry points and deploy immediate controls [6][7] - Pilot high-value, low-risk use cases to demonstrate value while minimizing customer risk [6][7] - Implement evaluation tools with human review and key metrics before widespread deployment [6][7] Case Studies - JPMorgan Chase implemented strict prompts and a code snippet checker to prevent sensitive data leaks in their AI coding assistant, resulting in zero code leak incidents by 2024 [16] - Microsoft enhanced Bing Chat's security by limiting session lengths and improving prompt isolation, significantly reducing successful prompt injection attempts [17] - Syntegra utilized differential privacy in their medical AI to prevent the model from recalling sensitive patient data, ensuring compliance with HIPAA [18] - Waymo employed a model registry to ensure the security of their machine learning supply chain, successfully avoiding security issues over 18 months [19][20] 30-60-90 Day Action Plan - The first 30 days should focus on threat modeling workshops and implementing basic input/output filtering [22][23] - The next 31-60 days should involve red team simulations and the deployment of advanced controls based on early findings [24][25] - The final phase (61-90 days) should include external audits and optimization of monitoring metrics to ensure ongoing compliance and security [27][28]