Workflow
智能体安全合规
icon
Search documents
给大热的智能体做体检:关键「安全」问题能达标吗?
21世纪经济报道· 2025-07-04 06:55
作 者丨 肖潇、王俊、章驰、陈勇杰 编 辑丨王俊 2025年,被称为"智能体元年"。这是AI发展路径上的一次范式突变:从"我说AI答"到"我说AI 做",从对话生成跃迁到自动执行,智能体正成为最重要的商业化锚点和下一代人机交互范 式。 但越接近落地,风险也越有实感。智能体的核心能力——自主性、行动力,也恰恰是风险滋 生的窗口。越能干的智能体,越可能越权、越界,甚至失控。 结合调查问卷和行业访谈,本次《智能体体检报告——安全全景扫描》 从最新发展状况、合 规认知度、合规实际案例三个角度 ,试图回答清楚一个关键问题:智能体狂奔之时,安全合 规是否就绪了? 容错性与自主性为核心考量指标 作为市场最火热的概念,今年资本市场及公司动态几乎都与智能体挂钩。但不少讨论中的智 能体定义混乱,以至于一千个人眼中有一千个智能体。 如果仅停留在单一角度分类智能体是非常片面的,为了更全景地认知理解智能体,我们广泛 地调研了从业者,认为可以从"容错性"、"自主性"两个维度划定坐标轴,建立智能体的价值生 态。 X轴是"容错性",我们认为这是智能体未来发展的核心竞争指标 。容错性低,通常意味着出 现错误后果严重,其使用场景需要更准确的信息 ...
智能体不断进化,协作风险升高:五大安全问题扫描
Core Insights - The year 2025 is anticipated to be the "Year of Intelligent Agents," marking a paradigm shift in AI development from conversational generation to automated execution, positioning intelligent agents as key commercial anchors and the next generation of human-computer interaction [1] Group 1: Development and Risks of Intelligent Agents - As intelligent agents approach practical application, the associated risks become more tangible, with concerns about overreach, boundary violations, and potential loss of control [2] - A consensus exists within the industry that the controllability and trustworthiness of intelligent agents are critical metrics, with safety and compliance issues widely recognized as significant [2] - Risks associated with intelligent agents are categorized into internal and external security threats, with internal risks stemming from vulnerabilities in core components and external risks arising from interactions with external protocols and environments [2] Group 2: AI Hallucinations and Decision Errors - Over 70% of respondents in a safety awareness survey expressed concerns about AI hallucinations and erroneous decision-making, highlighting the prevalence of factual inaccuracies in AI-generated content [2] - In high-risk sectors like healthcare and finance, AI hallucinations could lead to severe consequences, exemplified by a hypothetical 3% misdiagnosis rate in a medical diagnostic agent potentially resulting in hundreds of thousands of misdiagnoses among millions of users [2] Group 3: Practical Applications and Challenges - Many enterprises have found that intelligent agents currently struggle to reliably address hallucination issues, leading some to abandon AI solutions due to inconsistent performance [3] - A notable case involved Air Canada's AI customer service, which provided incorrect refund information, resulting in the company being held legally accountable for the AI's erroneous decision [3] Group 4: Technical Frameworks and Regulations - Intelligent agents utilize various technical bridges to connect with the external world, employing two primary technical routes: an "intent framework" based on API cooperation and a "visual route" that bypasses interface authorization barriers [4] - Recent evaluations have highlighted chaotic usage of accessibility permissions by mobile intelligent agents, raising significant security concerns [5] Group 5: Regulatory Developments - A series of standards and initiatives have emerged in 2024 aimed at enhancing the management of accessibility permissions for intelligent agents, emphasizing user consent and risk disclosure [6] - The standards, while not mandatory, reflect a growing recognition of the need for safety in the deployment of intelligent agents [6] Group 6: Security Risks and Injection Attacks - Prompt injection attacks represent a core security risk for all intelligent agents, where attackers manipulate input prompts to induce the AI to produce desired outputs [7][8] - The emergence of indirect prompt injection risks, particularly with the rise of MCP (Multi-Channel Protocol) tools, poses new challenges as attackers can embed malicious instructions in external data sources [8][9] Group 7: MCP Services and Security Challenges - The MCP service Fetch has been identified as a significant entry point for indirect prompt injection attacks, raising concerns about the security of external content accessed by intelligent agents [10] - The lack of standardized security certifications for MCP services complicates the assessment of their safety, with many platforms lacking rigorous review processes [11] Group 8: Future of Intelligent Agent Collaboration - The development of multi-agent collaboration mechanisms is seen as crucial for the practical deployment of AI, with various companies exploring the potential for intelligent agents to work together on tasks [12][13] - The establishment of the IIFAA Agent Security Link aims to provide a secure framework for collaboration among intelligent agents, addressing issues of permissions, data, and privacy [14]