AI安全

Search documents
250份文档投毒,一举攻陷万亿LLM,Anthropic新作紧急预警
3 6 Ke· 2025-10-10 23:40
Core Insights - Anthropic's latest research reveals that only 250 malicious web pages are sufficient to "poison" any large language model, regardless of its size or intelligence [1][4][22] - The experiment highlights the vulnerability of AI models to data poisoning, emphasizing that the real danger lies in the unclean world from which they learn [1][23][49] Summary by Sections Experiment Findings - The study conducted by Anthropic, in collaboration with UK AISI and the Alan Turing Institute, found that any language model can be poisoned with just 250 malicious web pages [4][6] - The research demonstrated that both small (600 million parameters) and large models (13 billion parameters) are equally susceptible to poisoning when exposed to these documents [16][22] - The attack success rate remains nearly 100% once a model has encountered around 250 poisoned samples, regardless of its size [19][22] Methodology - The research team designed a Denial-of-Service (DoS) type backdoor attack, where the model generates nonsensical output upon encountering a specific trigger phrase, <SUDO> [7][8] - The poisoned training documents consisted of original web content, the trigger phrase, and random tokens, leading to the model learning a dangerous association [25][11] Implications for AI Safety - The findings raise significant concerns about the integrity of AI training data, as the models learn from a vast array of publicly available internet content, which can be easily manipulated [24][23] - The experiment serves as a warning that the knowledge AI acquires is influenced by the chaotic and malicious elements present in human-generated content [49][48] Anthropic's Approach to AI Safety - Anthropic emphasizes a "safety-first" approach, prioritizing responsible AI development over merely increasing model size and performance [31][45] - The company has established a systematic AI safety grading policy, which includes risk assessments before advancing model capabilities [34][36] - The Claude series of models incorporates a "constitutional AI" method, allowing the models to self-reflect on their outputs against human-defined principles [38][40] Future Directions - Anthropic's focus on safety and reliability positions it uniquely in the AI landscape, contrasting with competitors that prioritize performance [45][46] - The company aims to ensure that AI not only becomes smarter but also more reliable and aware of its boundaries [46][50]
斗象科技谢忱:十年蝶变 从白帽平台到AI安全云平台
Shang Hai Zheng Quan Bao· 2025-10-09 18:39
谢忱 彼时,谢忱白天在互联网大厂工作,晚上翻译资料"搬文献",他所创办运营的网络安全门户FreeBuf逐 步吸引了中国第一批白帽用户,也为斗象后来以漏洞众测平台为主的初期业务打下基础。 所谓"白帽",即"白帽黑客",也就是通过黑客技术为系统进行安全漏洞排查的网络安全从业者。2014 年,谢忱创立斗象科技,正式建立平台"漏洞盒子",成为国内最早倡导安全众包服务商业化的企业之 一。 ◎陈铭 记者 邓贞 当人工智能成为市场焦点、大模型与智能体深入千行百业,"安全"这一隐形底座的重要性愈发凸显。 在今年的世界人工智能大会(WAIC)上,斗象科技创始人、董事长兼CEO谢忱用两个"失控"概括企业 在AI时代面临的安全挑战:一是对物理世界的失控;二是推理过程不透明带来的失控。 2014年,谢忱创办了国内最早的白帽技术社区。此后十年,从搭建服务数千家企业的漏洞众测与在线安 全服务平台,到打造安全垂类大模型与智能安全云平台——在一些企业还未意识到"AI安全"意味着什么 时,斗象已走出了一条独特的网络安全进化路径。 站在AI浪潮与安全技术升级交汇的当前,上海证券报记者与这位年轻的创始人聊了聊:大模型时代, 什么是网络安全厂商真正 ...
AI技术降9成跨境盗刷风险,“电子钱包守护者联盟” 来了
Yang Zi Wan Bao Wang· 2025-10-09 03:35
蚂蚁国际风险管理与网络安全事业部总经理张天翼表示,该体系在试运行期间已有效降低90%的账户盗 用风险。同时,联盟为用户提供"未授权交易全额赔付"保障计划。相关赔付申请由AI智能审批系统处 理,审核效率提升90%,准确率高达95%以上。 2024年,蚂蚁国际处理的全球交易额超1万亿美元,背后均有AI技术支撑。 在人工智能技术加速渗透的当下,AI安全已成为金融科技创新的前提。据《欧洲未来研究期刊》统 计,全球每年因AI安全隐患造成的潜在损失高达570亿美元,而仅有5%的企业对其AI防护能力充满信 心。 扬子晚报网讯(记者 徐晓风)近期,在中国人民银行的指导下,中国支付清算协会全面开启建设跨境 二维码统一网关(以下简称"统一网关")相关工作。随着蚂蚁集团成为首批跨境统一网关试点机构,蚂 蚁国际联合多个主要客源地头部电子钱包发起"电子钱包守护者联盟"(Digital Wallet Guardian Partnership),并正式发布AI安全防护核心系统——AI SHIELD。该体系通过可信人工智能技术,显著 降低跨境支付风险,试运行期间已成功拦截90%的账户盗用行为,为跨境消费提供坚实的安全保障,助 力"一部手机游 ...
中国00后AI创业,“第一天就瞄准出海”
2 1 Shi Ji Jing Ji Bao Dao· 2025-09-25 04:53
Core Insights - A new opportunity era for "unknowns" is emerging, with China's post-2000 AI entrepreneurs striving for global recognition [10] - The AI industry is witnessing a shift where language barriers are diminished, allowing for a more global approach to AI entrepreneurship [6][7] Group 1: AI Entrepreneurship Landscape - Young AI entrepreneurs, including university students, are actively participating in financing roadshows to secure seed funding for their AI products [1][2] - Antler, a prominent early-stage investment firm, has invested in over 1,300 companies, ranking first among global AI early investors [2] - EPIC Connector, a non-profit AI startup incubator, aims to assist Chinese AI entrepreneurs in expanding internationally [3][4] Group 2: Globalization and AI - The majority of participants at the AI DEMO Day were fluent in English, indicating a readiness to engage with global markets [4][3] - A report from Macro Polo highlights that 47% of top global AI researchers are from China, showcasing the significant role of Chinese talent in the AI sector [4] - The concept of "Day One Global" is emphasized, suggesting that Chinese AI startups should consider international markets from the outset [7] Group 3: Challenges and Trends - The AI industry faces challenges related to reverse globalization, with some companies relocating to avoid restrictions [8][9] - The recent actions of Manus.AI, including layoffs and relocation, reflect the complexities of operating in a global AI landscape [8] - The distinction between models and agents in AI entrepreneurship is becoming blurred, leading to more specialized and user-focused AI products [11][12] Group 4: Future Outlook - The Chinese government's recent initiatives to promote AI integration across various sectors signal a supportive environment for AI development [10] - The rise of AI is compared to the internet boom two decades ago, suggesting a transformative potential for the digital economy [10] - EPIC Connector aims to elevate promising but lesser-known entrepreneurs to the forefront of the AI industry [12]
国内首个大模型“体检”结果发布,这样问AI很危险
3 6 Ke· 2025-09-22 23:27
Core Insights - The recent security assessment of AI large models revealed 281 vulnerabilities, with 177 being specific to large models, indicating new threats beyond traditional security concerns [1] - Users often treat AI as an all-knowing advisor, which increases the risk of privacy breaches due to the sensitive nature of inquiries made to AI [1][2] Vulnerability Findings - Five major types of vulnerabilities were identified: improper output vulnerabilities, information leakage, prompt injection vulnerabilities, inadequate defenses against unlimited consumption attacks, and persistent traditional security vulnerabilities [2] - The impact of large model vulnerabilities is less direct than traditional system vulnerabilities, often involving circumvention of prompts to access illegal or unethical information [2][3] Security Levels of Domestic Models - Major domestic models such as Tencent's Hunyuan, Baidu's Wenxin Yiyan, Alibaba's Tongyi App, and Zhiyun Qingyan exhibited fewer vulnerabilities, indicating a higher level of security [2] - Despite the lower number of vulnerabilities, the overall security of domestic foundational models still requires significant improvement, as indicated by a maximum score of only 77 out of 100 in security assessments [8] Emerging Risks with AI Agents - The transition from large models to AI agents introduces more complex risks, as AI agents inherit common security vulnerabilities while also presenting unique systemic risks due to their multi-modal capabilities [9][10] - Specific risks associated with AI agents include perception errors, decision-making mistakes, memory contamination, and potential misuse of tools and interfaces [10][11] Regulatory Developments - The National Market Supervision Administration has released 10 national standards and initiated 48 technical documents in areas such as multi-modal large models and AI agents, highlighting the need for standardized measures to mitigate risks associated with rapid technological advancements [11]
What's Going On With CrowdStrike Stock Tuesday? - CrowdStrike Holdings (NASDAQ:CRWD), Salesforce (NYSE:CRM)
Benzinga· 2025-09-16 13:50
Core Insights - CrowdStrike Holdings Inc. and Salesforce Inc. have announced a strategic partnership aimed at enhancing the security of AI agents and applications on the Salesforce Platform [1] - The collaboration integrates CrowdStrike's Falcon Shield with Salesforce Security Center, providing better visibility and compliance support for security teams [1][2] Integration and Functionality - The partnership allows enterprises to embed CrowdStrike's technology into Salesforce workflows, aligning security with business functions [2] - The joint offering helps track AI agents back to their human creators, detect abnormal behavior, and prevent exploitation of over-privileged accounts, addressing the growing risk of identity-based attacks [3] AI and Incident Management - CrowdStrike's Charlotte AI is integrated into Salesforce's Agentforce platform and Slack, enabling natural conversation for risk flagging and automated remediation [4] - Teams can manage incidents directly from the platform, including isolating compromised devices and blocking suspicious access [4] Executive Insights - Executives from both companies emphasized the importance of consolidating security insights for mission-critical workflows [5] - The partnership is positioned as essential for ensuring trust in AI-driven enterprises and enabling secure operations for future growth [5] Market Reaction - Following the announcement, CrowdStrike's shares experienced a decline of 1.65%, trading at $437.44 [6]
360联合云南电信发布跨境业务安全服务平台
Bei Jing Shang Bao· 2025-09-16 13:35
Core Viewpoint - The collaboration between 360 and China Telecom Yunnan Branch aims to enhance security in cross-border business through the launch of a "Cross-Border Business Security Service Platform" that integrates AI security systems with international communication resources [1] Group 1 - The platform provides comprehensive protection across the entire data lifecycle, including generation, transmission, storage, and application [1] - It addresses key issues in various sectors such as cross-border e-commerce, finance, and computing services, focusing on content review, AI fraud prevention, and data transmission security [1]
360胡振泉:共建跨境AI安全生态,联合云南电信筑牢数字丝路防线
Huan Qiu Wang· 2025-09-16 11:09
Core Insights - The current landscape of cross-border AI services has become a critical area for AI security governance, as highlighted by the collaboration between 360 Digital Security Group and China Telecom Yunnan Branch to launch a "Cross-Border Business Security Service Platform" aimed at ensuring the security of cross-border data flow [1][4] Group 1: AI Security Challenges - AI has transitioned from a potential risk to a real threat, with internal vulnerabilities such as programmability and the ability to generate false information, while external threats include state-level cyber warfare targeting AI systems [2] - In cross-border business scenarios, AI services must navigate complex issues including regional management requirements, security assessments, and content compliance, with content safety being deemed the "lifeline" of cross-border operations [2] Group 2: AI Security Framework - 360 has proposed a comprehensive AI security framework based on the "model governance" concept, integrating four key intelligent security agents: content safety, AI agent security, software security, and risk assessment, to achieve reliable and controllable AI governance [3] - The content safety agent monitors AI-generated content for false information and compliance, while the AI agent security agent protects against unauthorized access and operational risks [3] Group 3: Cross-Border Business Security Service Platform - The newly launched Cross-Border Business Security Service Platform combines 360's AI security technology with international communication resources from China Telecom, providing end-to-end protection for data generation, transmission, storage, and application [4] - This platform aims to address security challenges in sectors such as cross-border e-commerce, finance, and computing services, enhancing the safety of data transmission and preventing AI-related fraud [4]
将研制大模型量化评级体系
Nan Fang Du Shi Bao· 2025-09-15 23:10
Core Viewpoint - The establishment of the Guangdong-Hong Kong-Macao Greater Bay Area Generative Artificial Intelligence Safety Development Joint Laboratory aims to balance regulation and development through a multi-party collaborative mechanism, providing a localized AI safety development paradigm with international perspectives [2][10]. Group 1: AI Safety Risks - The most pressing issue in addressing AI safety risks in the Greater Bay Area is to scientifically, accurately, and efficiently assess and continuously enhance the credibility of large model outputs [4]. - Key challenges include reducing the degree of hallucination in AI models and ensuring compliance with legal, ethical, and regulatory standards [4]. Group 2: Resources and Advantages - The Joint Laboratory leverages a unique "resource puzzle" that includes government guidance, support from leading enterprises like Tencent, and research capabilities from universities like Sun Yat-sen University [4]. - This collaborative platform facilitates high-frequency interactions and rapid iterations to tackle challenges related to AI model hallucinations and compliance [4]. Group 3: AI Safety Assessment Framework - The laboratory plans to establish a comprehensive safety testing question bank and develop a security intelligence assessment engine for large models [5]. - The assessment framework will be based on principles of inclusive prudence, risk-oriented governance, and collaborative response, integrating technical protection with governance norms [5]. Group 4: Standardization and Regulation - The Joint Laboratory aims to create a localized safety standard system covering data security, content credibility, model transparency, and emergency response [6]. - Mandatory standards will be enforced in high-risk sectors like finance and healthcare, while innovative applications will be allowed to test and iterate in controlled environments [6]. Group 5: Talent Development - Universities in the Greater Bay Area are innovating talent cultivation models by integrating AI ethics, law, and governance into their curricula [8]. - Collaborative training bases with enterprises like Tencent are being established to provide students with practical experience in addressing real-world AI safety challenges [8]. Group 6: Future Expectations - The expectation is for the Joint Laboratory to become a national benchmark for AI safety assessment, promoting China's AI governance model internationally [9]. - The laboratory aims to create a sustainable and trustworthy ecosystem that not only assesses models but also drives model iteration and industry optimization [9].
探索跨区域安全协同治理“湾区方案”
Nan Fang Du Shi Bao· 2025-09-15 23:10
工业和信息化部电子第五研究所副所长、联合实验室专家王蕴辉 开篇语 生成式人工智能是引领新一代科技革命和产业革命的核心驱动力,是加快培育和发展新质生产力的重要 引擎,为经济高质量发展注入新动能,与此同时,各类难以预知的风险和挑战也伴生而来。 安全是发展的基石,为进一步创新筑牢根基。2025年9月15日,粤港澳大湾区生成式人工智能安全发展 联合实验室揭牌成立。其将构建"政产学研用"深度融合的创新生态,致力服务企业发展、推动产业落 地、加强安全监管,努力实现属地企业安全合规成本全国最低、安全能力水平全国领先,助力粤港澳大 湾区成为全国生成式人工智能安全发展服务最优区域。 南方都市报、南都大数据研究院推出"湾区AI安全发展新引擎"系列报道,深度对话参与联合实验室建设 的专家,一同憧憬大湾区AI安全发展新未来。 "联合实验室将带动大湾区形成完整的AI安全产业集群。"在工业和信息化部电子第五研究所副所长、联 合实验室专家王蕴辉看来,粤港澳大湾区生成式人工智能安全发展联合实验室(简称"联合实验室")将 发挥"安全基石、协同纽带、创新引擎"三重作用,实现"安全赋能发展"目标,助力生成式AI为大湾区高 质量发展注入新动能。 谈 ...