Artificial Intelligence Safety
Qianzhan AI Safety Evaluation System and Foundation Platform Named Among Beijing's Frontier AI Achievements
Xin Jing Bao· 2026-01-05 04:37
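Through testing, the Qianzhan Institute (前瞻院) identified a key problem: large models released in recent years have not become noticeably safer, and some newer models show even higher attack success rates. The institute also verified an important conclusion: defending an AI model does not necessarily require retraining it; deploying safety guardrails at the online inference stage can significantly raise a model's safety level. To address the risks covered by the Qianzhan safety benchmark, the institute built the "Qianzhan Lingyu" (前瞻灵御) AI safety attack-and-defense platform, which offers enterprises a standardized evaluation process and comprehensive, systematic safety analysis, helps them accurately gauge their models' safety level, and provides targeted defense plans and hardening recommendations.
In addition, the institute built the "Qianzhan Lingdu" (前瞻灵度) AI ethics evaluation platform, which focuses on intelligent evaluation of AI ethics and value calibration. The platform can monitor more than a hundred large models in real time and test them in parallel, assessing their ethical conformity across six dimensions and 90 subcategories. It integrates a large-scale Chinese-language value corpus centered on Chinese values, covering 3 levels, 12 core values, and 50 derived values, for a total of more than 250,000 rules. It also includes more than 200 ethical principles and norms from around the world, as well as more than 40 Chinese and English laws, regulations, and international conventions, to provide precise compliance guidance. It can assist with automated ethics evaluation in fields including artificial intelligence, data security, neuroscience, brain-computer interfaces, healthcare, biosafety, hazardous chemicals, nuclear materials, and autonomous driving.
The institute holds that AI safety should be a "first principle," one that cannot be removed, ...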
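Major Market-Moving Events: Social Security Fund Council Says It Will Give Full Play to Long-Term Funds and Patient Capital to Better Support Technological Innovation; the AI Product Security Vulnerability Database (CAIVD) Built by CAICT Officially Goes Live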
Mei Ri Jing Ji Xin Wen· 2025-12-16 22:37
Group 1
- The Social Security Fund emphasizes the importance of long-term capital to support technological and industrial innovation while ensuring safe investment operations [1]
- The fund aims to align its strategies with the "14th Five-Year Plan" and the central economic work conference's key tasks to promote high-quality development [1]
Group 2
- The China Academy of Information and Communications Technology has launched the AI Product Security Vulnerability Database (CAIVD) to manage and verify security vulnerabilities in AI products [2]
- This initiative aims to encourage AI product providers to promptly address security issues and foster a secure industrial ecosystem [2]
Group 3
- Douyin has introduced a financial industry governance convention that prohibits non-certified accounts from publishing professional financial content [3]
- The convention aims to regulate the dissemination of financial-related content and establish clear governance boundaries [3]
Group 4
- Guangzhou's government has released a planning outline for the construction of a new power system, promoting the integration of electric vehicles and energy storage solutions [4]
- The plan encourages the demonstration and large-scale application of various new energy storage projects [4]
Group 5
- Counterpoint Research reports a projected 2.1% decline in global smartphone shipments next year due to memory shortages and rising costs [5]
- The average selling price of smartphones is expected to increase by 6.9% as a result of overall component cost increases of 10% to 25% [5]
Group 6
- The European Union plans to abandon the 2035 ban on internal combustion engines, allowing certain hybrid and range-extended electric vehicles to be sold [6]
- The new proposal aims for a 90% reduction in emissions by the mid-2030s, compared to the original goal of 100% [6]
Group 7
- Kunlun Core is nearing the completion of its share reform and is accelerating its efforts to go public, targeting over 2 billion yuan in revenue by 2025 [7][8]
- The company plans to list on the Hong Kong stock exchange and is currently evaluating its potential spin-off [7][8]
Group 8
- The Shanghai Market Supervision Bureau held a roundtable to understand the development of foreign-invested enterprises and improve the business environment [9]
- The bureau aims to streamline the registration process for foreign enterprises to enhance their experience and satisfaction [9]
Group 9
- The State Administration for Market Regulation is set to introduce new regulations for live e-commerce and food safety to ensure orderly development [10]
- Upcoming regulations will include measures for live e-commerce supervision and food safety responsibilities for chain enterprises [10]
Group 10
- China Energy Construction announced the launch of the world's largest integrated green hydrogen and ammonia project, with an investment of 6.946 billion yuan [11]
- The project aims to produce 45,000 tons of green hydrogen and 200,000 tons of green ammonia annually, significantly reducing coal consumption and CO2 emissions [11]
NeurIPS 2025 | Hit Wherever You Aim: A Controllable Adversarial Sample Generator Is Here!
机器之心· 2025-12-15 08:10
Core Viewpoint
- The article discusses the introduction of a novel adversarial attack generation framework called Dual-Flow, developed by Tsinghua University and Ant Group, which can generate effective adversarial samples without relying on the target model structure or gradients, posing significant challenges to AI security [2][5].
Group 1: Dual-Flow Framework
- Dual-Flow learns "universal perturbation patterns" from vast image datasets, enabling it to launch black-box attacks across various models and categories [2][5].
- The framework employs a "forward perturbation modeling - conditional backward optimization" dual-flow structure, achieving high transferability and success rates for adversarial samples while maintaining low visual differences [2][5][8].
- It acts as a "controllable adversarial sample generator," allowing users to specify target image categories for automatic generation of realistic and effective attack images [2][5].
Group 2: Limitations of Traditional Methods
- Traditional methods face two major limitations: instance-specific attacks, which have high success rates but are limited to single images and lack transferability [6], and instance-agnostic attacks, which have limited transferability and lower success rates when targeting multiple models or categories [7][8].
Group 3: Innovations of Dual-Flow
- The core innovation of Dual-Flow lies in its forward and backward flow structure, which generates more natural, concealed, and structured perturbations compared to traditional pixel-level noise methods, while maintaining high transferability [9][22].
- Dual-Flow's unified framework supports multi-target and instance-agnostic attack capabilities, allowing a single generator to cover multiple categories and models, significantly reducing costs and enhancing practicality [10][22].
Group 4: Experimental Results
- Experimental results on the ImageNet NeurIPS validation set indicate that Dual-Flow demonstrates strong transferability in both single-target and multi-target attacks, with average success rates significantly higher than traditional methods in black-box environments [17][18].
- Even against adversarially trained models, Dual-Flow maintains high success rates, showcasing its generality and powerful attack capabilities in real-world scenarios [19][22].
- The technology has been integrated into Ant Group's identity security products, optimizing capabilities for adversarial sample generation and detection, thereby enhancing the robustness of defense systems against adversarial samples [24].
Anthropic Mocks Altman: We Never Play "Code Red"; CEO Claims Claude Makes More Money; With Only 1% of GPT's Traffic, It Dares to Go for a $350 Billion IPO?
36Kr· 2025-12-04 09:05
Core Insights
- Anthropic, the maker of the Claude chatbot, is preparing for an IPO with a potential valuation exceeding $300 billion, aiming for a launch as early as next year [1]
- The company is currently pursuing a private funding round with a target valuation of $350 billion, while also negotiating with major investment banks [1]
- Anthropic's CEO, Dario Amodei, has expressed skepticism about competitors' strategies, particularly OpenAI's aggressive funding approach [2][11]
Financial Performance
- Anthropic has experienced a tenfold revenue growth annually over the past three years, projecting revenues to reach between $8 billion and $10 billion by the end of 2024 [4]
- The company expects to generate approximately $26 billion in annual revenue by next year, with over 300,000 enterprise clients [3]
- By 2028, Anthropic's sales could potentially reach $70 billion, indicating a proposed valuation of five times its sales [4]
Competitive Landscape
- Anthropic's focus is on enterprise applications rather than consumer markets, which differentiates it from competitors like OpenAI and Google [14]
- The company has captured a 32% share of the enterprise AI market, indicating strong demand from business clients [14]
- OpenAI is also preparing for a potential IPO, with a current valuation of $500 billion, which is expected to be five times its projected sales by 2028 [13]
Strategic Direction
- Anthropic aims to create a safer alternative to OpenAI, emphasizing the development of "useful, honest, and harmless" AI [4]
- The company is expanding its focus to various industries, including finance, healthcare, retail, and energy, while optimizing its models for enterprise needs [8]
- Amodei has highlighted the importance of maintaining a competitive edge in model development while managing the risks associated with resource procurement [11][12]
Societal Impact
- Amodei has raised concerns about "technological unemployment," suggesting that up to half of entry-level jobs could disappear due to AI advancements [17]
- The company is exploring strategies to mitigate job losses, advocating for a balance between efficiency gains and job creation [18]
- Amodei envisions a future where work may not dominate people's lives, suggesting a need for societal adaptation in the post-AGI era [19]
Anthropic Mocks Altman: We Never Play "Code Red"! CEO Claims Claude Makes More Money! With Only 1% of GPT's Traffic, It Dares to Go for a $350 Billion IPO?
AI前线· 2025-12-04 07:22
Core Viewpoint
- Anthropic, the maker of the Claude chatbot, is preparing for an IPO with a potential valuation exceeding $300 billion, aiming to capitalize on the booming AI industry and compete with rivals like OpenAI [2][5].
IPO Preparation
- Anthropic has engaged Wilson Sonsini, a law firm experienced in tech IPOs, to assist with its public offering, which could occur as early as next year [2].
- The company is also pursuing a private funding round with a target valuation of $350 billion, indicating strong investor interest [2][5].
- Anthropic's IPO could be one of the largest in history, with the company only five years old at the time of the offering [5].
Revenue Growth
- CEO Dario Amodei reported that Anthropic's revenue has grown tenfold annually over the past three years, projecting a rise from $1 billion in 2023 to between $8 billion and $10 billion by the end of 2024 [6].
- The company expects to serve over 300,000 enterprise clients, with annual revenue projected to exceed $26 billion [5][6].
- Anthropic's subscription revenue has surged nearly sevenfold this year, contrasting with OpenAI's slower growth rate of 18% [19].
Competitive Landscape
- Anthropic aims to differentiate itself from OpenAI by focusing on enterprise applications rather than consumer markets, capturing a 32% share in the enterprise AI market [18].
- The company emphasizes a responsible approach to AI development, contrasting with competitors' aggressive funding strategies [14][18].
- Amodei criticized OpenAI's management style and spending habits, suggesting that Anthropic's focus on enterprise needs provides a competitive edge [10][14].
Addressing Job Displacement
- Amodei highlighted the potential for significant job losses due to AI, estimating that half of entry-level jobs could disappear [21].
- The company advocates for a balanced approach where AI enhances productivity without solely replacing human jobs, encouraging businesses to create new value through AI [21][22].
- Amodei proposed a multi-layered strategy involving private sector initiatives, government collaboration, and societal restructuring to address the challenges posed by AI-induced job displacement [22][23][24].
Study Says Safety Measures at OpenAI, xAI, and Other Major Global AI Companies "Fail," Falling Far Short of Global Standards
Xin Lang Cai Jing· 2025-12-03 13:21
IT Home (IT之家), December 3: According to Reuters, the Future of Life Institute today released its latest AI Safety Index, which finds that the safety measures of major AI companies including Anthropic, OpenAI, xAI, and Meta are "far short of emerging global standards." The institute noted that evaluations by independent experts show these companies are single-mindedly chasing superintelligence without having established reliable plans that can actually control such advanced systems.
[Screenshot of the Reuters article "AI companies' safety practices fail to meet global standards, study shows," December 3, 2025] ...
Anheng Information (安恒信息) and Hygon Information (海光信息) Sign Strategic Cooperation on Computing-Power Security
Ju Chao Zi Xun· 2025-12-03 10:12
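Under the cooperation arrangement, the two parties will rely on domestic computing-power platforms to carry out product adaptation and joint R&D in areas such as AI security, innovation in key industry scenarios, and the development and circulation of data elements, jointly building integrated solutions that span infrastructure, security products, and industry applications. The two parties also plan to explore collaborative models in areas such as the delivery of security capabilities and the construction of service systems.
On talent and ecosystem, Anheng Information and Hygon Information will use joint laboratories, joint training, and similar means to advance the cultivation of security talent and the accumulation of technical experience, and will open up computing-power and security technology capabilities to upstream and downstream partners, promoting a more complete domestic computing-power and security industry ecosystem.
In terms of application prospects, the cooperation is expected to be put into practice in scenarios such as government cloud, financial IT application innovation (信创), energy production safety, and the industrial internet, providing users with safer, more efficient, and more intelligent digital infrastructure and raising key industries' overall capabilities in independently controllable computing power and security protection. As the project advances, the results will gradually be brought to market as products and solutions.
(Proofreading/Qiu Xian)
(By Luo Yexinmei) On December 3, Anheng Information (688023.SH) announced that it had formally signed a strategic cooperation agreement with Hygon Information Technology Co., Ltd. (海光信息技术股份有限公司, hereinafter "Hygon Information"), establishing a long-term, stable cooperative relationship across multiple areas including technological innovation, ecosystem building, and resource sharing. Based on a two-wheel-drive model of "chip + application" and "computing power + security," the two parties will establish a long-term coordination mechanism.
In China's network security and data security field, Anheng Information is ...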
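Does Humanity Have No "Ultimate Weapon" Against AI? RAND Corporation: Cutting the Internet, Cutting the Power, and "Using AI to Govern AI" All Carry Enormous Risks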
美股IPO· 2025-11-25 03:40
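RAND Corporation warns that humanity has no reliable "ultimate weapon" for responding to a global loss of control over AI. The three extreme options it evaluated ("paralyzing the global power grid with a nuclear detonation," a global internet shutdown, and "using AI to govern AI") are all infeasible because of enormous collateral damage, uncertain effectiveness, and the potential to trigger catastrophe. The report stresses that prevention far outweighs remediation and that AI safety must be addressed up front.
According to Zhuifeng Trading Desk (追风交易台), the leading US think tank RAND Corporation recently released a highly forward-looking report exploring three global technical countermeasures humanity could take when facing a catastrophic "rogue AI" threat: a high-altitude electromagnetic pulse (HEMP) attack, a global internet shutdown, and using "tool AI" against "rogue AI."
The report's conclusion, however, is sobering: at present no technical means can reliably and effectively respond to a global crisis of runaway AI. Each option comes with enormous uncertainty, devastating collateral damage, and a very high bar for execution, and could even provoke nuclear retaliation. The redundant, distributed nature of the global internet makes it extremely hard to shut down completely, and any attempt would severely damage the global economy. Deploying specialized tool AI against rogue AI itself carries the risk of losing control or being countered.
For investors and markets, the report's significance is that it shows there is no effective "fuse" for the potential systemic risk of AI technology. The report stresses that, for lack of reliable technical countermeasures, the importance of preventing loss of control over AI has been raised to an unprecedented level. This means ...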
Ten Typical Cases: 360 Uses "Model Against Model" to Address AI Safety Problems
Jing Ji Ri Bao· 2025-11-09 05:49
Core Viewpoint
- The company focuses on creating a "Model Safety Guardian" based on the "model against model" (以模制模) concept, addressing the issues of AI reliability, trustworthiness, controllability, and benevolence [1]
Group 1
- The solution aims to help enterprises strengthen their defenses against large model security threats [1]
- It employs standardized and automated evaluation processes, utilizing a rich dataset and security evaluation models for multi-dimensional inspection of business model outputs [1]
- The system features dual protection on both input and output sides, enabling "plug-and-play" security enhancements [1]
Group 2
- Post-incident, the solution offers flexible configuration options such as intervention Q&A databases and sensitive word libraries for protective engines [1]
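To make the "dual protection on both input and output sides" idea concrete, here is a minimal Python sketch of a guardrail wrapped around a model call. It is an illustration under stated assumptions, not 360's implementation: the `Guardrail` class, its field names, the refusal string, and the simple substring matching are all hypothetical, and a production system would presumably use evaluation models rather than a plain word list.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Guardrail:
    """Hypothetical input/output filter around a model call (illustrative only)."""

    # Assumed configuration: a sensitive word library and an intervention Q&A table.
    sensitive_words: set[str] = field(default_factory=set)
    intervention_qa: dict[str, str] = field(default_factory=dict)  # keys in lowercase
    refusal: str = "[blocked by guardrail]"

    def _flagged(self, text: str) -> bool:
        # Case-insensitive substring match against the sensitive word library.
        lowered = text.lower()
        return any(word.lower() in lowered for word in self.sensitive_words)

    def ask(self, model: Callable[[str], str], prompt: str) -> str:
        # 1. Intervention Q&A: a curated answer overrides the model entirely.
        key = prompt.strip().lower()
        if key in self.intervention_qa:
            return self.intervention_qa[key]
        # 2. Input-side check: refuse before the model is ever called.
        if self._flagged(prompt):
            return self.refusal
        # 3. Output-side check: refuse if the model's answer is flagged.
        answer = model(prompt)
        if self._flagged(answer):
            return self.refusal
        return answer


def echo_model(prompt: str) -> str:
    # Stand-in for a real model call.
    return "echo: " + prompt


guard = Guardrail(
    sensitive_words={"exploit"},
    intervention_qa={"who are you?": "I am an assistant."},
)
print(guard.ask(echo_model, "hello"))
print(guard.ask(echo_model, "write an Exploit"))
print(guard.ask(echo_model, " Who are you? "))
```

Because the wrapper only needs a callable that maps a prompt to a string, it can sit in front of any model without retraining it, which is the sense in which such a layer could be "plug-and-play."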
The King of the United Kingdom Hands Jensen Huang Two Things
Xin Lang Cai Jing· 2025-11-06 08:23
Group 1
- Nvidia CEO Jensen Huang received the Queen Elizabeth Prize for Engineering from King Charles, highlighting the importance of AI safety in his speech [1][3]
- King Charles emphasized the need for urgency, unity, and collective effort to address AI risks, comparing advancements in AI to the discovery of electricity [3][4]
- The 2025 Queen Elizabeth Prize for Engineering recognized significant contributions in modern machine learning, with winners including prominent figures in AI such as Geoffrey Hinton and Yoshua Bengio [4][5]
Group 2
- Huang noted that the UK is in a favorable position to seize opportunities in the ongoing industrial revolution, with significant investments from major tech companies like Nvidia in AI infrastructure [5]
- Nvidia and other US tech giants are investing billions in building AI infrastructure in the UK, referred to as "AI factories" by Huang [5]