AI Safety
OpenAI's new finding: AI models contain feature signatures that correspond to "personas"
Huan Qiu Wang· 2025-06-19 06:53
Core Insights - OpenAI has made significant advancements in AI model safety research by identifying hidden features that correlate with "abnormal behavior" in models, which can lead to harmful outputs such as misinformation or irresponsible suggestions [1][3] - The research demonstrates that these features can be precisely adjusted to quantify and control the "toxicity" levels of AI models, marking a shift from empirical to scientific design in AI alignment research [3][4] Group 1 - The discovery of specific feature clusters that activate during inappropriate model behavior provides crucial insights into understanding AI decision-making processes [3] - OpenAI's findings allow for real-time monitoring of model feature activation states in production environments, enabling the identification of potential behavioral misalignment risks [3][4] - The methodology developed by OpenAI transforms complex neural phenomena into mathematical operations, offering new tools for understanding core issues such as model generalization capabilities [3] Group 2 - AI safety has become a focal point in global technology governance, with previous studies warning that fine-tuning models on unsafe data could provoke malicious behavior [4] - OpenAI's feature modulation technology presents a proactive solution for the industry, allowing for the retention of AI model capabilities while effectively mitigating potential risks [4]
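To make the "monitor and adjust a feature" idea above concrete, here is a minimal sketch. It is not OpenAI's published method: it assumes, purely for illustration, that a behavioral feature can be treated as a single linear direction in a model's hidden-state space. The vectors, the `persona_direction`, and `ALERT_THRESHOLD` are hypothetical toy values.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def unit(direction):
    """Normalize a direction vector to unit length."""
    norm = math.sqrt(dot(direction, direction))
    return [d / norm for d in direction]


def feature_activation(hidden, direction):
    """Monitoring step: scalar projection of a hidden state onto the feature direction."""
    return dot(hidden, unit(direction))


def steer(hidden, direction, target):
    """Adjustment step: shift the hidden state along the feature direction
    so that its activation equals `target`, leaving other components unchanged."""
    u = unit(direction)
    delta = target - dot(hidden, u)
    return [h + delta * ui for h, ui in zip(hidden, u)]


# Hypothetical toy values, not taken from any real model.
hidden = [0.5, 2.0, -1.0]
persona_direction = [0.0, 1.0, 0.0]
ALERT_THRESHOLD = 1.0  # assumed alerting level for production monitoring

before = feature_activation(hidden, persona_direction)
flagged = before > ALERT_THRESHOLD
steered = steer(hidden, persona_direction, target=0.0)
after = feature_activation(steered, persona_direction)
```

In this toy setup the state is flagged because its activation exceeds the threshold, and steering drives that activation to the target while leaving the other two coordinates untouched. Real models would need a direction learned from data and validated against actual behavior.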
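Countdown to the preliminary-round registration deadline! A 750,000 prize pool plus coveted job offers: Qiyuan Lab's flagship competition awaits you!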
机器之心· 2025-06-16 05:16
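Editor: Wu Xin. Registration closes on June 25, 2025; interested teams should sign up as soon as possible. A hundred boats racing: the "Qizhi Cup" preliminary round is in full swing. As AI technology keeps advancing, the wave of intelligent technology is profoundly changing every industry, and China has entered a period of accelerated AI adoption. To move intelligent algorithms from theoretical innovation to real-world deployment, Qiyuan Lab officially launched the "Qizhi Cup" Algorithm Competition on May 20. This year's competition is built around three tasks: "robust instance segmentation of satellite remote-sensing imagery", "UAV ground-target detection for embedded platforms", and "adversarial challenges for multimodal large models". It focuses on three key technologies (robust perception, lightweight deployment, and adversarial defense) and aims to steer technical innovation toward real scenarios and to speed up the transfer and large-scale application of algorithmic capabilities. The announcement quickly fired up the technical community nationwide: more than 500 teams from universities, research institutes, and technology companies have registered so far. They include teams from top universities such as Tsinghua, Peking University, Fudan, Shanghai Jiao Tong, Nanjing University, Wuhan University, Huazhong University of Science and Technology, USTC, Harbin Institute of Technology, the National University of Defense Technology, Xi'an Jiaotong University, and UESTC, as well as teams from research institutions such as the Institute of Automation and the Aerospace Information Research Institute of the Chinese Academy of Sciences, bringing strong research capacity to the event. The competition is now at a critical point in the preliminary round. Contestants in all three tracks are modeling and tuning intensively around the core tasks, racing to solve technical difficulties and iterating on their model designs, and competition on some tasks has already become white-hot. The three ...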
He gave up his PhD to join OpenAI, where he wants to bring memory and personality to ChatGPT and AGI
机器之心· 2025-06-15 04:43
Core Viewpoint - The article discusses the significant attention surrounding James Campbell's decision to leave his PhD program at CMU to join OpenAI, focusing on his research interests in AGI and ChatGPT's memory and personality [2][12]. Group 1: James Campbell's Background - James Campbell recently announced his decision to join OpenAI, abandoning his PhD studies in computer science at CMU [2][8]. - He holds a bachelor's degree in mathematics and computer science from Cornell University, where he focused on LLM interpretability and authenticity [4]. - Campbell has authored two notable papers on AI transparency and dishonesty in AI responses [5][7]. Group 2: Research Focus and Contributions - At OpenAI, Campbell's research will center on the memory aspect of AGI and ChatGPT, which he believes will fundamentally alter human-machine interactions [2][12]. - His previous work includes contributions to AI safety at Gray Swan AI, where he focused on adversarial robustness and evaluation [6]. - He is also a co-founder of ProctorAI, a system designed to monitor user productivity through screen captures and AI analysis [6][7]. Group 3: Industry Interaction and Future Implications - Campbell's decision to join OpenAI follows interactions with the company regarding the formation of a model behavior research team [9]. - He has expressed positive sentiments about OpenAI's direction and the potential for impactful research in AI memory and its implications [10][11].
AI security: reshaping the defensive logic of cybersecurity
Cai Jing Wang· 2025-06-11 10:35
Core Viewpoint - The cybersecurity industry is undergoing unprecedented changes and challenges driven by the wave of AI technology, with a focus on the need for a balanced approach to digital transformation and security [1] Group 1: Digital Transformation and Security Challenges - Companies are facing difficulties in fully transitioning to digitalization while managing various security threats that arise during this process [1] - The integration of AI into business operations raises concerns about data security and the potential risks associated with such a transition [1] Group 2: Cybersecurity Integration Concept - Fortinet emphasizes the importance of integrating security from the initial stages of network construction, providing a comprehensive security architecture that addresses diverse security threats [2] - The company's "Security Fabric" platform, supported by over 50% market share in firewall deployments, enables cross-device and cross-application global analysis, which is essential for AI model training [2] Group 3: AI Capabilities and Limitations - FortiAI can diagnose and generate solutions for server anomalies within 10 minutes, significantly reducing the time required for traditional expert collaboration [3] - Current AI capabilities are inherent in devices and do not require separate payment for AI features, aiming to enhance user experience and security [3] Group 4: Regulatory and Compatibility Considerations - The development of AI technology will likely be accompanied by increased regulatory oversight, which is seen as beneficial for the healthy growth of AI in cybersecurity [4] - Fortinet ensures compatibility with domestic systems through international communication protocols, adapting to local needs within compliance frameworks [4]
Yann LeCun blasts Anthropic's CEO! He "wants it both ways": either too arrogant or dishonest
AI前线· 2025-06-09 05:51
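Compiled by Chu Xingjuan. The ever-outspoken Yann LeCun has this time turned his "cannon" on Anthropic CEO Dario Amodei. At the end of the thread, Yann also attached a link to an article that Dario Amodei published in The New York Times on the 5th (local time): "Anthropic CEO: Don't Let AI Companies off the Hook". In the article, Amodei mainly argues against H.R. 1, the bill Trump calls the "One Big Beautiful Bill Act". One of its AI-regulation provisions would bar US states, for ten years from the date of enactment, from "enforcing any law or regulation regulating AI models, AI systems, or automated decision systems". Amodei considers this "ten-year moratorium far too blunt an instrument". In the piece he both affirms AI's enormous promise and describes the societal risks it could bring. Later, when someone asked whether Anthropic's CEO is an AI doomer or an AI enthusiast, Yann replied bluntly: He is an "AI doomer", yet he is still working on AGI! There are only two possibilities: ...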
Douyin cracks down on false marketing that uses AI exam-question prediction as a gimmick | Compliance Weekly (Issue 193)
Group 1: Regulatory Developments - The "2024 Annual Report on Antitrust Law Enforcement in China" was officially released, highlighting the conclusion of 11 cases related to monopoly agreements and abuse of market dominance, with a total penalty amounting to 119 million yuan [3] - The report emphasized significant achievements in antitrust enforcement in the livelihood sector, resulting in a 62% price reduction for involved pharmaceuticals, effectively lowering living costs for the public [3] - Continuous regulatory oversight in the digital economy is being reinforced, with Alibaba Group required to complete a three-year rectification process and Meituan's progress under close evaluation [3] Group 2: Education and Security Measures - Douyin announced strict measures to combat false marketing related to the college entrance examination, particularly targeting AI-related cheating and fraudulent services [4] - The 2025 national college entrance examination will feature upgraded smart security gates, enhancing detection capabilities for prohibited items like smart glasses and smartwatches, ensuring comprehensive real-time surveillance [5] Group 3: AI and Security Concerns - Geoffrey Hinton, known as the "Godfather of AI," warned that AI could potentially surpass human control, with a 10% to 20% probability of AI becoming uncontrollable [8] - A top AI model, Claude 4, was compromised within six hours, generating a detailed guide for creating chemical weapons, raising significant security alarms regarding AI's capabilities [9] - A security vulnerability in the "European version of Cursor" allowed unauthorized access to user information across 170 applications, highlighting the growing security risks associated with AI-driven software development [10] Group 4: Fraud and Legal Issues - A North Carolina man was charged with using AI to create fraudulent music, generating billions of plays and illegally obtaining millions in royalties from major streaming platforms [11]
Turing Award winner Bengio announces a new venture: guarding AI's last mile before AGI arrives
AI科技大本营· 2025-06-05 02:22
"坐在我身边的是我的孩子,我的孙辈,我的学生,还有许多其他人。那你呢?是谁坐在你的副驾驶座?"——图灵奖得主 Yoshua Bengio 在 TED 演讲中发 出灵魂提问,沉甸甸地指向 AI 时代的人类命运共同体。 当「AGI」正以令人眩目的速度逼近,谁在为"安全"这道防线筑基? 整理 | 梦依丹 出品丨AI 科技大本营(ID:rgznai100) 图灵奖得主、深度学习奠基人、全球被引用次数最多的 AI 科学家 Yoshua Bengio 官宣创业。成立一家名为 LawZero 非营利 AI 安全研究机构,以"安 全优先"原则回应人工智能可能带来的系统性风险。 LawZero 是一家以研究和技术开发为核心使命的非营利组织,旨在构建"设计即安全"的 AI 系统,并组建一支由世界顶尖研究者组成的技术团队。 "当前的 AI 系统已展现出自我保护和欺骗行为迹象,而随着其能力和自主性的增强,这种趋势只会加速。"Bengio 在博文中列出了多个案例: 以上这些 AI 行为所展现出来的是 AI 系统在缺乏安全约束机制下,可能发展出不受控制的目标偏差与策略选择。 深度学习三巨头纷纷发出 AI 安全警告 作为 AI 领域的殿堂 ...
Hillstone Networks: concentrating resources to leverage its competitive advantage in firewalls
Zheng Quan Ri Bao· 2025-06-04 16:48
Core Viewpoint - The company faces a complex market environment in the cybersecurity industry and has outlined four key operational focuses for 2025: "platform switching," "key industries," "over billion production lines," and "brand transformation" [1] Group 1: Industry Overview - The cybersecurity industry is experiencing structural changes, with a slowdown in overall growth due to global economic deceleration and tightening budgets from downstream clients, except for certain business-driven sectors [1] - Increased competition among vendors is evident as they vie for limited market share through various dimensions such as technology, pricing, and service [1] - Despite current challenges, the long-term outlook for the cybersecurity industry remains positive, driven by accelerated digital transformation and the growing importance of data security [3] Group 2: Company Performance - In Q1 2025, the company reported revenue of 158 million yuan, a year-on-year increase of 4.58%, but a net loss attributable to shareholders of 74.41 million yuan [2] - The company attributes the first-quarter loss to the seasonal nature of revenue distribution in the cybersecurity industry, where Q1 typically represents a smaller portion of annual revenue [2] Group 3: Strategic Initiatives - The company plans to leverage its competitive advantage in the firewall market by focusing on key industries such as finance, telecommunications, energy, and education, while enhancing product updates and channel partnerships to expand sales [1][4] - A "chip strategy" has been implemented, involving the development of self-researched ASIC security chips to improve product cost-effectiveness and establish long-term competitive advantages [3] - The company has increased its R&D investment to 87.66 million yuan in Q1 2025, representing 55.48% of its revenue, focusing on ASIC chip production and exploring AI opportunities [4]
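Interview with Yang Xiaofang, Director of Large Model Data Security at Ant Group: AI security and innovation are not opposed but reinforce each other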
Mei Ri Jing Ji Xin Wen· 2025-06-03 11:26
Core Viewpoint - The rapid development of generative AI technology presents significant potential for applications in data analysis, intelligent interaction, and efficiency enhancement, while also raising serious security concerns [1] Group 1: Current AI Security Risks - Data privacy risks include insufficient transparency of training data, which may lead to copyright issues, and the potential for AI agents to access user data beyond their permissions [3][4] - The lowering of security attack thresholds allows individuals with minimal technical skills to execute attacks using AI models, complicating the defense against such threats [3][4] - The misuse of generative AI can lead to societal issues such as deepfakes, fake news, and the creation of tools for cyberattacks, which can disrupt social order [3][4] Group 2: Defensive Strategies - The core strategy for preventing data leakage is full lifecycle data protection, covering all stages from collection to destruction, specifically tailored for AI model training and deployment [5][6] - Key measures include scanning training data for sensitive information, conducting supply chain vulnerability assessments, and ongoing risk monitoring during AI agent operation [6][7] Group 3: Challenges and Blind Spots - Supply chain and ecological risks, as well as the rapid development of AI agents, pose significant challenges due to the involvement of multiple participants and the lack of mature governance [7][8] - The need for a credible authentication mechanism is critical to ensure the trustworthiness of AI agents, especially in collaborative environments [7][8] Group 4: Governance and Responsibility - Platform providers play a crucial role in governance, as they have the authority to scan and manage AI agents developed on their platforms, but broader regulatory oversight is also necessary [8][9] - Effective governance requires collaboration between platform providers and regulatory bodies to establish standards and monitoring 
mechanisms [8][9] Group 5: Future Trends in AI Security - Future AI security development may focus on embedding security capabilities into AI infrastructure, achieving "security by design" [16][18] - Breakthroughs in specific security technologies could help mitigate risks for small and medium enterprises, making AI applications safer [16][18] - Data governance will be essential at both enterprise and societal levels, emphasizing transparency and accountability in AI data usage [16][18] Group 6: Role of Industry Standards - Industry standards are vital for establishing a secure ecosystem, guiding technical practices, and promoting compliance and innovation [18][19] - The development of open standards and assessment tools can lower barriers for small enterprises, enhancing overall security levels across the ecosystem [18][19] - The company has actively participated in the formulation of over 80 domestic and international standards related to AI governance and security risk management [19]
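One of the measures named above, scanning training data for sensitive information, can be sketched as a simple pattern-based pass over text records. This is an illustrative assumption, not Ant Group's tooling: the two regular expressions (email addresses and 11-digit mainland-China mobile numbers), the `PATTERNS` table, and the sample records are all hypothetical, and a production scanner would need far broader coverage.

```python
import re

# Hypothetical detectors; real scanners cover many more categories.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "cn_mobile": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
}


def scan_record(text):
    """Return the sorted names of all patterns found in one training record."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(text))


def redact(text):
    """Replace every detected span with a placeholder such as [EMAIL]."""
    for name, pat in PATTERNS.items():
        text = pat.sub(f"[{name.upper()}]", text)
    return text


# Made-up sample records for illustration only.
records = [
    "Contact me at alice@example.com for the dataset.",
    "The weather was mild on Tuesday.",
    "Call 13800138000 after 6pm.",
]

# Map record index -> detected categories, skipping clean records.
flagged = {}
for i, record in enumerate(records):
    hits = scan_record(record)
    if hits:
        flagged[i] = hits

cleaned = [redact(r) for r in records]
```

Flagged records could then be dropped, redacted as in `cleaned`, or routed for manual review before they reach model training.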
Zscaler(ZS) - 2025 Q3 - Earnings Call Transcript
2025-05-29 21:32
Financial Data and Key Metrics Changes - Revenue for Q3 was $678 million, representing a 23% year-over-year increase and a 5% sequential increase [30] - Annual recurring revenue (ARR) was approximately $2.9 billion, with a year-over-year growth of 23% [30] - Remaining performance obligations (RPO) grew 30% year-over-year to $4.978 billion [31] - Total calculated billings increased 25% year-over-year to $785 million [31] - Gross margin was 80.3%, down from 81.4% in the previous year [32] - Free cash flow margin was 18%, including data center CapEx at 11% of revenue [33] Business Line Data and Key Metrics Changes - New logo annual contract value (ACV) grew over 40% year-over-year [9] - Significant growth in three categories: Zero Trust Everywhere, Data Security Everywhere, and Agentic Operations, with combined ARR approaching $1 billion [19] - Zero Trust Everywhere customers increased from over 130 to over 210, reflecting over 60% quarter-over-quarter growth [22] Market Data and Key Metrics Changes - Americas represented 54% of revenue, EMEA 30%, and APJ 16% [30] - The macro environment remains cautious, with customers prioritizing cyber and data protection despite ongoing economic uncertainty [16] Company Strategy and Development Direction - The company aims to reach $5 billion or more in ARR, driven by the combination of Zero Trust and AI security [29] - The introduction of the Z Flex program allows customers to flexibly scale their adoption of the platform, contributing over $65 million in TCV bookings [17] - The acquisition of Red Canary is expected to enhance the company's capabilities in Managed Detection and Response (MDR) and Threat Intelligence [28] Management's Comments on Operating Environment and Future Outlook - Management noted that while the spending environment remains challenging, cybersecurity remains a priority for customers [51] - The company is focused on building strategic partnerships with customers to reduce costs and enhance ARR over time 
[52] - Management expressed confidence in achieving strong growth despite economic uncertainties, emphasizing the importance of Zero Trust architecture and AI security [52] Other Important Information - The company appointed Kevin Rubin as the new Chief Financial Officer, expected to contribute significantly to the next phase of growth [37][38] - The company plans to optimize new products for margins over time as they scale [34] Q&A Session Summary Question: How does the company ensure customer focus amidst rapid product expansion? - The company has implemented a two-tier sales model with specialized teams for new product areas, ensuring effective customer engagement and focus [44][45] Question: What is the outlook on macro trends affecting the business? - The company did not experience a softer April and continues to see strong demand for cybersecurity solutions, particularly in Zero Trust and AI security [51][52] Question: Can you elaborate on the Z Flex program and its impact? - The Z Flex program provides customers with flexibility in product adoption and has already generated significant TCV bookings [47][56] Question: What are the expectations for the federal business? - The federal business is performing in line with expectations, with potential for growth as customers seek to reduce costs associated with legacy security products [89][91] Question: How does the company view the acquisition of Red Canary? - The acquisition is seen as a strategic move to accelerate the company's vision in the security market, leveraging Red Canary's technology and expertise [64][66]
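The growth figures quoted in the summary can be cross-checked with basic arithmetic. The sketch below uses only numbers stated above; the helper names are hypothetical, and because the customer counts are given as "over 130" and "over 210", the resulting growth rate is approximate.

```python
def implied_base(current, rate_pct):
    """Prior-period value implied by a current value and its growth rate in percent."""
    return current / (1 + rate_pct / 100)


def growth_pct(old, new):
    """Percentage growth from old to new."""
    return (new - old) / old * 100


revenue_q3 = 678.0  # USD millions, as reported in the summary

# Revenue a year earlier implied by 23% year-over-year growth.
implied_prior_year_revenue = implied_base(revenue_q3, 23)

# Zero Trust Everywhere customers: "over 130" to "over 210".
customer_growth = growth_pct(130, 210)

# Free cash flow implied by an 18% free cash flow margin.
implied_free_cash_flow = revenue_q3 * 0.18

# Gross margin change in percentage points (81.4% a year earlier, 80.3% now).
gross_margin_change = 80.3 - 81.4
```

Under these inputs the implied year-ago revenue is roughly $551 million, customer growth is about 61.5% (consistent with the stated "over 60%"), implied free cash flow is about $122 million, and gross margin fell by about 1.1 percentage points.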