AI Safety
We Had GPT Play Werewolf and It Especially Liked to Kill Players No. 0 and No. 1. Why?
Hu Xiu· 2025-05-23 05:32
Core Viewpoint
- The discussion highlights the potential dangers and challenges posed by AI, emphasizing the need for awareness and proactive measures in addressing AI safety issues.

Group 1: AI Safety Concerns
- AI has inherent issues such as hallucinations and biases, which require serious consideration despite the perception that the risks are distant [10][11].
- Adversarial examples pose significant risks: slight alterations to inputs can lead AI to make dangerous decisions, such as misinterpreting traffic signs (see the sketch after this summary) [17][37].
- The existence of adversarial examples is acknowledged, and while they are a concern, many AI applications implement robust detection mechanisms to mitigate the risks [38].

Group 2: AI Bias
- AI bias is a prevalent issue, illustrated by incidents where AI mislabels individuals based on race or gender, leading to significant social implications [40][45].
- The root causes of AI bias include overconfidence in model predictions and the influence of training data, which often reflects societal biases [64][72].
- Efforts to mitigate bias through data manipulation have limited effectiveness, as inherent societal structures and language usage continue to influence AI outcomes [90][91].

Group 3: Algorithmic Limitations
- AI algorithms primarily learn correlations rather than causal relationships, which can lead to flawed decision-making [93][94].
- Reliance on training data that lacks comprehensive representation can exacerbate biases and inaccuracies in AI outputs [132].

Group 4: Future Directions
- The concept of value alignment is crucial as AI systems become more advanced, necessitating a deeper understanding of human values to ensure AI actions align with societal norms [128][129].
- Research into scalable oversight and superalignment is ongoing, aiming to develop frameworks that enhance AI's compatibility with human values [130][134].
- The importance of AI safety is increasingly recognized, with initiatives being established to integrate AI safety into public policy discussions [137][139].
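For readers unfamiliar with adversarial examples, the following is a minimal sketch (in PyTorch) of the classic fast gradient sign method, which perturbs an input just enough to change a classifier's decision. The `TrafficSignNet` classifier and the epsilon value are hypothetical placeholders for illustration, not details taken from the article.

```python
# Minimal FGSM (fast gradient sign method) sketch: a small, human-invisible
# perturbation nudges every pixel in the direction that increases the loss,
# which can flip the predicted class.
import torch
import torch.nn as nn

def fgsm_perturb(model: nn.Module, image: torch.Tensor, label: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarially perturbed copy of `image` (pixel values in [0, 1])."""
    image = image.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel by +/- epsilon in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage sketch (hypothetical model and data):
# model = TrafficSignNet()                      # pretrained sign classifier
# adv = fgsm_perturb(model, stop_sign_batch, stop_sign_labels)
# print(model(adv).argmax(dim=1))               # may no longer predict "stop sign"
```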
Is AI Starting to Go Out of Control? 100 Scientists Jointly Release the World's First AI Safety Consensus
36 Ke· 2025-05-13 09:55
Core Viewpoint
- The discussion around the risks and dangers of artificial intelligence (AI) emphasizes the importance of actions taken by AI researchers themselves, alongside government interventions [1]

Group 1: Guidelines and Consensus
- Over 100 scientists gathered in Singapore to propose guidelines for making AI more trustworthy, reliable, and safe [1]
- The guidelines were released in a document titled "Singapore Consensus on Global AI Safety Research Priorities" during a major AI conference, marking the first large-scale AI event in Asia [1]
- Notable contributors to the consensus include prominent figures from institutions like MILA, UC Berkeley, and MIT, highlighting a collaborative effort in AI safety [1]

Group 2: Importance of Guidelines
- Josephine Teo, Singapore's Minister for Digital Development and Information, emphasized that citizens cannot vote on the type of AI they want, indicating a lack of public agency in shaping AI development [2]
- The need for guidelines is underscored by the fact that citizens will face the opportunities and challenges posed by AI without having a say in its trajectory [2]

Group 3: Risk Assessment
- The consensus outlines three categories for researchers: identifying risks, constructing AI systems to avoid risks, and maintaining control over AI systems [4]
- The authors advocate for developing "metrics" to quantify potential harms and conducting quantitative risk assessments to reduce uncertainty (a toy illustration follows this summary) [4]
- There is a call for external parties to monitor AI development while balancing the protection of intellectual property [4]

Group 4: Design and Control
- The design aspect focuses on creating trustworthy AI through technical methods that specify AI program intentions and outline undesirable outcomes [5]
- Researchers are encouraged to enhance training methods to ensure AI programs meet specifications, particularly in reducing hallucinations and improving robustness against malicious prompts [5]
- The control section discusses expanding current computer security measures and developing new technologies to prevent AI from going out of control [7]
- The urgency for increased investment in safety research is highlighted, as current scientific understanding does not fully address all risks associated with AI [7]
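The consensus document does not prescribe a specific formula. As a rough illustration of what a quantitative harm metric could look like, the sketch below scores risk as probability times severity summed over failure modes; the failure modes and numbers are invented for illustration only.

```python
# Toy quantitative risk assessment: expected harm as probability x severity,
# aggregated over hypothetical failure modes (values are illustrative, not
# taken from the Singapore Consensus).
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    probability: float  # estimated chance of occurrence over the assessment window
    severity: float     # estimated harm on an arbitrary 0-10 scale

def expected_harm(modes: list[FailureMode]) -> float:
    """Aggregate risk score: sum of probability-weighted severities."""
    return sum(m.probability * m.severity for m in modes)

modes = [
    FailureMode("harmful instructions bypass safety filter", 0.05, 7.0),
    FailureMode("large-scale misinformation generation", 0.10, 6.0),
    FailureMode("sensitive training data leakage", 0.02, 9.0),
]
print(f"expected harm score: {expected_harm(modes):.2f}")
```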
Liu Ning Meets Qi Anxin Group Chairman Qi Xiangdong
He Nan Ri Bao· 2025-05-09 10:39
Group 1
- The meeting between Provincial Party Committee Secretary Liu Ning and Qi Anxin Technology Group Chairman Qi Xiangdong highlighted the importance of network security and support for the development of private enterprises in Henan [1][2]
- Henan is focusing on developing the new generation of information technology industry, integrating the digital economy with the real economy, and enhancing network security to support high-quality economic development [1]
- Qi Anxin Group aims to strengthen its presence in Henan by leveraging its technological, service, and talent advantages to contribute to building a digitally strong province and enhancing network security [2]

Group 2
- The provincial leadership expressed its commitment to providing a favorable environment for enterprises to operate and innovate in Henan [1]
- Qi Anxin Group plans to deepen cooperation in areas such as artificial intelligence security, data resource integration, and talent cultivation to bolster network security in the region [1][2]
RealAI CEO: The Key to Turning Large Models into Real Productivity Is Organizing Intelligent Agents, and Safety and Controllability Are the Core Prerequisite | China AIGC Industry Summit
量子位· 2025-05-06 09:08
Core Viewpoint
- The security and controllability of large models are becoming prerequisites for industrial implementation, especially in critical sectors like finance and healthcare, which demand higher standards for data privacy, model behavior, and ethical compliance [1][6].

Group 1: AI Security Issues
- Numerous security issues have emerged during the implementation of AI, necessitating urgent solutions. These include risks of model misuse and the need for robust AI detection systems as the realism of AIGC technology increases [6][8].
- Examples of security vulnerabilities include the "grandma loophole" in ChatGPT, where users manipulated the model into disclosing sensitive information, highlighting the risks of data leakage and misinformation [8][9].
- The potential for AI-generated content to be used for malicious purposes, such as creating fake videos to mislead the public or facilitate scams, poses significant challenges [9][10].

Group 2: Stages of AI Security Implementation
- The implementation of AI security can be divided into three stages: enhancing the reliability and safety of AI itself, preventing misuse of AI capabilities, and ensuring the safe development of AGI [11][12].
- The first stage focuses on fortifying AI against vulnerabilities like model jailbreaks and value misalignment, while the second stage addresses the risks of AI being weaponized for fraud and misinformation [12][13].

Group 3: Practical Solutions and Products
- The company has developed various platforms and products aimed at enhancing AI security, including AI safety and application platforms, AIGC detection platforms, and a superalignment platform for AGI safety [13][14].
- A notable product is the RealGuard facial recognition firewall, which acts as a preemptive measure to identify and reject potential attack samples before they reach the recognition stage, ensuring greater security for financial applications (a pipeline sketch follows this summary) [16][17].
- The company has also introduced a generative AI content monitoring platform, DeepReal, which uses AI to detect and differentiate between real and fake content across various media formats [19][20].

Group 4: Safe Implementation of Vertical Large Models
- The successful deployment of vertical large models requires prioritizing safety, with a structured approach to implementation that includes initial Q&A workflows, work assistance flows, and deep task reconstruction for human-AI collaboration [21][22].
- Key considerations for enhancing the safety of large models include improving model security capabilities, providing risk alerts for harmful outputs, and reinforcing the training and inference layers [22][23].

Group 5: Future Perspectives on AI Development
- The evolution of AI capabilities does not inherently lead to increased safety; proactive research and strategic planning for security are essential as AI models become more advanced [24][25].
- Organizing intelligent agents and integrating them into workflows is crucial for maximizing AI productivity, and safety remains a fundamental prerequisite for the deployment of AI technologies [25][26].
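The article does not describe RealGuard's internals. The sketch below only illustrates the general "screen before recognize" pattern mentioned above, where suspicious inputs are rejected before the recognition model ever runs; `AttackDetector`, `FaceRecognizer`, and the threshold are hypothetical interfaces, not RealAI's actual implementation.

```python
# Illustrative "firewall before recognition" pipeline: inputs are screened by an
# attack-sample detector and rejected before they reach the face recognizer.
from typing import Optional, Protocol

class AttackDetector(Protocol):
    def attack_score(self, image_bytes: bytes) -> float: ...

class FaceRecognizer(Protocol):
    def identify(self, image_bytes: bytes) -> str: ...

def guarded_identify(detector: AttackDetector,
                     recognizer: FaceRecognizer,
                     image_bytes: bytes,
                     threshold: float = 0.5) -> Optional[str]:
    """Reject likely adversarial or spoofed inputs before recognition runs."""
    if detector.attack_score(image_bytes) >= threshold:
        return None  # blocked at the firewall; recognition is never invoked
    return recognizer.identify(image_bytes)
```

Placing the detector in front of the recognizer keeps the recognition model unchanged while concentrating adversarial-input handling in a single, auditable checkpoint.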
Talking "Security" at Nishan: Experts Suggest Using Security-Focused Large Models to Address AI Hallucinations and Other Problems
Zhong Guo Xin Wen Wang· 2025-04-14 11:10
China News Service, Beijing, April 14 (Reporter Zhang Su). April 15 this year is the 10th National Security Education Day. Recently, "New Era, New Technology, New Security," the 10th National Security Education Day and technology security themed event, was held at the Nishan Lecture Hall.

On April 10, the "New Era, New Technology, New Security" event was held. Photo courtesy of the Shandong Provincial Department of State Security.

The event was hosted by the Shandong Provincial Department of State Security and aimed to deepen the public's understanding of national security through technology security education and to explore innovative practices in national security education for the new era. Experts and scholars from technology companies, universities, and research institutes attended, discussing the theme of "technology security" through keynote speeches and roundtable dialogues.

Zhou Hongyi, founder of 360 Group, delivered a keynote speech titled "Digital Security Cyber Warfare and the Security Problems Brought by AI." He argued that the faster digitalization advances, the greater the security challenges, and that cyberattacks increasingly show state-sponsored and professionally organized characteristics.

The event also featured an entrepreneurs' roundtable. Participating entrepreneurs said that technology security is the "Great Wall" of the new era and that entrepreneurs are the "craftsmen" who ram the earth and build its walls. Security lies not only in technical control but also in uniting people; the spirit of the "greater self" in human nature must be inspired to achieve a true leap in technological breakthroughs and industrial innovation.

Other attending experts held that fine traditional Chinese culture contains rich wisdom and values that can help cultivate strategic scientists and nourish the hearts of science and technology workers, letting ...
Rime Venture Capital Daily: Shanghai's 100-Billion-Yuan Fund Officially Launched - 2025-03-26
Lai Mi Yan Jiu Yuan· 2025-03-26 07:07
Report Summary

1. Investment Events
- On March 25, 21 investment and financing events were disclosed in the domestic and foreign venture capital markets, including 13 domestic and 8 foreign enterprises, with total financing of about 8.088 billion yuan [1]
- On March 24, the first phase of the "Hangtou Zhengling Shuangdongjian Low-Altitude Economy Industry Investment Fund," jointly initiated by Chengdu Eastern New Area, Shuangliu District, and Jianyang City, was in place, targeting the low-altitude economy and focusing on complete eVTOL aircraft manufacturers [1]
- On March 25, Qi'an Investment completed a new fund with a first-closing scale of 300 million yuan, investing in network, data, AI, national defense, and quantum security technologies [2][3]
- On March 25, Shanghai launched the second-phase industrial transformation and upgrading fund (total scale of 5 billion yuan) and the state-owned asset M&A fund matrix (total scale over 5 billion yuan) [4]

2. Large-Scale Financing
- On March 25, Xinghang Internet completed a several-hundred-million-yuan Series A financing for R&D and global network expansion in aviation satellite communication [5]
- On March 25, ELU.AI completed a several-hundred-million-yuan Pre-A financing to strengthen its AI decision-making systems and internationalization [6]
- On March 25, Changjin Photonics completed a strategic financing of over 100 million yuan for technology optimization and production capacity expansion [7]

3. Global IPOs
- On March 25, Shengke Nano listed on the Shanghai Stock Exchange Science and Technology Innovation Board, providing semiconductor testing services [8]
- On March 25, Nanshan Aluminum International listed on the Main Board of the Hong Kong Stock Exchange, planning to expand alumina production capacity from 2 million tons to 4 million tons [9]

4. Policy Focus
- On March 24, Guangdong issued a three-year transportation development plan, aiming to build a modern transportation network by 2027 and achieve specific travel-time and logistics circles [10][11]
Express | Fei-Fei Li's Team Releases a 41-Page AI Regulation Report, Saying Global AI Safety Regulation Should Anticipate Future Risks
Z Potentials· 2025-03-20 02:56
Core Viewpoint
- The report emphasizes the need for lawmakers to consider previously unobserved risks associated with artificial intelligence (AI) when developing regulatory policies, advocating for increased transparency from AI developers [1][2].

Group 1: Legislative Recommendations
- The report suggests that legislation should enhance transparency regarding the content developed by leading AI labs like OpenAI, requiring developers to disclose safety testing, data acquisition practices, and security measures [2].
- It advocates for improved standards for third-party evaluations of these metrics and protections for whistleblowers within AI companies [2][3].
- A dual approach is recommended to increase transparency in AI model development, promoting a "trust but verify" strategy [3].

Group 2: Risk Assessment
- The report highlights that while there is currently insufficient evidence regarding AI's potential to assist in cyberattacks or create biological weapons, policies should anticipate future risks that may arise without adequate protective measures [2].
- It draws a parallel to the predictability of nuclear weapons' destructive potential, suggesting that the costs of inaction in the AI sector could be extremely high if extreme risks materialize [3].

Group 3: Reception and Context
- The report has received broad praise from experts on both sides of the AI policy debate, indicating a hopeful advance for AI safety regulation in California [4].
- It aligns with key points from previous legislative efforts, such as the SB 1047 bill, which aimed to require AI developers to report safety testing results [4].