Group 1
- The core issue is the contamination of AI training data with false information, fabricated content, and biased viewpoints, which poses new challenges to AI safety [1][2]
- High-quality data is essential to the accuracy and reliability of AI models, while contaminated data can distort a model's understanding, leading to erroneous decisions and potentially harmful outputs [1]
- Research indicates that as little as 0.01% of false text in training data can increase harmful content output by 11.2% [1]

Group 2
- The integration of AI into sectors such as food recommendations, autonomous driving, financial decision-making, and medical diagnosis underscores the importance of data integrity [1]
- Misjudgments caused by data pollution can trigger chain reactions that result in significant losses, for example traffic accidents in autonomous driving and abnormal stock price fluctuations in finance [1]
- The current regulatory framework, including the "Interim Measures for the Management of Generative AI Services," aims to bring AI training data under oversight, with ongoing efforts to identify malicious data and mitigate its impact [2]
Don't Let Data Pollution Undermine AI Safety
Jing Ji Ri Bao·2025-08-16 00:57