Supreme People's Court Judge: Build a Fair Use System at the Input End of Large Model Training Data
Nan Fang Du Shi Bao·2025-07-01 09:23

Core Viewpoint - The article discusses the legal implications of using copyrighted works as training data for AI models. It advocates a "wide entry, strict exit" approach to balance AI development with copyright protection [1][2][3].

Group 1: Legal Framework for AI Training Data
- The author suggests establishing a fair use system for AI training data at the "input end" while applying stricter regulation at the "output end" to protect the interests of copyright holders [1][2].
- The risks associated with AI model applications are still unclear, and strict regulation at the input stage could hinder innovation by exposing AI developers to high authorization costs and legal risks [2][3].
- The author argues that traditional copyright licensing models may suppress innovation because of their high costs and complex negotiations, which could push AI companies into legal gray areas [2][3].

Group 2: Legislative Recommendations
- The author recommends legislation that classifies the use of works as AI training data as a specific case of fair use under copyright law, emphasizing its public interest and its value to the AI industry [3].
- The author compares an AI model's use of training data to "molecular gastronomy": the data is not merely copied but transformed in order to extract its underlying patterns [3].
- The proposal also gives copyright holders remedies relating to lawful data acquisition and infringement risks, so that fair use and copyright protection remain in dynamic balance [3].

Group 3: Judicial Precedents
- Recent U.S. court rulings on AI training data have significant implications for China. They highlight the need to examine carefully whether the use of copyrighted works harms the market value of those works [4].
- The rulings indicate that some uses may be deemed fair, but the legality of using copyrighted works to train AI models remains a complex issue that requires case-by-case analysis [4].