Copyright Protection
Beijing's Top Ten Copyright Events of 2024 Released
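Reposted from: Beijing Daily app
Ahead of the 25th World Intellectual Property Day, the Beijing Publishing and Copyright Association released the "Top Ten Copyright Events in Beijing in 2024" on April 14. The list is meant to showcase the capital's copyright work in 2024, raise copyright awareness across society, and promote the high-quality development of the copyright industry.
1. The first "Beijing-Tianjin-Hebei Copyright Coordinated Development Forum" was held successfully. In April 2024, the first "Beijing-Tianjin-Hebei Copyright Coordinated Development Forum" took place in Beijing. The cultural law enforcement departments and the industry associations of the three regions each signed cooperation agreements to deepen regional coordination in copyright enforcement, copyright publicity, social services, talent training, and university outreach, and to build a cooperation mechanism for administrative and judicial copyright protection.
6. Beijing handled the country's first off-site enforcement case in the copyright field. In December 2024, officers of the Beijing Municipal Cultural Market Comprehensive Law Enforcement Corps found during an off-site supervisory inspection that a company was suspected of infringing use of copyrighted images and of illegally publishing articles on behalf of others. After opening an investigation in accordance with the law, the officers urgently halted the violation. Relying on the Corps' "Smart Cultural Enforcement" platform, they combined off-site supervision methods such as "dynamic risk monitoring + blockchain evidence preservation", used blockchain technology to solve the problem of collecting and preserving copyright evidence, and used the "京通" (Jingtong) mini program to have enforcement documents signed and confirmed remotely, so that the entire enforcement process was contactless.
7. The Municipal Procuratorate led the relevant agencies in jointly issuing the "Working Guidelines for Joint Assessment of Key Criminal Leads in Intellectual Property" (《知识产权涉刑重点线索联合研判工作指引》). In 2024, in order to address problems such as the difficulty of verifying leads screened by big-data legal supervision models, ...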
News Flash | O'Reilly Accuses OpenAI of "Book Theft" to Train GPT-4o; the AI Data Black Box Is Caught in Another Copyright Storm
Z Potentials · 2025-04-02 03:17
Core Viewpoint
- OpenAI is accused of potentially using copyrighted content from O'Reilly Media's paywalled books to train its AI models without authorization, raising concerns about copyright infringement and data sourcing practices [1][2][5].

Group 1: Allegations and Findings
- A new paper from the AI Disclosure Project suggests that OpenAI likely used O'Reilly Media's paywalled books to train its GPT-4o model, with no licensing agreement in place between the two entities [2].
- The paper indicates that GPT-4o shows a significantly higher recognition ability for O'Reilly's paywalled content compared to the earlier GPT-3.5 Turbo model [2][3].
- Researchers analyzed 13,962 excerpts from 34 O'Reilly books and found that GPT-4o's recognition rate for paywalled content was notably higher than that of older OpenAI models [3].

Group 2: Methodology and Limitations
- The study employed a method called DE-COP, designed to detect copyrighted content in language model training data, which involves testing the model's ability to distinguish between human-written text and AI-generated rewrites [2][3].
- The authors acknowledge that their findings do not constitute definitive proof, as OpenAI could have obtained excerpts through user interactions with ChatGPT [4].
- The research did not evaluate OpenAI's latest models, such as GPT-4.5 and reasoning models, which may not have been trained on paywalled O'Reilly content [4].

Group 3: Industry Context and Practices
- OpenAI has been known to advocate for relaxed restrictions on using copyrighted data for model development and has sought higher quality training data [4].
- The company has entered into licensing agreements with various publishers and social networks for some of its training data, and it provides a mechanism for copyright holders to opt out of having their content used [4].
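
The DE-COP-style test described above (asking a model to tell a human-written excerpt apart from AI-generated rewrites) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `build_quiz` and `recognition_rate`, the `choose` callback standing in for a model query, and the use of three rewrites per excerpt are all assumptions made for illustration, and the study's actual scoring and statistics are not described in this summary and may differ.

```python
import random
from typing import Callable, List, Sequence, Tuple


def build_quiz(original: str, rewrites: Sequence[str],
               rng: random.Random) -> Tuple[List[str], int]:
    """Shuffle the verbatim excerpt in with its rewrites.

    Returns the shuffled options and the index of the verbatim excerpt.
    Assumes the rewrites all differ from the original text.
    """
    options = [original, *rewrites]
    rng.shuffle(options)
    return options, options.index(original)


def recognition_rate(items: Sequence[Tuple[str, Sequence[str]]],
                     choose: Callable[[List[str]], int],
                     seed: int = 0) -> float:
    """Fraction of quizzes in which `choose` picks the verbatim excerpt.

    `choose` is a hypothetical stand-in for querying a language model:
    it receives the shuffled options and returns the index it believes
    is the original, human-written text.
    """
    if not items:
        raise ValueError("items must not be empty")
    rng = random.Random(seed)  # fixed seed so option order is reproducible
    hits = 0
    for original, rewrites in items:
        options, answer = build_quiz(original, rewrites, rng)
        if choose(options) == answer:
            hits += 1
    return hits / len(items)


if __name__ == "__main__":
    # Toy data: each item is (verbatim excerpt, AI-generated rewrites).
    items = [
        ("The quick brown fox jumps.", ["A fast brown fox leaps.",
                                        "A speedy fox jumps.",
                                        "The brown fox hops quickly."]),
        ("Data is loaded lazily.", ["Loading of data is deferred.",
                                    "The data loads on demand.",
                                    "Data gets loaded later."]),
    ]
    # A chooser that always picks the first option, i.e. a model with
    # no memory of the text; over many items it should land near chance.
    print(recognition_rate(items, lambda options: 0))
```

With one original and three rewrites per quiz, a model that has never seen the text would be expected to score near the 25% chance level, so a rate well above that on paywalled excerpts is the kind of signal the summary refers to as a higher "recognition rate". As the authors note, such a signal is suggestive rather than definitive proof.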