The dictionary sues OpenAI
TechCrunch · 2026-03-16 17:38
Core Argument
- Encyclopedia Britannica and Merriam-Webster have filed a lawsuit against OpenAI for "massive copyright infringement" related to the unauthorized use of nearly 100,000 online articles to train its language models [1][2]

Group 1: Copyright Infringement Allegations
- Britannica claims OpenAI generates outputs that include "full or partial verbatim reproductions" of its content, violating copyright laws [2]
- The lawsuit also alleges that OpenAI's use of Britannica's articles in ChatGPT's retrieval augmented generation (RAG) workflow constitutes copyright infringement [2]
- Britannica accuses OpenAI of violating the Lanham Act by generating false information attributed to the publisher, which could mislead users [2]

Group 2: Impact on Publishers
- The lawsuit states that ChatGPT undermines web publishers like Britannica by providing responses that directly compete with their content, thereby depriving them of revenue [3]
- Britannica argues that the hallucinations produced by ChatGPT threaten public access to high-quality and trustworthy online information [3]
- Other publishers and writers, including The New York Times and various newspapers across the U.S. and Canada, have also initiated legal actions against OpenAI over similar copyright concerns [3]

Group 3: Legal Precedents
- There is currently no strong legal precedent regarding the use of copyrighted content for training language models, although a case involving Anthropic suggests that such use could be considered transformative [5]
- In the Anthropic case, a federal judge ruled that while the use of content as training data could be legal, the company violated the law by illegally downloading millions of books, resulting in a $1.5 billion class action settlement [5]