Likely setting a precedent for others suing AI companies over the training of large language models (LLMs), Anthropic has agreed to a $1.5 billion settlement with book authors.
Understanding Generative AI and Training Data
Generative AI systems are trained on massive datasets containing billions of words from books, articles, websites, Wikipedia entries, and other text sources. The training process is fundamentally statistical: the AI learns patterns in language by repeatedly trying to predict the next word in a sequence, gradually adjusting its internal parameters based on prediction errors. The LLM improves as these prediction errors are minimized.
The generative AI system develops a complex mathematical representation of language patterns that allows it to generate new, coherent text. However, the distinction between learning patterns and reproducing content has proven legally significant: AI systems can sometimes generate text that closely mirrors their training materials and may even reproduce portions of those materials verbatim when specifically prompted.
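The next-word-prediction idea described above can be illustrated with a deliberately tiny sketch. This is not how production LLMs work (they use neural networks with billions of parameters, not word counts); it is only a toy bigram model showing how statistical patterns in a training corpus drive next-word prediction:

```python
from collections import Counter, defaultdict

# Toy illustration only: "train" by counting which word follows which
# in a tiny corpus, then predict the statistically most likely next word.
corpus = "the cat sat on the mat and the cat slept".split()

follower_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follower_counts[current][nxt] += 1  # adjust the model's "parameters" (counts)

def predict_next(word):
    """Return the most frequent next word observed during training."""
    counts = follower_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # prints "cat" -- it follows "the" most often
```

Note that even this toy model can echo its training text when asked to predict word after word, which is a miniature version of the reproduction concern courts have flagged.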
The training data for these systems almost always includes copyrighted works — books, articles, and other protected content — frequently obtained without permission from rights holders. This practice has become the foundation of numerous legal challenges across creative industries.
The Anthropic Decision: Fair Use vs. Pirated Content
On June 23, 2025, Judge William Alsup delivered a nuanced ruling that distinguished between different types of AI training practices.[1] The court held that using legally purchased copyrighted books to train AI models constitutes transformative fair use. The judge emphasized that AI systems learn from works to generate new output rather than to copy or directly compete with the original materials. Because the use was transformative and fair, no further payment to the authors was required for lawfully acquired books. The court declined, however, to extend fair use protection to Anthropic's downloading and retention of pirated copies, leaving those claims for trial.
The Settlement: A New Precedent
The settlement, which provides approximately $3,000 per covered work (implying roughly 500,000 works at the $1.5 billion total), represents the largest copyright recovery in AI litigation to date. Additionally, while maintaining its fair use defense for lawfully acquired materials, Anthropic has agreed to delete all pirated works from its training systems.
This outcome sends a clear message to the AI industry: how companies acquire their training data is as legally significant as how they use it. The settlement suggests that while fair use may protect the transformative use of legally obtained copyrighted works, obtaining materials through unauthorized channels is not protected.
Broader Implications for the Industry
The Anthropic settlement is part of a broader wave of copyright litigation facing AI companies. As a district court decision at the summary judgment stage, however, it may have limited precedential impact. Even so, the settlement and the flood of pending cases likely mean that the era of AI companies freely using copyrighted materials without regard to licensing or permission is ending.
What This Means for Content Creators
For authors, journalists, and other content creators, the Anthropic decision re-establishes several important principles:
- Source Matters: Courts will distinguish between AI training on legally obtained versus pirated materials.
- Fair Use Protection: Transformative use of legally purchased works may receive fair use protection.
- Recovery Potential: Significant monetary recoveries are possible when companies use unauthorized sources.
Larger impacts on the overall AI industry and the AI copyright landscape are harder to predict, especially because the proposed settlement was criticized by the presiding judge.[2] However, the following seems clear:
- Model Decision: Other courts may have different opinions on fair use, but this decision will influence their thinking. Agree or disagree with it, they won’t be able to ignore it.
- Birth of Book Sale Terms: To prevent the use of legally sold books in AI training, books may be sold with sale/license terms reminiscent of those included with software. Because this would be a major change in the way books are sold, such terms would likely be litigated; software sale/license terms, however, have generally been upheld.
- Birth of an AI Training Market: The price point set in the settlement is based on statutory rather than economic considerations, but it is still likely to serve as a starting point for negotiations in other cases and in licensing discussions between publishers and AI companies.
- Book Buying Spree: Anticipating the above, AI companies are likely to go on a buying spree in an attempt to legally acquire books before new license terms can be applied.
Are you an author whose work may have been used without permission to train AI systems?
If you believe your copyrighted works have been incorporated into AI training datasets without authorization, you have legal recourse! Significant recoveries, as in Anthropic, are possible.
Contact Chambliss Today to discuss your potential claims and explore your options for protecting your intellectual property rights in the age of artificial intelligence. Our experienced attorneys can help you understand your rights and determine the best path forward.
Stay Tuned for Updates! As mentioned above, the presiding judge was critical of the settlement offer. Chambliss will continue to publish updates in this fast-moving area.
[1] See Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal. filed Aug. 19, 2024).
[2] See Judge Skewers $1.5B Anthropic Settlement with Authors in Pirated Books Case over AI Training, AP News.