Authors Guild v. OpenAI: novelists sue over training data

On September 19, 2023, the Authors Guild and seventeen prominent novelists filed a class-action copyright suit against OpenAI in the US District Court for the Southern District of New York. The named plaintiffs read like a bestseller list: John Grisham, George R.R. Martin, Jodi Picoult, David Baldacci, Jonathan Franzen, Michael Connelly, Scott Turow, George Saunders, and others. The Authors Guild, the oldest and largest professional organization for writers in the United States, brought the case on behalf of a proposed class of fiction authors whose books had allegedly been used to train OpenAI’s GPT models.

The complaint alleged that OpenAI copied the authors’ copyrighted works wholesale to build the training corpus behind GPT-3.5 and GPT-4, the models powering ChatGPT. The Guild argued that these books were ingested without license or payment, and that the resulting models could generate new text imitating the authors’ styles, potentially competing with the very writers whose work made the system possible. The Guild’s framing was blunt: without the authors’ copyrighted works, OpenAI “would have a vastly different commercial product.” On December 4, 2023, the plaintiffs filed an amended complaint adding Microsoft, OpenAI’s largest backer, as a co-defendant.

The case was later consolidated with related author suits and the New York Times’ separate action against OpenAI and Microsoft, becoming part of a sprawling multidistrict litigation over generative AI and copyright. The core legal question is whether training a commercial language model on copyrighted books qualifies as fair use, and the answer sits at the center of the entire industry’s legal exposure.

Why business readers should care: this suit, alongside the Times case, turned the abstract debate over AI training data into concrete litigation risk for any company that builds or licenses large language models. The outcome will help determine whether AI developers must pay to license training content or can rely on a fair-use defense.