Bartz v. Anthropic PBC is a copyright case in the United States District Court for the Northern District of California, docketed as No. 3:24-cv-05417 and filed on August 19, 2024, by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. On June 23, 2025, Judge William Alsup issued an “Order on Fair Use” (Document 231) on summary judgment. It was the first substantive merits ruling on whether training an AI model on copyrighted books is fair use, a question that the larger wave of AI-copyright suits, including The New York Times v. OpenAI (filed in 2023), had put before the courts but that no court had yet decided on the merits.
The order split the question. Alsup held that the copies “used to train specific LLMs” were a fair use, writing that “the technology at issue was among the most transformative many of us will see in our lifetimes,” and that converting lawfully purchased print books into a digital library was also fair use because, as the order put it, “the purchased print copy was destroyed and its digital replacement not redistributed.” But he held that “the downloaded pirated copies used to build a central library were not justified by a fair use,” finding that “every factor points against fair use” for those copies, which Anthropic employees said would be retained “forever” for “general purpose.” The order granted summary judgment that the training use and the print-to-digital conversion were fair use, denied it as to the pirated library copies, and set those copies for trial on liability and damages, “actual or statutory (including for willfulness).”
Rather than try the damages question, the parties reached a class settlement. According to the court record, Anthropic agreed to pay $1.5 billion into a settlement fund covering roughly 500,000 works from the pirated LibGen and PiLiMi datasets (about $3,000 per work) and to destroy the pirated libraries. The court granted preliminary approval on September 25, 2025, with a claims deadline of March 30, 2026. As of this writing (June 6, 2026), final approval had not yet been granted: at the May 14, 2026 fairness hearing the court took the matter under submission rather than ruling from the bench, so the settlement remained pending final court approval.
Why business readers should care: this is the first court ruling to draw a concrete line through the debate over training AI models on copyrighted books, and where it drew that line matters. On these facts, training on books was treated as transformative fair use, while how the training data was obtained, specifically the use and retention of pirated copies, became the basis for massive liability. For companies building on or training models, the case suggests that lawful acquisition of training data, not just the act of training, can be decisive, and the $1.5 billion figure sets a reference point for the cost of getting it wrong. The questions remain contested across other cases, and following the court record itself is the most reliable way to track them.