OpenAI, Microsoft Used Copyrighted Nonfiction Works to Train ChatGPT, Class Action Claims
Sancton v. OpenAI, Inc. et al.
Filed: November 21, 2023 | Case No. 1:23-cv-10211
A nonfiction writer claims in a class action that OpenAI used his copyrighted work to train its AI models without his permission and without providing him compensation.
Defendants: Microsoft Corporation, OpenAI Inc., OpenAI LLC, OpenAI GP LLC, OpenAI OpCo LLC, OpenAI Global LLC, OAI Corporation LLC, OpenAI Holdings LLC
New York
ChatGPT-maker OpenAI has been hit with a proposed class action filed on behalf of nonfiction authors who claim their copyrighted works were used to train the company’s artificial intelligence (AI) models without their permission or compensation.
The 28-page lawsuit explains that OpenAI’s large language models (LLMs)—including GPT-3, GPT-3.5, GPT-4 and GPT-4 Turbo—power ChatGPT and its suite of other AI products by generating text output that mimics human-like responses.
The case alleges that OpenAI, along with major investor and business partner Microsoft Corporation, “trained” this technology to imitate human writing by feeding it unlicensed copies of at least tens of thousands of nonfiction books they never paid for.
According to the suit, OpenAI and Microsoft have enjoyed huge profits from their “large-scale copyright infringement” at the expense of writers, who have been deprived of book sales and licensing revenues.
Per the complaint, OpenAI’s LLMs were calibrated using a training set consisting of 45 terabytes of data—equivalent to “several billion pages of single-spaced text”—scraped indiscriminately from the internet. OpenAI states that this data, which the suit alleges included a “massive quantity” of pirated and copyrighted nonfiction material, was then used to teach its AI models to “learn” “how words fit together grammatically,” “how words work together to form higher-level ideas” and “how sequences of words form structured thoughts.”
“The end result is a computer model that is not only built on the work of thousands of creators and authors, but also built to generate a wide range of expression—from shortform articles to book chapters—that mimics the syntax, style, and themes of the copyrighted works on which it was trained,” the filing says.
The filing contends that the defendants’ “rampant theft” and reproduction of class members’ intellectual property violates exclusive rights granted to the writers under the federal Copyright Act.
The lawsuit looks to represent anyone who owns copyrighted literary works that are registered with the United States Copyright Office; are works of nonfiction; are or have been assigned an International Standard Book Number (ISBN) or are published in an academic journal; and were or are used by OpenAI and Microsoft to train their generative AI models.