Meta Used Copyrighted Works to Train AI Software LLaMA Without Permission, Class Action Says
Last Updated on July 11, 2024
Chabon et al. v. Meta Platforms, Inc.
Filed: September 12, 2023 | Case No. 4:23-cv-04663
California
Five award-winning authors allege in a proposed class action that Meta Platforms used their copyrighted works to train its AI language models without authorization.
According to the 18-page lawsuit, Meta first released LLaMA (Large Language Model Meta AI) in February 2023 as a suite of AI software designed to respond to user prompts with “convincingly natural,” human-like text outputs.
The lawsuit alleges, however, that Meta, in a “clear infringement” of the plaintiffs’ intellectual property rights, “trained” LLaMA using “massive amounts of text” from the authors’ copyrighted books and screenplays, “without consent, without credit, and without compensation.”
“Though a large language model is a software program, it is not created the way most software programs are—that is, by human software engineers writing code,” the filing explains. “Rather, a large language model is ‘trained’ by copying massive amounts of text from various sources and feeding these copies into the model.”
Although Meta claims that LLaMA’s training dataset consists of “publicly available” information that is “compatible with open sourcing,” the complaint contends that the company pulled data from Bibliotik, an illegal “shadow library” website that hosts a large quantity of copyrighted material.
Per the suit, Meta admits that it sourced a portion of the LLaMA training materials from Books3, a major section of a dataset known as The Pile. Public statements from the creator of Books3 reveal that the dataset represents “all of Bibliotik” and contains 196,640 books, the complaint says.
The plaintiffs claim that many of their written works appear in the Books3 dataset Meta used to train LLaMA. As a result, the AI product now relies on their copyrighted works to fuel its responses, the case shares.
“Plaintiffs never authorized Meta to make copies of their Infringed Works, make derivative works, publicly display copies (or derivative works), or distribute copies (or derivative works),” the case states. “All those rights belong exclusively to Plaintiffs under copyright law.”
The filing notes that Meta originally made the LLaMA language models selectively available to organizations that requested access, though the company reportedly plans to make the next version of the product commercially available. Despite that limited rollout, the LLaMA language models were leaked online in March 2023 and continue to circulate, the case says.
The lawsuit looks to represent all people or entities nationwide that own a United States copyright in any work that was used as training data for the LLaMA language models during the applicable statute of limitations period.