
Millions of books used for free to train AI? Meta wins AI copyright case as U.S. court finds fair use

A judge ruled that Meta's use of millions of books to train its artificial intelligence models constitutes "fair use," marking a significant victory for tech companies. However, the judge cautioned that the outcome reflected the weakness of the plaintiffs' arguments, and raised concerns that AI could lead to "market dilution," undermining the incentive for humans to create.
On Thursday, Meta won a preliminary victory in an AI copyright case.
U.S. District Judge Vince Chhabria in San Francisco ruled that Meta's use of millions of books to train its artificial intelligence models constitutes "fair use," marking a significant victory for tech companies that use copyrighted materials to develop AI.
The tech giant argued that the works were used to develop a transformative technology, which it said qualifies as fair use regardless of how the works were obtained.
The lawsuit was brought by a dozen writers, including Ta-Nehisi Coates and Richard Kadrey, who challenged the $1.4 trillion social media giant's use of a repository containing millions of online books, academic articles, and comics to train its Llama AI model.
However, Judge Chhabria also cautioned in his ruling that the decision reflects the plaintiffs' failure to properly present their arguments. "This ruling does not mean that Meta's use of copyrighted materials to train its language model is legal," he stated. "It merely indicates that these plaintiffs presented the wrong arguments and failed to provide a record supporting the correct arguments."
Judge's Definition of "Fair Use"
This is the second victory for tech companies in the field of AI development this week.
On Monday, a federal court ruled in favor of the San Francisco startup Anthropic in a similar case. Anthropic trained its Claude model using legally purchased physical books that were cut apart and manually scanned, and the court found that this constituted "fair use." However, the judge added that a separate trial is needed on allegations that the company pirated millions of digital books for training.
Meta's case involves LibGen, a so-called online "shadow library," most of whose content is hosted without the permission of copyright holders.
Chhabria's ruling provides an important legal precedent for tech companies using copyrighted materials in AI training. Meta argued that it used the books to develop a technology with "transformative" potential, namely the Llama AI model, which brings its conduct within the principles of "fair use." The court accepted this view, concluding that the plaintiffs had failed to show that Meta's conduct fell outside fair use.
Potential Concerns of "Market Dilution"
Despite Meta's victory, Judge Chhabria also raised a "potentially successful argument" regarding "market dilution." He noted that AI products have the ability to "flood the market in unlimited ways, including images, songs, articles, books, etc.," which could harm copyright holders.
Chhabria expressed concern that AI could "greatly diminish the motivation for humans to create in traditional ways." He warned that:
“People only need to spend a minimal amount of time and creativity to generate these outputs through generative AI models, which previously required a significant amount of effort.”
This case is one of dozens of legal disputes currently before the courts. As artificial intelligence technology develops rapidly, creators are seeking greater economic rights when their works are used to train AI models that could disrupt their livelihoods, while technology companies reap huge profits from the technology.