
By Matt O’Brien and Barbara Ortutay | Associated Press
SAN FRANCISCO — A federal judge sided with Facebook parent Meta Platforms in dismissing a copyright infringement lawsuit from a group of authors who accused the company of stealing their works to train its artificial intelligence technology.
The Wednesday ruling from U.S. District Judge Vince Chhabria was the second in a week from San Francisco’s federal court to dismiss major copyright claims from book authors against the rapidly developing AI industry.
Chhabria found that the 13 authors who sued Meta “made the wrong arguments” and tossed the case. But the judge also said that the ruling is limited to the authors in the case and does not mean that Meta’s use of copyrighted materials is lawful.
Lawyers for the plaintiffs — a group of well-known writers that includes comedian Sarah Silverman and authors Jacqueline Woodson and Ta-Nehisi Coates — didn’t immediately respond to a request for comment Wednesday. Meta also didn’t immediately respond to a request for comment.
“This ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful,” Chhabria wrote. “It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.”
Although Meta prevailed in its request to dismiss the case, it could turn out to be a Pyrrhic victory. In his 40-page ruling, Chhabria repeatedly indicated reasons to believe that Meta and other AI companies have turned into serial copyright infringers as they train their technology on books and other works created by humans, and he seemed to be inviting other authors to bring cases to his court presented in a manner that would allow them to proceed to trial.
While posing the question of whether companies have been engaging in illegal conduct by feeding copyright-protected material into AI training models without permission, the judge wrote: “Although the devil is in the details, in most cases the answer will likely be yes.”
Chhabria reiterated “in many circumstances it will be illegal to copy copyright-protected works to train generative AI models without permission. Which means that the companies, to avoid liability for copyright infringement, will generally need to pay copyright holders for the right to use their materials.”
The judge also scoffed at arguments that requiring AI companies to adhere to decades-old copyright laws would slow down advances in a crucial technology at a pivotal time. “The technology is certainly groundbreaking,” Chhabria wrote. “But the suggestion that adverse copyright rulings would stop this technology in its tracks is ridiculous. These products are expected to generate billions, even trillions of dollars for the companies that are developing them. If using copyrighted works to train the models is as necessary as the companies say, they will figure out a way to compensate copyright holders for it.”
On Monday, from the same courthouse, U.S. District Judge William Alsup ruled that AI company Anthropic didn’t break the law by training its chatbot Claude on millions of copyrighted books, but the company must still go to trial for illicitly acquiring those books from pirate websites instead of buying them.
But the actual process of an AI system distilling from thousands of written works to be able to produce its own passages of text qualified as “fair use” under U.S. copyright law because it was “quintessentially transformative,” Alsup wrote.
Chhabria, in his Meta ruling, criticized Alsup’s reasoning in the Anthropic case, arguing that “Alsup focused heavily on the transformative nature of generative AI while brushing aside concerns about the harm it can inflict on the market for the works it gets trained on.”
Chhabria suggested that a case for such harm can be made.
In the Meta case, the authors had argued in court filings that Meta is “liable for massive copyright infringement” by taking their books from online repositories of pirated works and feeding them into Meta’s flagship generative AI system Llama.
Lengthy and distinctively written passages of text — such as those found in books — are highly useful for teaching generative AI chatbots the patterns of human language. “Meta could and should have paid” to buy and license those literary works, the authors’ attorneys argued.
Meta countered in court filings that U.S. copyright law “allows the unauthorized copying of a work to transform it into something new” and that the new, AI-generated expression that comes out of its chatbots is fundamentally different from the books it was trained on.
“After nearly two years of litigation, there still is no evidence that anyone has ever used Llama as a substitute for reading Plaintiffs’ books, or that they even could,” Meta’s attorneys argued.
Meta says Llama won’t output the actual works it has copied, even when asked to do so.
“No one can use Llama to read Sarah Silverman’s description of her childhood, or Junot Diaz’s story of a Dominican boy growing up in New Jersey,” its attorneys wrote.
Accused of pulling those books from online “shadow libraries,” Meta has also argued that the methods it used have “no bearing on the nature and purpose of its use” and that the result would have been the same had the company instead struck a deal with real libraries.
Such deals are how Google built its online Google Books repository of more than 20 million books, though it also fought a decade of legal challenges before the U.S. Supreme Court in 2016 let stand lower court rulings that rejected copyright infringement claims.
The authors’ case against Meta forced CEO Mark Zuckerberg to be deposed, and it has brought to light internal conversations at the company over the ethics of tapping into pirated databases that have long attracted scrutiny.
“Authorities regularly shut down their domains and even prosecute the perpetrators,” the authors’ attorneys argued in a court filing. “That Meta knew taking copyrighted works from pirated databases could expose the company to enormous risk is beyond dispute: it triggered an escalation to Mark Zuckerberg and other Meta executives for approval. Their gamble should not pay off.”
“Whatever the merits of generative artificial intelligence, or GenAI, stealing copyrighted works off the Internet for one’s own benefit has always been unlawful,” they argued.
The named plaintiffs are Jacqueline Woodson, Richard Kadrey, Andrew Sean Greer, Rachel Louise Snyder, David Henry Hwang, Ta-Nehisi Coates, Laura Lippman, Matthew Klam, Junot Diaz, Sarah Silverman, Lysa TerKeurst, Christopher Golden and Christopher Farnsworth.
Most of the plaintiffs had asked Chhabria to rule now, rather than wait for a jury trial, on the basic claim of whether Meta infringed on their copyrights. Two of the plaintiffs, Coates and Golden, did not seek such summary judgment.
Chhabria said in the ruling that while he had “no choice” but to grant Meta’s summary judgment tossing the case, “in the grand scheme of things, the consequences of this ruling are limited. This is not a class action, so the ruling only affects the rights of these 13 authors — not the countless others whose works Meta used to train its models.”
AP Technology Writer Michael Liedtke contributed to this story.