
A federal judge in San Francisco, Vince Chhabria, recently challenged Meta Platforms’ claim that it can use copyrighted books without permission to train its artificial intelligence models. The case matters because it could set a precedent for how AI companies use copyrighted materials in the future.
What Happened in Court?
Authors including Junot Díaz and comedian Sarah Silverman are suing Meta, arguing that the company used unauthorized copies of their books to train its Llama AI model without payment or consent.
Meta admits to using these works but argues it’s legal under the “fair use” doctrine, which sometimes allows copyrighted material to be used without permission for things like education, commentary, or creating something new and transformative.
Judge Chhabria expressed skepticism about Meta’s defense, pointing out that AI models like Llama can generate endless new content, potentially replacing the original works in the market. He questioned how this could be considered fair use if it harms the authors’ ability to sell their books.
The judge also pressed the authors’ lawyer to show clear evidence that Meta’s AI actually hurts the market for their specific books, not just in theory but in practice.
Key Arguments
Meta’s Side: Meta says using books to train AI is transformative: it helps the AI learn language and generate new content rather than simply copying the books. The company argues that forcing AI developers to pay for every piece of content used would slow innovation and make AI development far more expensive.
Authors’ Side: The authors argue that Meta is copying their work on a massive scale to create a product that can directly compete with them, threatening their income and rights as creators. They also allege Meta used “shadow libraries” (websites that host pirated books) to obtain the training data.
Why Does This Matter?
The outcome could affect not just Meta, but the entire AI industry, including companies like OpenAI and Anthropic, which also use large datasets to train their models.
If the court rules against Meta, AI companies may have to change how they gather training data, possibly paying for licenses or limiting what they use. If Meta wins, it could make it easier for AI developers to use copyrighted material without permission, as long as the use is considered “transformative.”
Conclusion
The case is part of a broader wave of lawsuits in the US and internationally, as creators and publishers push back against tech companies that use their work to train AI without compensation.
Separately, Meta is also facing claims that it removed copyright management information from materials used to train its models, which could be a violation of the US Digital Millennium Copyright Act (DMCA).
Legal experts say this case could reshape copyright law for the AI era, balancing the rights of creators with the needs of technology companies to access large amounts of data for training.
In summary, the judge is questioning whether Meta’s use of copyrighted books for AI training is truly fair use, especially if it risks replacing the original works in the market. The decision could have major implications for both the tech industry and creative professionals.