Unsealed Court Documents Reveal Meta's Possible Use of Questionably Acquired Copyrighted Content for AI Model Training

Published: 21 Feb 2025
Meta is in the spotlight once again – this time for allegedly using copyrighted works to train its AI models.

Recently unsealed court documents have put Meta in hot water once more, hinting that the company may have illegally used copyrighted works to train its artificial intelligence models. The documents, unsealed in the Kadrey v. Meta case, shed light on the internal discussions behind Meta’s purported use of copyright-protected data for model training.

According to the documents, Meta employees, including Melanie Kambadur, a senior manager on Meta's Llama model research team, allegedly discussed training models on works that could pose legal difficulties. Xavier Martinet, a Meta research engineer, reportedly suggested building training datasets without negotiating licensing deals with individual book publishers.

However, not everyone at Meta seemed to agree with this approach. A Meta staffer pointed out the potential legal backlash from using copyrighted content without authorization. This did not dissuade Martinet, who countered that, at worst, they would find out it was legally acceptable. That sentiment now reads as ironic given the challenge Meta faces in court.

The allegations underline an uneasy balance between the push for rapid technological progress and respect for intellectual property rights. As AI continues to evolve, this lawsuit could set a precedent for how companies handle copyrighted content in AI training.