
Creators are standing up against AI giants to defend their rights


Recent developments underscore the ongoing tension between AI companies and content creators regarding the use of training data.

A few months ago, Google warned OpenAI not to use YouTube data for training its models. However, recent reporting from The New York Times suggests that OpenAI, Meta, and Google have all pushed past such limits in the race to gather training data.

AI companies often claim their models are trained on “publicly available data,” but the term remains ambiguous.

Mira Murati, OpenAI's former CTO, avoided giving specifics about the data used to train models like Sora, saying only that it was “publicly available or licensed,” a response many consider vague and troubling.

Ed Newton-Rex, who led Stability AI's audio team, resigned over the company's position that training on copyrighted material constitutes "fair use." He argues that creators are harmed when AI models trained on their work generate competing content.

Publishers such as The New York Times now prohibit AI companies from using their content, but enforcing such restrictions is challenging without clearer legal guidelines.

Here’s what you need to know:

  • “Publicly available” does not equate to consent. AI companies frequently use this term, but it doesn’t imply that creators have agreed to their content being used.

  • Meta, Google, and OpenAI have been accused of utilizing copyrighted material without proper licenses.

  • Creators are taking legal action against AI companies, but existing laws don’t fully cover these issues yet.

As online data becomes scarcer, AI companies are scrambling to secure enough content to train their models.

Some are taking risky shortcuts:

  • Meta's past actions: Court documents reveal that in 2016, Meta accessed data from platforms like Snapchat, YouTube, and Amazon, including sensitive information like usernames and passwords.

  • Cutting copyright corners: Reports suggest that Meta, OpenAI, and Google have used copyrighted material without proper clearance to save time and stay competitive.

Creators are increasingly turning to lawsuits to defend their work, but their success has been limited.

For example, a federal judge recently dismissed most of the copyright claims in a lawsuit filed by authors like Ta-Nehisi Coates and Sarah Silverman.

Until stronger regulations are in place, both creators and AI companies will continue to face legal and ethical challenges.

The next few years could bring significant rulings and legislation that will define how data is collected and how creators can protect their content.

It feels like the Wild West of data collection.

Thank you for reading.

Know someone who'd love this newsletter? Share the love.🌟
