Google is fighting a lawsuit that could reshape how tech giants use creator content for AI training. A group of independent musicians claims the company scraped songs they uploaded to YouTube to train its Lyria 3 music generation model - and Google's response suggests it believes its terms of service give it that right. The case, filed as Kogon v. Google LLC, lands as AI companies face mounting legal pressure over training data practices.
Google is walking a careful legal tightrope, refusing to confirm whether it actually used YouTube creator content to train Lyria 3 while simultaneously arguing it has every right to do so. The lawsuit, reported by Billboard, centers on claims from independent artists who say their original compositions uploaded to YouTube became fodder for Google's music generation AI without permission or compensation.
According to court documents, the plaintiffs allege Google "illegally used songs" to train Lyria 3, the company's latest music AI model capable of generating original compositions. But Google's motion to dismiss reveals the company's strategy: attack the premise without addressing the substance. "Their lawsuit is based on the unsupported hypothesis that Google trained on their specific works," the filing states, as reported by Music Business Worldwide.
The twist? Google immediately follows that challenge with a claim that even if the allegations were true, "the Complaint cannot stand" because creators granted YouTube and Google "a broad license to use the uploaded content." It's a legal maneuver that essentially says: We won't say whether we did it, but if we did, we were allowed to.
This case arrives as AI training data practices face unprecedented scrutiny. Major AI companies have faced similar lawsuits over text, image, and code training data, but music represents particularly sensitive territory. Unlike text scraped from the open web, YouTube content comes with an explicit creator relationship - artists upload work expecting it to be watched and monetized, not dissected by neural networks.
The timing coincides with Google's broader AI ambitions. Lyria, developed by Google's DeepMind division, represents the company's play in generative AI beyond text and images. The model can generate multi-minute songs with vocals, instrumentation, and production quality that rivals human creators. But that capability requires massive training datasets, and YouTube's catalog contains billions of hours of music content.
Here's what makes this lawsuit different from previous AI training cases: the terms of service argument. When creators upload to YouTube, they agree to language that grants the platform rights to "use, reproduce, distribute, and display" their content. But those terms were written years before AI training became standard practice. The legal question is whether those broad licenses implicitly include permission to train AI models that could potentially compete with the original creators.
The plaintiffs aren't household names - they're independent musicians, the kind of creators YouTube has long claimed to champion. That makes Google's defense more uncomfortable. If the company prevails by arguing its terms of service allow AI training, millions of creators could discover they've been inadvertently licensing their work for purposes they never imagined.
Google hasn't publicly disclosed what training data Lyria uses, which is standard practice for major AI companies. OpenAI, Meta, and others have faced similar criticism over the opacity of their training datasets. But YouTube's unique position as a creator platform - not just a content repository - adds complexity Google's competitors don't face.
The case could establish precedent for how platforms leverage user-generated content in the AI era. If courts rule that existing terms of service don't cover AI training, platforms would need explicit opt-in consent - fundamentally changing how AI companies source training data. Alternatively, a ruling in Google's favor could confirm that creators signing platform agreements in 2024 are implicitly agreeing to AI training uses not yet invented.
For now, Google is betting that ambiguity works in its favor. Don't confirm the training happened, but make clear you believe you had the right. It's a strategy that keeps options open while the legal landscape around AI training data remains unsettled.
This lawsuit represents more than a dispute between independent musicians and a tech giant - it's a test case for whether platform terms of service written in the pre-AI era can authorize uses creators never contemplated. As generative AI models grow more sophisticated, the hunger for training data intensifies, and user-generated content platforms like YouTube sit on treasure troves companies desperately want to tap. Whether Google confirms or denies using YouTube music for Lyria training might matter less than whether courts decide the company needed to ask permission in the first place. The outcome will shape how every platform balances creator relationships against AI development ambitions.