Most of them argue that training on copyrighted data is fair use.
AI companies have all kinds of arguments against paying for copyrighted content
The companies building generative AI tools like ChatGPT say updated copyright laws could interfere with their ability to train capable AI models. Here are comments from OpenAI, StabilityAI, Meta, Google, Microsoft and more.
Tough tits. Imagine all the books, movies and games that could have been made if copyright didn't exist. Nobody else gets to ignore the rules just because it's inconvenient.
And if it's OK, then what's the limit on what counts as an AI? Do you have to prove an AI made it? Or can you just write some repetitive work, say it was made by AI, and dodge copyright?
Why can’t they? Sure, if you ask it to make a pixel-perfect replica of the Mona Lisa, it will probably do it. But if you’re asking it to make something new, it will make something new.
Then you don't need to use copyrighted material as training data. If the AI can be creative and make new things, it doesn't need the copyrighted material. Just tell it to make new things and use those new things to make other new things.