A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.
OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling's Harry Potter series
A couple of things. That was wrong then, just as it is wrong today. Training data isn't file sharing. Too many of you are ushering in a new era of spying and erosion of the internet on behalf of corporations under the guise of "protecting artists", like they did in the Napster days.
Not at all. I simply recognize that the argument may have merit, as I said. I never said which side of the aisle I personally fall on. Also, they are a company, so theoretically the scrutiny of the methods they use to acquire data is deserved. Data has a price whether you think it should or shouldn't.
And my opinion is that if it has a price, don't give it away for free online where anyone or anything can ingest it. Should web crawlers be paying websites for indexing them?
I also believe in private property. If I buy a book I can do what I want with it. Like use it to train AI. It is my property.
Those are two contradictory philosophies. How can one supposedly not care if someone's private property is stolen, yet simultaneously believe in private property rights? The argument would indeed be whether they stole the book off the internet or bought a copy themselves.