AI Lie: Machines Don’t Learn Like Humans (And Don’t Have the Right To)

Avram Piltch is the editor in chief of Tom's Hardware, and he's written a thoroughly researched article breaking down the promises and failures of LLM AIs.
They have the right to ingest data, not because they're "just learning like a human would", but because I - a human - have a right to grab all data that's available on the public internet and process it however I want, including by training statistical models. The only thing I don't have a right to do is distribute it (or works that resemble it too closely).
If you actually show me people who are extracting books from LLMs and reading them that way, then I'd agree that would be piracy. But that'd be such a terrible experience, if it ever works at all, that I can't see it actually happening.
Two things:
I'm sick and tired of this "parrots the works of others" narrative. Here's a challenge for you: go to https://huggingface.co/chat/, input some prompt (for example, "Write a three paragraphs scene about Jason and Carol playing hide and seek with some other kids. Jason gets injured, and Carol has to help him."). And when you get the response, try to find the author that it "parroted". You won't be able to - because it wouldn't just reproduce someone else's already made scene. It'll mesh maaany things from all over the training data in such a way that none of them will be even remotely recognizable.
Is there a meaningful difference between reproducing the work and giving a summary? Because I’ll absolutely be using an AI that I set up and trained myself to filter all the editorial garbage out of news and surface what is meaningful to me, stripped of all advertising, sponsorships, and detectable bias.
You're making two big, incorrect assumptions:
I know these are really tough pills for AI fans to swallow, but you know what they say... "If it seems too good to be true, it probably is."
On the contrary - the reason copyright is called that is because it started as the right to make copies. Since then it's been expanded to include more than just copies, such as distributing derivative works.
But the act of distribution is key. If I wanted to, I could write whatever derivative works in my personal diary.
I also have the right to count the number of occurrences of the letter 'Q' in Harry Potter without Rowling's permission. I can also post my count online for other lovers of 'Q', because it's not derivative (it is 'derived', but 'derivative' is different - according to Wikipedia it means 'includes major copyrightable elements').
Or do more complex statistical analysis.
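To make the counting point concrete, here's a rough sketch in Python. It assumes the text is already loaded as a plain string; the function names and the sample sentence are made up for illustration and aren't taken from any actual book.

```python
from collections import Counter


def count_letter(text: str, letter: str = "q") -> int:
    """Count case-insensitive occurrences of a single letter in text."""
    return text.lower().count(letter.lower())


def letter_frequencies(text: str) -> Counter:
    """Count how often each alphabetic character appears, ignoring case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


# Hypothetical sample sentence standing in for a book's contents.
sample = "The Quidditch queue was quite quiet."

q_count = count_letter(sample)
frequencies = letter_frequencies(sample)

print(q_count)
print(frequencies.most_common(3))
```

The output is just numbers about the text: a count and a frequency table. None of the original wording comes out the other end, which is the distinction being drawn above.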