Since announcing a beta tool last year allowing self-published authors to generate AI "virtual voice" narrations for their ebooks, over 40,000 AI-narrated titles have flooded onto Audible,...
I definitely do not want to support this practice, but there's no way to filter these out 😠.
One blogger cited in the report claimed converting an ebook to audio using the AI narration took just 52 minutes.
This does not inspire confidence. The technology is there to do this very well, but it takes skill and effort. The technology to automate it end to end with high quality does not yet exist.
52 minutes. That's maybe 1/10th the time it would take to listen to it. I wonder how many of these 40,000 books were even proof-listened once.
Honestly, I don't really care if the LLM can spit out a perfect replica of Stephen Fry with every inflection and intonation possible and in the correct spots.
Tools like these can and will be used to take jobs from actual voice actors. I want no part of it.
I get where you're coming from, but it doesn't sit quite right with me. The whole point of technology is to save human time and effort. That should be a good thing. The problem is the capitalist hellscape that is the status quo. I don't think we should put the onus of propping up that capitalist hellscape onto book authors. I mean, maybe that's the easiest way to maintain the status quo, but the status quo was never sustainable in the first place.
I don't know. This is not a fully fleshed out philosophy. At some level I'm sure it's the same old idealism-vs-pragmatism debate.
Yep, it's already happening. I did freelance voice work for a client for a while but was replaced by a voice model because it's vastly cheaper, even if the output is also proportionally worse.
I agree with the job loss part, but it seems like a really weak argument. What about the increased deals for the author? Many steps in progress lead to job losses, because the world changes. What's important is to do it in a responsible manner, and I think that's where Amazon is failing.
I very much understand the misgivings about this, and certain parts make me uncomfortable with it, too. But this could be revolutionary for media accessibility, and in my mind could easily be worth it for the ability to make new media immediately accessible to folks with vision challenges, deaf and hard of hearing individuals, and a lot of other folks for whom most media is not easily interactive/accessible. For many people in this situation, you wait months after a traditional version of something is published before an accessible version is released, if it ever is. Often, it's just not seen as worth a publisher's time to make their content accessible to an audience they don't see as significantly profitable.
Like the printing press took jobs from scribes, but had far more significant impacts democratizing information and education, so might AI in the long run.
But this could be revolutionary for media accessibility, and in my mind could easily be worth it for the ability to make new media immediately accessible to folks with vision challenges, deaf and hard of hearing individuals, and a lot of other folks for whom most media is not easily interactive/accessible
As an accessibility add-on / upgrade to standard TTS, sure. Sounds great, even. But I will not accept soulless, robotic, AI-generated voices for something being sold as an audiobook. I just won't.
What if we sweeten the deal and allow you to choose the voice actor on the fly? Want Star Wars novels read by James Earl Jones, or Tolkien read by Arnold Schwarzenegger? You can have that with AI voices.
"I will not accept soulless robotic machine-generated typescript being sold as a book. I just won't." -Some hipster arguing against the printing press
Not denying that audiobook performers have real, valuable talent (and should be fairly paid for it), but when was the last time you paid a premium for a handwritten novel?
What if Amazon sold TTS voice packages that can read any novel in your catalogue? "Hello yes I would like the James Earl Jones voice package and every Star Wars book ever written please." Meanwhile, the existing audiobook storefront would still carry only audiobooks read by real people, protecting their jobs.
Have y’all ever tried listening to an audiobook but had to give up because you didn’t like the reader? Imagine being able to choose the voice, the accent, the rhythm, the speed.
Imagine a future where you could train a model to read it in your own voice so that you could read to your loved ones even after you’ve passed.
Yeah but I’m talkin’ any book. Not just the ones I could take the time to record myself reading. Imagine anyone being able to have any book read to them in any voice, including my deceased grandmother’s. We have some recordings of her voice that would be enough to train AI so I could have her read me any book. That’s what I’m talkin’ about.
Yes, I have. And I don't want to choose every detail of a performance no matter how simple.
I want to hear a performance independent of my own experience. I want to hear something that I didn't know I wanted.
AI can only show you an elaborate mixture of the data it was trained on and a set of instructions. It cannot make decisions, iterate on experiences, or have creatively beneficial mistakes like a human does.
Considering how good the technology is now and how it will continue to improve, I think we’ll soon have a hard time telling the difference. I can’t see the value in having a human spend hours reading and recording and editing when a program will be able to do it almost instantly in the near future.