161 comments
  • For anyone with an existing Home Assistant setup, the Home Assistant Voice Preview is a pretty good alternative when it comes to voice control of HA. The setup is very easy. If you want conversational functionality, you can even hook it up to an LLM, cloud or local. It can also be used for media playback, and it's got an aux out port.

    I used to use Google Home Mini for voice control of Home Assistant. The Voice Preview replaced that rather nicely.

  • Which Echo devices ever supported local-only processing? They cost about £30. There's no kit that can do decent voice commands for that money. You'd be lucky to have a device that processes claps to turn the lights on for that.

  • If you traveled back in time and told J. Edgar Hoover that in the future the American public would voluntarily wiretap themselves, he would cream his frilly pink panties.

  • Publicly, that is. They have no doubt been doing it in secret since they launched it.

    • Off-device processing has been the default from day one. The only thing changing is the removal of local processing on certain devices, likely because the new backing AI model will no longer be able to run on that hardware.

    • If you look at the article, it was only ever possible to do local processing with certain devices and only in English. I assume that those are the ones with enough compute capacity to do local processing, which probably made them cost more, and that the hardware probably isn't capable of running whatever models Amazon's running remotely.

      I think that there's a broader problem than Amazon and voice recognition for people who want self-hosted stuff. That is, throwing loads of parallel hardware at something isn't cheap. It's worse if you stick it on every device. Companies --- even aside from not wanting someone to pirate their model running on the device --- are going to have a hard time selling devices with big, costly, power-hungry parallel compute processors.

      What they can take advantage of is that for a lot of tasks, the compute demand is only intermittent. So if you buy a parallel compute card, the cost can be spread over many users.

      I have a fancy GPU that I bought to run LLM stuff; it cost about $1,000. Say I'm doing AI image generation with it 3% of the time. It'd be possible to do that compute on a shared system off in the Internet, and my share of the hardware cost would be about $30. That's a heckofa big improvement.

      And the situation that they're dealing with is even larger, since there might be multiple devices in a household that want to do parallel-compute-requiring tasks. So now you're talking about maybe $1k in hardware for each of them, not to mention the supporting hardware like a beefy power supply.
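
      Back-of-the-envelope version of that (the $1,000 and 3% figures are the ones above; the device count is made up):

          gpu_cost = 1000.0     # what the dedicated card cost
          duty_cycle = 0.03     # fraction of time it's actually busy

          # Split a shared card's cost in proportion to usage:
          my_share = gpu_cost * duty_cycle
          print(f"shared: ~${my_share:.0f} vs. dedicated: ${gpu_cost:.0f}")

          # One household, several devices that each "want" heavy compute:
          devices = 4                                  # made-up number
          card_in_every_device = devices * gpu_cost    # $4000
          one_household_node = gpu_cost                # $1000, shared over the LAN
          print(f"per-device: ${card_in_every_device:.0f}, "
                f"single node: ${one_household_node:.0f}")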

      This isn't specific to Amazon. Like, this is true of all devices that want to take advantage of heavyweight parallel compute.

      I think that one thing that might be worth considering for the self-hosted world is the creation of a hardened parallel compute node that exposes its services over the network. In a scenario like that, you would have one (well, or more, but it could just be one) device that provides generic parallel compute services. Then your smaller, weaker, lower-power devices --- phones, Alexa-type speakers, whatever --- make use of it over your network, using a generic API.

      There are some issues that come with this. The node needs to be hardened and can't leak information from one device to another. Some tasks require storing a lot of state --- AI image generation, for instance, requires uploading a large model, and you want to cache that. If you have, say, two parallel compute cards/servers, you want to use them intelligently, keeping the model loaded on one of them insofar as is reasonable to avoid needing to reload it. Some tasks are very latency-sensitive --- like voice recognition --- and some, like image generation, are amenable to batch use, so some kind of priority system is probably warranted. So there are some technical problems to solve.
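
      None of this exists, but just to make "generic API" concrete, a speaker-type device might talk to the household node with something like the sketch below (the host name, endpoint, and JSON fields are all invented):

          import json
          import urllib.request

          # Hypothetical household compute node; the host, path, and fields
          # here are purely illustrative.
          NODE = "http://compute-node.local:8080"

          def submit_task(kind, payload, priority="interactive"):
              """Send one task to the shared node.

              priority: "interactive" for latency-sensitive work (voice),
                        "batch" for things like image generation.
              """
              body = json.dumps({"kind": kind, "priority": priority,
                                 "payload": payload}).encode()
              req = urllib.request.Request(
                  f"{NODE}/v1/tasks", data=body,
                  headers={"Content-Type": "application/json"})
              with urllib.request.urlopen(req) as resp:
                  return json.load(resp)

          # A speaker might ask for transcription roughly like this:
          #   submit_task("speech-to-text", {"audio_ref": "clip-1234"})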

      But otherwise, the only real option for heavy parallel compute is going to be sending your data out to the cloud. And even if you don't care about the privacy implications or the possibility of a company going under, as I saw some home automation person once point out, you don't want your light switches to stop working just because your Internet connection is out.

      Having per-household self-hosted parallel compute on one node is still probably more-costly than sharing parallel compute among users. But it's cheaper than putting parallel compute on every device.

      Linux has some highly-isolated computing environments like seccomp that might be appropriate for implementing the compute portion of such a server, though I don't know whether it's too-restrictive to permit running parallel compute tasks.
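
      For what it's worth, strict-mode seccomp is almost certainly too restrictive for this: it only leaves read, write, _exit, and sigreturn available, and GPU work needs at least ioctl and mmap on the device files, so you'd want filter mode with an allowlist instead. A tiny Linux-only illustration of how confining strict mode is:

          import ctypes, os

          # Strict-mode seccomp: after this prctl() the thread may only make
          # read, write, _exit, and sigreturn syscalls; anything else is SIGKILL.
          PR_SET_SECCOMP = 22
          SECCOMP_MODE_STRICT = 1

          libc = ctypes.CDLL(None, use_errno=True)
          if libc.prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) != 0:
              raise OSError(ctypes.get_errno(), "prctl(PR_SET_SECCOMP) failed")

          os.write(1, b"still alive inside the sandbox\n")   # write() is allowed
          # Even a clean interpreter exit dies here (exit_group() isn't on the
          # allowlist), and anything touching /dev/nvidia* needs ioctl()/mmap(),
          # so a real compute server would use SECCOMP_MODE_FILTER with a BPF
          # allowlist rather than strict mode.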

      In such a scenario, you'd have a "household parallel compute server", in much the way that one might have a "household music player" hooked up to a house-wide speaker system running something like mpd or a "household media server" providing storage of media, or suchlike.

  • Amazon employee with no piss breaks listening in on my Echo:

    "How many fucking cats does this guy have? Just choose one name and call it that!"

    Edit: "I don't know, Jeff, sell him a fucking Dr. Seuss book or something, the guy's mental."

  • I don't think Google Home listens in.

    Because I'd absolutely be disappeared by now if it did.

  • Everything you say to your Echo...

    I don't have an Echo.
