So I have the API working, in that I can send audio files and get text back, but what I am looking for is a robust way to add streaming functionality. For example, if there is a short stretch of silence, it should stop recording and send the audio to the API, and so on.
Dunno, but this guy (All About AI) builds one with 'faster-whisper', so perhaps you can get a few pointers there? I believe he chunks the audio on silence. He has a few other speech2x videos. Have fun. https://youtu.be/k6nIxWGdrS4
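For what it's worth, here is a rough sketch of the "chunk on silence" idea in plain Python. It is not taken from the video and is only one simple way to do it: it treats a frame as silent when its RMS amplitude falls below a threshold, and closes a chunk after a run of silent frames. The names `rms` and `chunk_on_silence`, the threshold of 500, and the 10-frame cutoff are all made up for illustration and would need tuning for a real microphone.

```python
import math
from typing import Iterable, Iterator, List, Sequence

Frame = Sequence[int]  # one short block of PCM samples (assumed 16-bit ints)


def rms(frame: Frame) -> float:
    """Root-mean-square amplitude of one frame; 0.0 for an empty frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def chunk_on_silence(
    frames: Iterable[Frame],
    threshold: float = 500.0,     # assumed loudness cutoff, tune per setup
    max_silent_frames: int = 10,  # assumed: silent frames that end a chunk
) -> Iterator[List[Frame]]:
    """Yield lists of frames, each ending where a long enough silence began."""
    chunk: List[Frame] = []
    silent_run = 0
    heard_speech = False

    for frame in frames:
        if rms(frame) >= threshold:
            # Loud frame: start or continue the current chunk.
            heard_speech = True
            silent_run = 0
            chunk.append(frame)
        elif heard_speech:
            # Quiet frame after speech: keep it for now, count the silence.
            silent_run += 1
            chunk.append(frame)
            if silent_run >= max_silent_frames:
                # Drop the trailing silence and hand the chunk over.
                yield chunk[: len(chunk) - silent_run]
                chunk, silent_run, heard_speech = [], 0, False
        # Quiet frames before any speech are simply skipped.

    if heard_speech:
        # Input ended mid-utterance: flush what is left, minus trailing silence.
        yield chunk[: len(chunk) - silent_run]
```

In a live setup the frames would come from the microphone (for example, a few tens of milliseconds of samples each), and every chunk yielded would be joined into one audio buffer and sent to the transcription API. A fixed RMS threshold is fragile in noisy rooms, so a proper voice activity detector may work better there.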