Sentence structure means it kind of can't happen in real time as such, because you would need to wait until potentially the end of the sentence to get words that appear early in the sentence in an accurate and natural-ish translation. If "20 seconds later" counts as real time, then I guess so, barring run-on sentences, which are much more common in speech than in writing.
you would need to wait until potentially the end of the sentence to get words that appear early in the sentence in an accurate and natural-ish translation
Yandex Browser already does this, but only into Russian. It has about a 10-15 second delay for live streams (at least on YouTube), but it works as well as the auto-generated transcription.