Audio synthesis speed is one thing, but is the output _intelligible to a human_ at 1,000 wpm? That's the sort of thing Eloquence is being used for, according to the article.
TTS has no intelligence, bud. It's only something that transforms text to audio, and that is all we are talking about here. Neither the article nor anyone else was discussing the whole STT > LLM > TTS pipeline.