
Interesting! I'd love to see a thorough comparison with the Amazon Polly service...

https://aws.amazon.com/polly/

Polly is priced at $4 per million characters, while the Google WaveNet voices are $16 per million (Google's non-WaveNet voices are also $4).

After listening to a few samples from each service, the voice quality and prosody modeling seem roughly on par between Polly and WaveNet, or at least the differences I heard didn't seem to justify a 4x price multiplier.

But I'd love to hear an informed opinion from someone with more expertise...
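To make the price gap concrete, here is a minimal sketch of the arithmetic using only the per-million-character prices quoted above. The dictionary keys, the function names, and the 250,000-character workload are made up for illustration and are not part of either service's API; actual bills may differ (free tiers, rounding, and so on).

```python
# Prices in USD per one million characters, as quoted in the comment above.
# The keys are illustrative labels, not official product identifiers.
PRICE_PER_MILLION_CHARS = {
    "polly": 4.00,
    "google_standard": 4.00,
    "google_wavenet": 16.00,
}


def synthesis_cost(num_chars, service):
    """Return the cost in USD of synthesizing num_chars characters."""
    return num_chars / 1_000_000 * PRICE_PER_MILLION_CHARS[service]


def price_ratio(service, baseline):
    """Return how many times more expensive service is than baseline."""
    return PRICE_PER_MILLION_CHARS[service] / PRICE_PER_MILLION_CHARS[baseline]


if __name__ == "__main__":
    chars = 250_000  # hypothetical monthly volume, chosen only for illustration
    for name in PRICE_PER_MILLION_CHARS:
        print(f"{name}: ${synthesis_cost(chars, name):.2f}")
    print("WaveNet vs Polly:", price_ratio("google_wavenet", "polly"))
```

At that assumed volume the absolute difference is only a few dollars, so the multiplier matters mostly for high-volume use.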



A lot of voice generation is a cost center (call centers that are outsourced to the cheapest location) with short sentences. I doubt the industry would pay a 4x price multiplier for that use case.

So in fact WaveNet competes more for voiceover work and newer use cases such as voice assistants. Still, I don't hear much of a difference there today, but maybe WaveNet will reach human-level quality sooner than the other models.


To me, Polly is way behind WaveNet when it comes to realism. Polly is robotic, WaveNet is fluid.



