They aren't public. Agreed on commercial viability: even in Pakistan, businesses are price sensitive. Currently they're priced really cheap (just because they are small).
Yes! The first goal is to get coverage ASAP. I think it will be easy to get dialects in with the current model architecture. The hard part will be LLMs catching up on producing consistent text that respects the linguistics as we drill deeper.
Launched them through the API. From a business perspective, the goal is to get adoption of voice apps in targeted regions. Some companies can now create voice agents, etc.
If you have interest in or insights on specific languages, we would love it if you could fill out this form so we can reach out in the future: https://forms.gle/XA6nZbmBNK5K7GJv5
Hope so! It's great that overall it has a big impact on making knowledge more accessible (e.g., Khan Academy using it to dub their content in minutes instead of weeks). But there are lots of other areas where it applies as well.
At the moment we are focused on making the models available through the API so developers can make some cool things. We are actively monitoring to see if there is an opportunity that we are better positioned to address.
We are planning on hosting an online hackathon soon, so we will suggest these things as ideas!
Btw, I did first try to make it with Pipecat, but I was having some annoying Windows issues getting libraries installed for Daily etc., so I posted something that was easily reproducible for the tutorial...
Hi! Pipecat maintainer here. There is no Windows restriction for Pipecat, in general. The DailyTransport does not support Windows, but works on WSL. Though, you don't have to use the DailyTransport. Pipecat has interchangeable transport support. You can do all of your testing on a free, P2P WebRTC transport (SmallWebRTCTransport, based on aiortc) without system restrictions.
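To make the platform point concrete, here is a rough sketch of how a tutorial script might decide which transport to suggest. It uses only the Python standard library and does not call Pipecat at all: `usable_transports` is a hypothetical helper, and the transport names are plain strings taken from the comment above, not imports.

```python
import platform
from typing import List, Optional


def usable_transports(system: Optional[str] = None) -> List[str]:
    """Return transport names usable on the given OS (hypothetical helper).

    `system` defaults to platform.system(), which reports "Windows" on
    native Windows and "Linux" inside WSL.
    """
    system = system or platform.system()
    # SmallWebRTCTransport has no system restrictions, per the comment above.
    transports = ["SmallWebRTCTransport"]
    # DailyTransport does not support native Windows, but works on WSL,
    # which identifies itself as Linux.
    if system != "Windows":
        transports.append("DailyTransport")
    return transports


if __name__ == "__main__":
    print(usable_transports())
```

On native Windows this leaves only the P2P WebRTC option; under WSL, Linux, or macOS both names are returned.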