Hacker News: toebee's comments

Thanks for the kind words :)))


We'll try to give a high-level overview when we publish the technical report!


Thank you for the kind words! We don't have plans for that yet, but you can always open an issue or PR on GitHub.


We have a ZeroGPU Space provided by HuggingFace up and running! Test it now on https://huggingface.co/spaces/nari-labs/Dia-1.6B


The examples on your site are impressive, but I'm having trouble getting good results on HF - it's generating a lot of near-silence (often nothing but) and when it does produce speech it bears no resemblance to the audio prompt and only produces parts of the text prompt. Would you suggest any adjustments to the default parameters to improve adherence, or might I expect better results running locally? Thanks!


Thank you for the contribution! We'll be merging PRs and cleaning code up very soon :)


Sorry for the confusion. The license is plain Apache 2.0, and we changed the wording to "intended for research and educational use." The point is that users are free to use it for their own use cases; just don't do shady stuff with it.

Thanks for the feedback :)


So is that actually part of the license (making it non-Apache 2.0), or not?


Not part of the license!


We are in the process of fixing it! Thanks for letting us know :)


We use the Descript Audio Codec (DAC)! I’m not sure if DAC works on iOS…


Thank you for the kind words! Dia wasn’t fine-tuned on a specific speaker, so you will get a random voice every time you run it, unless you add an audio prompt or fix the seed.

The outputs are a bit unstable; we might need to add cleaner training data and train for longer. Hopefully we can do something like OpenAI's Whisper and release better-performing checkpoints over time!
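
For anyone wondering what "fix the seed" looks like in practice, here is a minimal sketch, not Dia's actual API. It assumes sampling goes through Python's and (if installed) PyTorch's global random number generators. `set_seed` and `sample_tokens` are hypothetical names made up for this illustration; `sample_tokens` is only a stand-in for a model's sampling step.

```python
import random


def set_seed(seed: int) -> None:
    """Seed the global RNGs so repeated runs draw the same samples."""
    random.seed(seed)
    try:
        # Assumption: the model samples via PyTorch's global generator.
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        # PyTorch is not installed; only Python's RNG is seeded.
        pass


def sample_tokens(n: int) -> list[int]:
    """Hypothetical stand-in for a model's stochastic sampling step."""
    return [random.randrange(1024) for _ in range(n)]


set_seed(42)
first = sample_tokens(5)
set_seed(42)
second = sample_tokens(5)
print(first == second)  # same seed, same draws
```

Calling `set_seed` with the same value right before each generation should give the same random draws, and therefore the same voice, as long as the text, parameters, and hardware stay the same. Some GPU kernels are nondeterministic, so treat this as best effort.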


Thank you!! Indeed, the script was inspired by a scene in The Office.

