Hacker News | davidz's comments


There's no reason to put "open stack" in quotes. The code is available under Apache-2.0: https://github.com/livekit/livekit


Thanks for the feedback. Let us see how we can organize this better for compatibility with different LLMs.


It's really cool to see how you guys are using the voice AI stack to overcome language barriers.

(btw I work at LiveKit, so let me know if we could make Agents easier to use for your use case.)


I'm working on a PR now :)


Currently it does: all audio is sent to the model.

However, we are working on turn detection within the framework, so you won't have to send silence to the model when the user isn't talking. It's a fairly straightforward path to cutting the cost by ~50%.
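
To make the idea concrete, here is a minimal sketch of gating audio so that silent frames are never forwarded to the model. It is not the framework's implementation: it uses a plain RMS energy threshold as a stand-in for a real VAD, and the names `gate_speech`, `threshold`, and `hangover`, along with their default values, are assumptions made up for illustration.

```python
import math
from typing import Iterable, Iterator, List

Frame = List[float]  # one frame of PCM samples, assumed to be in [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_speech(
    frames: Iterable[Frame],
    threshold: float = 0.02,  # illustrative value, not a tuned one
    hangover: int = 3,        # quiet frames still forwarded after speech
) -> Iterator[Frame]:
    """Yield only frames that look like speech, plus a short tail."""
    quiet_run = hangover  # start closed: treat the stream as silent at first
    for frame in frames:
        if rms(frame) >= threshold:
            quiet_run = 0
            yield frame
        elif quiet_run < hangover:
            quiet_run += 1
            yield frame  # keep a short tail so word endings are not clipped
        # otherwise: sustained silence, drop the frame


if __name__ == "__main__":
    speech = [0.1] * 160
    silence = [0.0] * 160
    stream = [silence] * 5 + [speech] * 4 + [silence] * 10 + [speech]
    sent = list(gate_speech(stream))
    print(f"forwarded {len(sent)} of {len(stream)} frames")
```

How much this saves depends entirely on how much of the session is silence; the ~50% figure above is the estimate from the comment, not something this sketch measures.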


Working on this for an internal tool - detecting no speech has been a PITA so far. Interested to see how you go with this.


Use the voice activity detector we wrote for Home Assistant. It works very well: https://github.com/rhasspy/pymicro-vad


What if I'm watching TV and use the AI to control it? It should only react to my voice (a problem I had that forced me to use a wake word).


Currently we are using Silero VAD to detect speech: https://github.com/livekit/agents/blob/main/livekit-plugins/...

It works well for voice activity, though it doesn't always detect end-of-turn correctly (humans often pause mid-sentence to think). We are working on improving this behavior.
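
A rough sketch of why mid-sentence pauses are hard: if end-of-turn is declared after a fixed stretch of silence, a short timeout splits a single turn at the pause, while a long timeout adds latency to every reply. The function below is hypothetical (`end_of_turn_frames`, `frame_ms`, and `min_silence_ms` are names invented here) and operates on per-frame speech flags such as a VAD might produce; it is not how the Silero plugin or the framework decides turns.

```python
from typing import Iterable, List


def end_of_turn_frames(
    is_speech: Iterable[bool],
    frame_ms: int = 20,         # assumed frame duration
    min_silence_ms: int = 500,  # assumed silence needed to end a turn
) -> List[int]:
    """Return the frame indices at which an end-of-turn is declared."""
    needed = max(1, min_silence_ms // frame_ms)
    ends: List[int] = []
    in_turn = False
    silent = 0
    for i, speech in enumerate(is_speech):
        if speech:
            in_turn = True
            silent = 0  # any speech resets the silence counter
        elif in_turn:
            silent += 1
            if silent >= needed:
                ends.append(i)
                in_turn = False
                silent = 0
    return ends


if __name__ == "__main__":
    # 200 ms of speech, a 300 ms thinking pause, 200 ms more speech, then silence.
    flags = [True] * 10 + [False] * 15 + [True] * 10 + [False] * 40
    print(end_of_turn_frames(flags, min_silence_ms=200))  # pause splits the turn
    print(end_of_turn_frames(flags, min_silence_ms=500))  # pause is tolerated
```

With the 200 ms timeout the 300 ms pause is treated as the end of the turn; with 500 ms the speaker is allowed to finish, at the price of waiting half a second after every utterance.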


Can I currently put a VAD module in the pipeline and only send audio when there is an active conversation? I feel like that alone would solve the problem.


only if you buy enough tokens.


Speculating here, but I would read this as "anycast" as a concept, where each user is connected to the closest location, versus anycast as in the IP protocol. The complexity of routing each UDP packet to a different server within the same session far outweighs the benefits.


Cloudflare uses Anycast for the TCP connections they terminate. See e.g. https://blog.cloudflare.com/magic-transit-network-functions/ or ponder DNS-over-HTTPS to 1.1.1.1

I don't think they've talked much about what happens if a connection gets routed to a different PoP mid-stream.


Thanks Roshan!


Arcas is fantastic! They are doing some really great work with WebRTC.

