Home Assistant has a fully local voice assistant experience that's very pluggable and customisable. I believe it uses a faster Whisper model for STT and Piper for TTS.
You can run it on a Raspberry Pi (or ideally an N100 or better), and for the microphone/speaker part, you can make your own or buy their off-the-shelf voice hardware, which works really well.
Unfortunately, I couldn't figure out how to get their hardware to work without an HA installation. I'd really love to do that, so if anyone has any info on how their protocol works, please do tell.
I looked at their Wyoming docs online but couldn't see how to even get the device to find the server, and the ESPHome firmware it runs offered similarly few hints.
https://www.home-assistant.io/voice-pe/