How is this different from using the built-in, free speech-to-text feature on my iPhone or Mac? I can also talk into Voice Memos and get a full transcript, even from crazy long files.
The main differences are transcription quality and what happens after the transcript is generated.
Utter uses GPT-4o Transcribe by default for cloud transcription, and in my experience it’s best in class. The gap is most obvious on names, niche terminology, and technical vocabulary. I use it a lot for prompting coding agents, and I've found Apple’s built-in dictation and most other apps don't come close in terms of accuracy.
It also adds a custom post-processing step, so instead of ending up with a raw transcript, you can record a long, messy voice note and have it turned into clean, structured markdown notes.
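To make the pipeline concrete, here's a minimal sketch of what a transcribe-then-clean-up flow could look like with the OpenAI API. This is illustrative only, not Utter's actual implementation: the `CLEANUP_PROMPT` wording and the choice of `gpt-4o` for the post-processing step are my assumptions; only `gpt-4o-transcribe` comes from the comment above.

```python
# Hypothetical two-step pipeline: transcribe audio, then rewrite the
# raw transcript into structured markdown notes. Requires the `openai`
# package and an OPENAI_API_KEY in the environment.

# Illustrative post-processing instruction (not Utter's real prompt).
CLEANUP_PROMPT = (
    "Rewrite this raw voice-note transcript as clean, structured "
    "markdown notes. Keep every fact and decision; remove filler "
    "words, false starts, and repetition."
)

def transcribe_and_clean(audio_path: str) -> str:
    """Return markdown notes generated from an audio file."""
    from openai import OpenAI  # imported lazily; pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Step 1: cloud transcription with GPT-4o Transcribe.
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="gpt-4o-transcribe",
            file=f,
        ).text

    # Step 2: post-process the raw transcript into markdown notes.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": CLEANUP_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```

The key design point is the separation: the transcription model is optimized for verbatim accuracy, and a second text model handles restructuring, so you can swap the cleanup prompt per use case (meeting notes, coding-agent prompts, to-do lists) without touching the transcription step.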
If you want to test the accuracy difference yourself, try dictating the sentence below with both Apple dictation and ChatGPT's web voice input (which uses the same transcription model), then compare the output:
“My FastAPI service uses Pydantic, Celery, Redis, and SQLAlchemy, but the async worker is deadlocking when a background task retries after a Postgres connection pool timeout.”
Thanks