Cédric Rittié


Speech-to-text: talking instead of typing, and why it changes everything

A comparison of the best AI voice dictation tools (Wispr Flow, Spokenly, Voibe) and why speaking your prompts produces better results than typing.

You work with Claude Code, ChatGPT, or any LLM. You know the output quality depends on what you give it. More context, better results. Everyone knows this.

The problem is that giving context through a keyboard is slow. Not hard. Slow. Explaining who you are, what your company does, what the project goal is, what you've already tried, what you definitely don't want. You know all of this by heart. But typing it out properly, sentence by sentence, takes time. So you cut. You shorten. You tell yourself "it'll figure it out". And your prompt arrives thin.

That's exactly what was happening to me. Until I stopped typing.

What is speech-to-text?

You talk, your computer writes. That's it. You hold a key, say what you want, release, and the text appears wherever your cursor is. In any app. A prompt, an email, a note, a Slack message.

Dictation has existed for years on Mac and iPhone. But classic tools transcribe word for word. You hesitate, restart a sentence, rephrase mid-thought, and you end up with a text full of duplicates and false starts. Unusable as is.

What changed recently is a new generation of tools that don't just transcribe. They understand what you mean and produce clean text. It's the difference between a stenographer and someone writing for you.

My setup: Wispr Flow

I use Wispr Flow, on Mac and iPhone. Two shortcuts, that's it:

  • Right Option held down: you talk, release, text appears
  • Right Cmd + Option: hands-free mode, talk as long as you want, tap the combo again to confirm

What makes it different from standard dictation:

  • Automatic reformulation: you hesitate, restart, it only keeps the clean version. No duplicates, no false starts.
  • Context adaptation: it adjusts tone based on the app. More formal in email, more direct in Slack.
  • Custom dictionary: add your industry terms, product names, and it recognizes them.
  • 100+ languages with automatic detection. I switch between French and English without changing a setting.

Why voice dictation changes everything with LLMs

The obvious win is speed. I type at 75-80 words per minute on average. I speak at 200-215. Nearly 3x. For a 300-word context block, that's 4 minutes on a keyboard versus 1 minute 30 out loud.

Keyboard: ~80 wpm · Voice: ~215 wpm
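
If you want to check the arithmetic yourself, here is a minimal Python sketch. The function name and constants are my own illustration, and I'm assuming the low end of each range (75 wpm typing, 200 wpm speaking), which is what gives the 4 minutes versus 1 minute 30 quoted above.

```python
def minutes_needed(word_count: int, words_per_minute: float) -> float:
    """Minutes needed to produce word_count words at a steady pace."""
    return word_count / words_per_minute


CONTEXT_WORDS = 300
TYPING_WPM = 75     # assumption: low end of the 75-80 wpm typing range
SPEAKING_WPM = 200  # assumption: low end of the 200-215 wpm speaking range

typing_minutes = minutes_needed(CONTEXT_WORDS, TYPING_WPM)
speaking_minutes = minutes_needed(CONTEXT_WORDS, SPEAKING_WPM)
speedup = SPEAKING_WPM / TYPING_WPM

print(f"Typing:   {typing_minutes:.1f} min")    # Typing:   4.0 min
print(f"Speaking: {speaking_minutes:.1f} min")  # Speaking: 1.5 min
print(f"Speedup:  {speedup:.1f}x")              # Speedup:  2.7x
```

With the high end of each range (80 and 215 wpm) the times come out at about 3.75 and 1.4 minutes, and the ratio stays around 2.7x.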

But the real change isn't speed. It's what you say when friction disappears.

When you type, you filter. Every word has a cost, even a small one. So you keep it short, stick to the essentials, and skip half the context that would have made your prompt actually good.

When you talk, you unroll. You explain the context, give examples, specify what you don't want. You make mistakes, you correct yourself, and that correction is itself information. Saying "actually no, not like that, more like this" gives the LLM two things: what you want and what you reject. Both matter.

It's a brain dump. You empty your head, and the LLM structures it. Instead of spending time organizing your thoughts before typing, you talk, the tool cleans up, and the AI sorts it out.

A concrete example

Yesterday, I wanted Claude Code to help me structure an article. On a keyboard, I would have typed something like:

Write an article about speech-to-text for my blog. Direct tone, no bullshit. Explain why it's useful with LLMs.

Three lines. Fine, but generic.

By dictating, here's what I produced in 40 seconds:

I need to write an article about speech-to-text for my blog cedricrittie.com. My audience is PMs, marketers, people who already use LLMs but type everything on a keyboard. The tone is direct, factual, like I'm explaining something to a colleague over coffee. No bullshit, no lectures. I want to explain that the real gain isn't speed, it's that you give more context when you talk because you self-censor less. And I want to include a tool comparison because people don't know the alternatives. Oh and no emojis, no em dashes, those are my rules.

Same time, 5x more context. The result on the other end is incomparable.

What I dictate

Prompts and context for Claude Code. Obsidian notes when I have an idea or want to brain dump. Emails, messages: first draft dictated, reviewed once. And especially the big context blocks, the ones where you have to explain who you are, what your company does, the project goal. The kind of text you know by heart but never want to type.

What I don't dictate: code itself (Claude handles that) and 3-word messages. And yes, in an open office, you put on headphones or wait until you're alone. It's a real constraint, but not a deal breaker: most heavy context writing happens when you're focused, not in a meeting.

Best AI voice dictation tools in 2026

Wispr Flow is my pick, but it's not the only option. The speech-to-text market has exploded. Here are the ones worth looking at, depending on what you need.

Wispr Flow (my pick)
AI reformulation · Tone adaptation · Hands-free mode · Custom dictionary
The most polished. Doesn't transcribe what you say, writes what you mean. Command Mode also lets you edit text by voice.
Free (2,000 words/week) · Pro $12/mo (annual) · $15/mo (monthly)
Mac · Windows · iPhone · Android

Spokenly (free, unlimited locally)
AI reformulation · BYOK (GPT, Claude) · MCP for devs
Best free option. Unlimited local models, plug in your own API key for cloud. MCP integration is a bonus for devs.
Free (unlimited local) · Pro $9.99/mo (cloud)
Mac · iPhone

100% offline · Local reformulation
Nothing leaves your machine. Best value if privacy is your top priority.
Free (300 words/day) · $4.90/mo · $99 lifetime
Mac

VoiceInk (open source)
Local Whisper · Per-app config
Built on Whisper, runs locally, no subscription. Raw transcription without reformulation, but honest and fast.
$25 (one-time)
Mac

AI reformulation · 4 platforms
The only other one here covering Mac, Windows, iPhone, and Android with reformulation. A solid alternative if you're in a mixed ecosystem.
Free (4,000 words/week) · Pro $12/mo
Mac · Windows · iPhone · Android

Zero setup · On-device
Free, built-in, on-device on Apple Silicon. No reformulation, but automatic punctuation works well. Good enough for short messages.
Free
Mac · iPhone · iPad

The real point

There's a lot of talk about prompt engineering. Frameworks, templates, techniques. But the best prompt is the one with the most relevant context. And the most natural way to put context in is to talk.

You speak faster than you type. You give more details when you talk. You self-censor less. And the tool cleans up what your voice produces raw.

The keyboard stays for shortcuts. And context is what makes the difference between a generic result and one that actually sounds like you.