Speech-to-text: talking instead of typing, and why it changes everything

You work with Claude Code, ChatGPT, or any LLM. You know the output quality depends on what you give it. More context, better results. Everyone knows this.

The problem is that giving context through a keyboard is slow. Not hard. Slow. Explaining who you are, what your company does, what the project goal is, what you've already tried, what you definitely don't want. You know all of this by heart. But typing it out properly, sentence by sentence, takes time. So you cut. You shorten. You tell yourself "it'll figure it out". And your prompt arrives thin.

That's exactly what was happening to me. Until I stopped typing.

What is speech-to-text?

You talk, your computer writes. That's it. You hold a key, say what you want, release, and the text appears wherever your cursor is. In any app. A prompt, an email, a note, a Slack message.

Dictation has existed for years on Mac and iPhone. But classic tools transcribe word for word. You hesitate, restart a sentence, rephrase mid-thought, and you end up with a text full of duplicates and false starts. Unusable as is.

What changed recently is a new generation of tools that don't just transcribe. They understand what you mean and produce clean text. It's the difference between a stenographer and someone writing for you.

My setup: Wispr Flow

I use Wispr Flow, on Mac and iPhone. Two shortcuts, that's it:

Right Option (⌥), held down: you talk, release, and the text appears
Right Cmd + Option (⌘ + ⌥): hands-free mode, talk as long as you want, tap the combo again to confirm

What makes it different from standard dictation:

Automatic reformulation: you hesitate, restart, it only keeps the clean version. No duplicates, no false starts.
Context adaptation: it adjusts tone based on the app. More formal in email, more direct in Slack.
Custom dictionary: add your industry terms, product names, and it recognizes them.
100+ languages with automatic detection. I switch between French and English without changing a setting.

Why voice dictation changes everything with LLMs

The obvious win is speed. I type at 75-80 words per minute on average. I speak at 200-215. Nearly 3x. For a 300-word context block, that's 4 minutes on a keyboard versus 1 minute 30 out loud.

Keyboard: ~80 wpm · Voice: ~215 wpm
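If you want to check the arithmetic, here is a minimal Python sketch. It assumes the low end of each range (75 wpm typing, 200 wpm speaking), which is what gives the 4 minutes and 1 minute 30 above. The function name `minutes_needed` is made up for the illustration.

```python
# Rough time estimate for a context block, using the article's own figures.
# Assumption: the low end of each quoted range is used for both speeds.

def minutes_needed(words: int, words_per_minute: float) -> float:
    """Return how many minutes it takes to produce `words` at a given pace."""
    return words / words_per_minute

CONTEXT_WORDS = 300
TYPING_WPM = 75      # low end of the 75-80 wpm typing range
SPEAKING_WPM = 200   # low end of the 200-215 wpm speaking range

typing_minutes = minutes_needed(CONTEXT_WORDS, TYPING_WPM)
speaking_minutes = minutes_needed(CONTEXT_WORDS, SPEAKING_WPM)
speedup = SPEAKING_WPM / TYPING_WPM

print(
    f"Typing: {typing_minutes:.1f} min, "
    f"speaking: {speaking_minutes:.1f} min, "
    f"speedup: {speedup:.2f}x"
)
# Typing: 4.0 min, speaking: 1.5 min, speedup: 2.67x
```

The ratio comes out around 2.7, which is where the rounded "nearly 3x" comes from.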

But the real change isn't speed. It's what you say when friction disappears.

When you type, you filter. Every word has a cost, even a small one. So you keep it short, stick to the essentials, and skip half the context that would have made your prompt actually good.

When you talk, you unroll. You explain the context, give examples, specify what you don't want. You make mistakes, you correct yourself, and that correction is itself information. Saying "actually no, not like that, more like this" gives the LLM two things: what you want and what you reject. Both matter.

It's a brain dump. You empty your head, and the LLM structures it. Instead of spending time organizing your thoughts before typing, you talk, the tool cleans up, and the AI sorts it out.

A concrete example

Yesterday, I wanted Claude Code to help me structure an article. On a keyboard, I would have typed something like:

Write an article about speech-to-text for my blog. Direct tone, no bullshit. Explain why it's useful with LLMs.

Three lines. Fine, but generic.

By dictating, here's what I produced in 40 seconds:

I need to write an article about speech-to-text for my blog cedricrittie.com. My audience is PMs, marketers, people who already use LLMs but type everything on a keyboard. The tone is direct, factual, like I'm explaining something to a colleague over coffee. No bullshit, no lectures. I want to explain that the real gain isn't speed, it's that you give more context when you talk because you self-censor less. And I want to include a tool comparison because people don't know the alternatives. Oh and no emojis, no em dashes, those are my rules.

Forty seconds, 5x more context. The result on the other end is in a different league.
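The 5x figure is easy to check by counting words in the two prompts. Here is a small Python sketch that does it. Splitting on whitespace is a crude word count, and the helper name `word_count` is only there for the illustration.

```python
# Compare the typed prompt and the dictated prompt from the example above.
# Assumption: a "word" is anything separated by whitespace.

typed_prompt = " ".join([
    "Write an article about speech-to-text for my blog.",
    "Direct tone, no bullshit.",
    "Explain why it's useful with LLMs.",
])

dictated_prompt = " ".join([
    "I need to write an article about speech-to-text for my blog cedricrittie.com.",
    "My audience is PMs, marketers, people who already use LLMs but type everything on a keyboard.",
    "The tone is direct, factual, like I'm explaining something to a colleague over coffee.",
    "No bullshit, no lectures.",
    "I want to explain that the real gain isn't speed, it's that you give more context when you talk because you self-censor less.",
    "And I want to include a tool comparison because people don't know the alternatives.",
    "Oh and no emojis, no em dashes, those are my rules.",
])

def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())

typed_words = word_count(typed_prompt)
dictated_words = word_count(dictated_prompt)
ratio = dictated_words / typed_words

DICTATION_SECONDS = 40  # duration quoted in the article
dictation_wpm = dictated_words * 60 / DICTATION_SECONDS

print(f"Typed: {typed_words} words, dictated: {dictated_words} words")
print(f"Ratio: {ratio:.1f}x, dictation pace: {dictation_wpm:.0f} wpm")
```

The ratio lands a little above 5, in line with the claim.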

What I dictate

Prompts and context for Claude Code. Obsidian notes when I have an idea or want to brain dump. Emails, messages: first draft dictated, reviewed once. And especially the big context blocks, the ones where you have to explain who you are, what your company does, the project goal. The kind of text you know by heart but never want to type.

What I don't dictate: code itself (Claude handles that) and 3-word messages. And yes, in an open office, you put on headphones or wait until you're alone. It's a real constraint, but not a deal breaker: most heavy context writing happens when you're focused, not in a meeting.

Best AI voice dictation tools in 2026

Wispr Flow is my pick, but it's not the only option. The speech-to-text market has exploded. Here are the ones worth looking at, depending on what you need.

Wispr Flow (my pick)

AI reformulation · Tone adaptation · Hands-free mode · Custom dictionary

The most polished. Doesn't transcribe what you say, writes what you mean. Command Mode also lets you edit text by voice.

Free (2,000 words/week) · Pro $12/mo (annual) · $15/mo (monthly)

Mac · Windows · iPhone · Android

Spokenly (free, unlimited locally)

AI reformulation · BYOK (GPT, Claude) · MCP for devs

Best free option. Unlimited local models, plug in your own API key for cloud. MCP integration is a bonus for devs.

Free (unlimited local) · Pro $9.99/mo (cloud)

Mac · iPhone

Voibe

100% offline · Local reformulation

Nothing leaves your machine. Best value if privacy is your top priority.

Free (300 words/day) · $4.90/mo · $99 lifetime

Mac

VoiceInk (open source)

Local Whisper · Per-app config

Built on Whisper, runs locally, no subscription. Raw transcription without reformulation, but honest and fast.

$25 (one-time)

Mac

Typeless

AI reformulation · 4 platforms

Along with Wispr Flow, one of the two here covering Mac, Windows, iPhone, and Android with reformulation. A solid alternative if you're in a mixed ecosystem.

Free (4,000 words/week) · Pro $12/mo

Mac · Windows · iPhone · Android

Apple Dictation

Zero setup · On-device

Free, built-in, on-device on Apple Silicon. No reformulation, but automatic punctuation works well. Good enough for short messages.

Free

Mac · iPhone · iPad
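To get a feel for what the capped free tiers actually cover, you can convert the word quotas into minutes of speech. The Python sketch below does that with the quotas listed above. It assumes a 200 wpm speaking pace (the low end of my range) and that Voibe's daily quota is fully used every day of the week. The variable names are illustrative.

```python
# Convert free-tier word quotas into approximate minutes of speech per week.
# Assumptions: 200 wpm speaking pace; Voibe's 300 words/day used 7 days a week.

SPEAKING_WPM = 200

free_tier_words_per_week = {
    "Wispr Flow": 2_000,   # 2,000 words/week
    "Voibe": 300 * 7,      # 300 words/day
    "Typeless": 4_000,     # 4,000 words/week
}

minutes_per_week = {
    tool: words / SPEAKING_WPM
    for tool, words in free_tier_words_per_week.items()
}

for tool, minutes in minutes_per_week.items():
    words = free_tier_words_per_week[tool]
    print(f"{tool}: {words} words/week, about {minutes:.1f} min of speech")
# Wispr Flow: 2000 words/week, about 10.0 min of speech
# Voibe: 2100 words/week, about 10.5 min of speech
# Typeless: 4000 words/week, about 20.0 min of speech
```

Ten to twenty minutes of speech a week is enough to try the habit, but probably not enough if you dictate big context blocks every day.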

The real point

There's a lot of talk about prompt engineering. Frameworks, templates, techniques. But the best prompt is the one with the most relevant context. And the most natural way to put context in is to talk.

You speak faster than you type. You give more details when you talk. You self-censor less. And the tool cleans up what your voice produces raw.

The keyboard stays for shortcuts. And context is what makes the difference between a generic result and one that actually sounds like you.

Cédric Rittié