Limited founding-tier pricing $299/mo locked for life with code FOUNDING50 · Claim your spot →

BeginnerClaude Code

Fast mode: when speed matters more than depth

/fast trades a little reasoning for a lot of latency. Here's when that's the right trade.

Fast mode runs Claude Opus 4.6 with optimized output speed. Same intelligence ceiling for short tasks; the difference shows up at the long-task scale. The rules for when to flip the switch.

6 min read
claude-codeperformanceworkflowfast-mode

Claude Code has a /fast toggle. It switches the active model to Claude Opus 4.6 with output speed prioritized — same model family Anthropic used for sub-second iterative coding workflows. For some tasks this is a clear win. For others it's a quiet downgrade.

What /fast actually changes

  • Token output speed: roughly 2x faster end-to-end.
  • Reasoning depth: similar to non-fast for short tasks, but you'll feel the difference on long-horizon planning.
  • Total cost per task: usually lower, because shorter wall-clock = fewer tokens spent on filler / re-explaining.
  • It only works on Opus 4.6 — toggling it doesn't move you to a smaller model. (Common misconception.)

When to flip it on

Editing tasks where you already know what you want. Renaming a variable across files. Adding a single new endpoint that mirrors an existing one. Writing test cases for a function you've already specced. The output is short, the reasoning is shallow, the win is latency.

Tight feedback loops. You're iterating on UI copy, prompt wording, or a regex. You're going to run Claude 20 times in 10 minutes. Cumulative latency adds up; fast mode pays for itself in the first three iterations.

Tool-heavy workflows where the bottleneck is human review. When Claude calls a Bash command, your time-to-next-decision is gated by Claude's response speed, not its thinking depth. Fast mode shortens that gap.

When to leave it off

New design work. When you're sketching how to architect a system, you want Claude reasoning hard. Fast mode shaves milliseconds off the wrong decisions.

Debugging hard bugs. "Why does this race condition only happen on staging?" — give Claude the slow lane and the bigger reasoning budget.

Long planning sessions. /plan mode benefits from depth. Fast mode is fine for executing the plan, but write the plan in non-fast mode.

Reviews that gate ship decisions. /review and /security-review should be slow and careful, not fast and shallow.

A simple rule

If you'd accept a junior engineer doing this task in 10 seconds, use fast mode. If you'd want a senior engineer to spend 5 minutes thinking, don't.

You can toggle mid-session. Common pattern: /plan (off), approve plan, /fast, do the work, /fast (off), /review. Each phase gets the right depth.

What it isn't

Fast mode is not "quality reduced." It's "output prioritized for speed at the same intelligence tier." The difference shows up:

  • On long, branching tasks where depth-first thinking matters.
  • On ambiguous prompts where the right answer requires re-reading context multiple times.
  • On synthesis ("based on these 8 files, what's the pattern?") more than execution ("apply this rename to those 8 files").

Don't use it as a default; pick it deliberately when the task fits.

Where to go next

  • See 10 Claude Code slash commands/fast is one of them. /plan, /review pair naturally with non-fast.
  • For batch-style work where you want fast outputs at scale, look into the Claude API Batch endpoint — same speed/cost tradeoff at the API tier.
  • Pair fast mode with prompt caching for tool-heavy agent loops where the same context repeats.
Share:

Keep learning

Apply this with Waymaker

Get this article surfaced where you work

Inside Waymaker, this article shows up next to the right Signal page — so the lesson lands when you need it, not before.

No credit card required.