Claude Code has a /fast toggle. It switches the active model to Claude Opus 4.6 with output speed prioritized — same model family Anthropic used for sub-second iterative coding workflows. For some tasks this is a clear win. For others it's a quiet downgrade.
What /fast actually changes
- Token output speed: roughly 2x faster end-to-end.
- Reasoning depth: similar to non-fast for short tasks, but you'll feel the difference on long-horizon planning.
- Total cost per task: usually lower, because shorter wall-clock = fewer tokens spent on filler / re-explaining.
- It only works on Opus 4.6 — toggling it doesn't move you to a smaller model. (Common misconception.)
When to flip it on
Editing tasks where you already know what you want. Renaming a variable across files. Adding a single new endpoint that mirrors an existing one. Writing test cases for a function you've already specced. The output is short, the reasoning is shallow, the win is latency.
Tight feedback loops. You're iterating on UI copy, prompt wording, or a regex. You're going to run Claude 20 times in 10 minutes. Cumulative latency adds up; fast mode pays for itself in the first three iterations.
Tool-heavy workflows where the bottleneck is human review. When Claude calls a Bash command, your time-to-next-decision is gated by Claude's response speed, not its thinking depth. Fast mode shortens that gap.
When to leave it off
New design work. When you're sketching how to architect a system, you want Claude reasoning hard. Fast mode shaves milliseconds off the wrong decisions.
Debugging hard bugs. "Why does this race condition only happen on staging?" — give Claude the slow lane and the bigger reasoning budget.
Long planning sessions. /plan mode benefits from depth. Fast mode is fine for executing the plan, but write the plan in non-fast mode.
Reviews that gate ship decisions. /review and /security-review should be slow and careful, not fast and shallow.
A simple rule
If you'd accept a junior engineer doing this task in 10 seconds, use fast mode. If you'd want a senior engineer to spend 5 minutes thinking, don't.
You can toggle mid-session. Common pattern: /plan (off), approve plan, /fast, do the work, /fast (off), /review. Each phase gets the right depth.
What it isn't
Fast mode is not "quality reduced." It's "output prioritized for speed at the same intelligence tier." The difference shows up:
- On long, branching tasks where depth-first thinking matters.
- On ambiguous prompts where the right answer requires re-reading context multiple times.
- On synthesis ("based on these 8 files, what's the pattern?") more than execution ("apply this rename to those 8 files").
Don't use it as a default; pick it deliberately when the task fits.
Where to go next
- See 10 Claude Code slash commands —
/fastis one of them./plan,/reviewpair naturally with non-fast. - For batch-style work where you want fast outputs at scale, look into the Claude API Batch endpoint — same speed/cost tradeoff at the API tier.
- Pair fast mode with prompt caching for tool-heavy agent loops where the same context repeats.