Opus 4.6 Gets a Fast Lane
February 08, 2026
Three days after Opus 4.6 dropped, Anthropic opened a waitlist for fast mode, a research preview that claims up to 2.5x more output tokens per second. Same weights, same capabilities. They're not shipping a distilled model; they're running the real thing with faster inference.
The pricing reflects that: $30 per million input tokens and $150 per million output tokens, six times the standard Opus rate. Past 200K input tokens it jumps to $60/$225. That kind of premium only makes sense if you're burning through agentic loops where latency compounds at every tool call.
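To put the premium in concrete terms, here is a rough cost sketch using the prices quoted above. The function name and the example token counts are made up for illustration, and it assumes a request over the 200K threshold is billed entirely at the higher rate, which the post does not confirm.

```python
# Prices in USD per million tokens, as quoted in this post.
FAST_RATES = (30.0, 150.0)        # (input, output) up to 200K input tokens
FAST_RATES_LONG = (60.0, 225.0)   # (input, output) past 200K input tokens
LONG_CONTEXT_THRESHOLD = 200_000


def fast_mode_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one fast-mode request.

    Assumption: a request whose input exceeds the threshold is billed
    entirely at the higher rate. How the tier is applied is not stated here.
    """
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate, out_rate = FAST_RATES_LONG
    else:
        in_rate, out_rate = FAST_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical agent loop: 40 calls, each with 50K input and 2K output tokens.
per_call = fast_mode_cost(50_000, 2_000)
print(f"per call: ${per_call:.2f}, 40 calls: ${40 * per_call:.2f}")
```

With those invented token counts, each call costs $1.80 and the forty-call loop costs $72.00, so the bill is dominated by input tokens that get re-sent on every round-trip.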
Which is exactly the use case. Claude Code already has a /fast toggle wired in. An agent calling itself forty times to refactor a module doesn't care much about per-token cost; it cares about wall-clock time. Shaving even a second off each round-trip adds up when you're watching a terminal.
One caveat buried in the docs: the speed gains apply to output tokens per second, not time to first token. The thinking pause stays the same. You just get the answer faster once it starts talking.
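A back-of-the-envelope model shows why that caveat matters. The sketch below treats each response as a fixed time to first token plus generation time, and applies the 2.5x figure only to the generation part. Every number in it (the 2-second pause, 1,000 output tokens, 50 tokens per second baseline, 40 calls) is a hypothetical chosen for illustration, not a measurement.

```python
def response_time(ttft_s: float, output_tokens: int, tokens_per_s: float) -> float:
    """Wall-clock seconds for one response: initial pause plus generation time."""
    return ttft_s + output_tokens / tokens_per_s


# All values below are hypothetical, picked only to illustrate the shape.
TTFT_S = 2.0            # time to first token, unchanged by fast mode
OUTPUT_TOKENS = 1_000   # tokens generated per call
BASE_RATE = 50.0        # baseline output tokens per second
SPEEDUP = 2.5           # the "up to" figure, applied to output speed only
CALLS = 40              # round-trips in one agent run

standard = response_time(TTFT_S, OUTPUT_TOKENS, BASE_RATE)
fast = response_time(TTFT_S, OUTPUT_TOKENS, BASE_RATE * SPEEDUP)

print(f"per call: {standard:.1f}s vs {fast:.1f}s")
print(f"{CALLS} calls: {CALLS * standard:.0f}s vs {CALLS * fast:.0f}s")
print(f"effective speedup: {standard / fast:.2f}x")
```

Under these assumptions a call drops from 22 seconds to 10, an effective 2.2x rather than 2.5x, because the fixed pause is untouched. The shorter the responses, the more that pause dominates and the smaller the real-world gain.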
The beta gating — waitlist plus a dedicated header — suggests capacity is still tight. Scaling whatever inference trick powers this to Opus levels isn't a small engineering problem.
Sources:
- Fast Mode Documentation - Anthropic
- What's New in Claude 4.6 - Anthropic