Fable 5 is back. It is the smartest model you can point at code right now, and that is exactly why most people are about to waste it. This page is the 1-page token-routing checklist: which model gets which task, the two commands that stop context bloat, the settings that leak tokens in the background, and how to brief Fable so one call does the work of five.
The whole system in one line: use the expensive model at the expensive moment.
The problem nobody talks about
More intelligence does not mean every task deserves the most powerful model. Fable 5 spends more reasoning tokens per request than Sonnet, by design. That is what makes it good. It is also what makes it the worst possible default: when a routine rename or a config edit carries Fable-level reasoning cost, your usage disappears on work Sonnet would have done identically.
The best model is the worst default. Fable should solve your bottlenecks, not autocomplete everything.
1. The 80/20 model router
One routing rule covers almost every session. Switch with /model in Claude Code.
- Sonnet: exploration, scaffolding, routine edits. Reading the codebase, finding files, boilerplate, renames, small fixes, first drafts.
- Fable: architecture, root causes, final verification. Design decisions, the bug three layers deep that Sonnet keeps circling, and the last review pass before you ship.
The mechanism: roughly 80% of coding work is mechanical and any strong model handles it. The other 20% is where model quality actually changes the outcome. Route by that split and your Fable budget lands entirely on the 20% that pays for it.
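The routing rule can be sketched as a small lookup. This is an illustration only: the task labels and the strings "sonnet" and "fable" are hypothetical names chosen for this sketch, not identifiers that /model or any API is known to accept.

```python
# Hypothetical labels for the ~20% of work where model quality changes the outcome.
HARD_TASKS = {"architecture", "root_cause", "final_verification"}


def pick_model(task_kind: str) -> str:
    """Return the model label to switch to for a given kind of task."""
    if task_kind in HARD_TASKS:
        return "fable"
    # Everything else (exploration, scaffolding, renames, small fixes,
    # first drafts, anything unrecognized) stays on the cheaper default.
    return "sonnet"


for kind in ("rename", "exploration", "root_cause"):
    print(f"{kind}: {pick_model(kind)}")
```

The one design choice worth noting is the fallback: unknown work goes to Sonnet, so Fable is only ever reached by an explicit decision.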
2. Clear dead context between tasks
New task? Run /clear.
The mechanism people miss: your entire conversation history gets processed again with every new message. Finished task A an hour ago? Its file reads and dead ends are still riding along on every message you send about task B, and you are paying for them each time. /clear resets the window so task B starts clean.
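A rough way to see the cost of that carried history is to add it up. The sketch below is a deliberately simplified model: it assumes every turn re-reads the full history, and it ignores caching, output tokens and pricing. The token counts are made-up numbers for illustration, not measurements.

```python
def total_input_tokens(carried_tokens: int, turns: int, tokens_per_turn: int) -> int:
    """Input tokens processed over `turns` messages when the whole
    history is re-read on every message (simplified model)."""
    total = 0
    history = carried_tokens
    for _ in range(turns):
        history += tokens_per_turn  # this turn's message and reply join the history
        total += history            # the entire history is processed for this turn
    return total


# Assumed numbers: task A left 40,000 tokens behind; task B takes 10 turns
# of about 2,000 tokens each.
without_clear = total_input_tokens(carried_tokens=40_000, turns=10, tokens_per_turn=2_000)
with_clear = total_input_tokens(carried_tokens=0, turns=10, tokens_per_turn=2_000)

print(without_clear)               # 510000
print(with_clear)                  # 110000
print(without_clear - with_clear)  # 400000
```

Under these assumptions the leftover context from task A is paid for once per turn of task B: 40,000 tokens times 10 turns.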
3. Compact with instructions, not on autopilot
Don't wait for the conversation to get giant and auto-compact to fire. Compact early, and tell it what to keep:
/compact Keep decisions, errors, changed files and next steps.
Default compaction summarizes blind, and it can flatten the one error message or decision you needed. Passing instructions keeps the load-bearing context sharp while the noise gets compressed. Run it at a natural checkpoint, not when the window is already drowning.
4. Turn off the token leaks
Two settings quietly drain usage in the background:
- Effort level. Drop to low or medium effort for routine work. High effort on a rename is pure waste; save the deep-reasoning budget for the problems that need it.
- Web search and MCP servers you're not using. Every connected tool ships its definitions into context, and the model can decide to call them mid-task. If a server isn't part of today's work, disable it. Same for web search when the task is purely local.
5. One-shot the hard part
When you do call Fable, make the call count. One complete brief beats ten vague turns, because every extra round-trip re-processes the whole conversation at Fable prices. Give it:
- The outcome you want, stated plainly
- The constraints: what it can't touch, what has to stay true
- The definition of done: how you'll both know it worked
Then let it plan the path. Fewer turns. Better results. Less waste.
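One way to make the three-part brief a habit is a tiny template helper. This is a minimal sketch: the function name, the section labels and the example task are all hypothetical, and the output is just text to paste into a single message.

```python
def build_brief(outcome: str, constraints: list[str], done_when: list[str]) -> str:
    """Assemble one complete brief: outcome, constraints, definition of done."""
    if not outcome.strip():
        raise ValueError("A brief needs a plainly stated outcome.")
    if not done_when:
        raise ValueError("A brief needs at least one definition-of-done item.")

    lines = ["Outcome:", f"  {outcome.strip()}", "", "Constraints:"]
    # An empty constraint list is allowed, but it is stated explicitly.
    lines += [f"  - {c}" for c in constraints] or ["  - (none)"]
    lines += ["", "Definition of done:"]
    lines += [f"  - {d}" for d in done_when]
    return "\n".join(lines)


# Hypothetical example task, for illustration only.
print(build_brief(
    outcome="Checkout no longer double-charges when the payment call is retried.",
    constraints=["Do not change the public API", "Existing tests must keep passing"],
    done_when=["A new test reproduces the double charge and now passes"],
))
```

The helper refuses to build a brief with no outcome or no definition of done, which are the two parts most often left out of a vague first message.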
The 1-page checklist
FABLE 5 TOKEN ROUTING - nicnonac.com
MODEL ROUTER (80/20)
[ ] Sonnet: exploration, scaffolding, routine edits
[ ] Fable: architecture, root causes, final verification
[ ] Switch with /model. Never default to Fable.
CONTEXT HYGIENE
[ ] /clear at the start of every new task
[ ] /compact Keep decisions, errors, changed files and next steps.
[ ] Compact at checkpoints, before the window bloats
KILL THE LEAKS
[ ] Low/medium effort for routine work
[ ] Disable web search when the task is local
[ ] Disable MCP servers you're not using today
FABLE CALLS
[ ] One-shot brief: outcome + constraints + definition of done
[ ] Let it plan the path. Don't drip-feed.
RULE: use the expensive model at the expensive moment.
Honest limitations
The router is a default, not a law. Some exploration is genuinely hard (untangling a legacy codebase, say) and deserves Fable from the first message. If Sonnet has circled the same problem twice, stop paying for a third lap and escalate.
And /clear is destructive. If the old conversation holds decisions you need, compact first or note them down. Cleared context is gone.