Hacks·4 min read·Jul 2, 2026

The Fable 5 Token-Routing Checklist: Use the Expensive Model at the Expensive Moment

Fable 5 is back, and most people will burn their limit in a day by defaulting to it for everything. This is the 1-page routing system: Sonnet handles exploration, scaffolding, routine edits. Fable gets architecture, root causes, final verification. Plus the four habits that stop the silent drain: /clear between tasks, /compact with instructions, effort control, and shutting off the web search and MCP servers you're not using.

Fable 5 is back. It is the smartest model you can point at code right now, and that is exactly why most people are about to waste it. This page is the 1-page token-routing checklist: which model gets which task, the two commands that stop context bloat, the settings that leak tokens in the background, and how to brief Fable so one call does the work of five.

The whole system in one line: use the expensive model at the expensive moment.

The problem nobody talks about

A more capable model does not make every task worth its price. Fable 5 spends more reasoning tokens per request than Sonnet, by design. That is what makes it good. It is also what makes it the worst possible default: when a routine rename or a config edit carries Fable-level reasoning cost, your usage disappears on work Sonnet would have done just as well.

The best model is the worst default. Fable should solve your bottlenecks, not autocomplete everything.

1. The 80/20 model router

One routing rule covers almost every session. Switch with /model in Claude Code.

  • Sonnet: exploration, scaffolding, routine edits. Reading the codebase, finding files, boilerplate, renames, small fixes, first drafts.
  • Fable: architecture, root causes, final verification. Design decisions, the bug three layers deep that Sonnet keeps circling, and the last review pass before you ship.

The mechanism: as a rule of thumb (not a measured figure), roughly 80% of coding work is mechanical, and any strong model handles it. The other 20% is where model quality actually changes the outcome. Route by that split and your Fable budget lands on the 20% that pays for it.
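If it helps to see the rule written down, here is a minimal sketch of the router as a lookup. The task names, the `route` function, and the model labels are all made up for illustration; nothing here calls Claude Code or any API. The two-failed-attempts escalation is the same rule of thumb this guide gives under its limitations.

```python
# Hypothetical task categories, taken from the two bullets above.
# The returned labels are plain strings, not real model identifiers.
SONNET_TASKS = {
    "exploration", "scaffolding", "routine_edit",
    "boilerplate", "rename", "small_fix", "first_draft",
}
FABLE_TASKS = {"architecture", "root_cause", "final_verification"}


def route(task_kind: str, failed_attempts: int = 0) -> str:
    """Pick a model label for a task under the 80/20 rule."""
    if task_kind in FABLE_TASKS:
        return "fable"
    # If the cheaper model has already circled the problem twice,
    # escalate instead of paying for a third lap.
    if failed_attempts >= 2:
        return "fable"
    # Everything else, including unknown task kinds, stays on the cheap default.
    return "sonnet"


print(route("rename"))                     # sonnet
print(route("architecture"))               # fable
print(route("small_fix", failed_attempts=2))  # fable
```

The point of the design is that the cheap model is the fallthrough: a task has to earn its way to Fable, never the other way round.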

2. Clear dead context between tasks

New task? Run /clear.

The mechanism people miss: your entire conversation history gets processed again with every new message. Finished task A an hour ago? Its file reads and dead ends are still riding along on every message you send about task B, and you are paying for them each time. /clear resets the window so task B starts clean.
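To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a simplified model where every new message re-sends the full history, and it ignores prompt caching or any other discount your plan may apply. The 40,000-token leftover and the 2,000-token turns are illustrative figures, not measurements.

```python
def total_input_tokens(turn_sizes, carried_over=0):
    """Sum the tokens processed across a conversation in which each
    new message re-sends everything that came before it."""
    history = carried_over
    total = 0
    for size in turn_sizes:
        history += size   # the new turn joins the history
        total += history  # and the whole history is processed again
    return total


# Task B: ten turns of about 2,000 tokens each (illustrative numbers).
task_b_turns = [2_000] * 10

with_stale_context = total_input_tokens(task_b_turns, carried_over=40_000)
after_clear = total_input_tokens(task_b_turns)

print(with_stale_context)  # 510000
print(after_clear)         # 110000
```

Under these assumptions the leftovers from task A cost 400,000 extra tokens across ten messages, because the 40,000 stale tokens are paid for on every single turn.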

3. Compact with instructions, not on autopilot

Don't wait for the conversation to get giant and auto-compact to fire. Compact early, and tell it what to keep:

/compact Keep decisions, errors, changed files and next steps.

Default compaction summarizes blind, and it can flatten the one error message or decision you needed. Passing instructions keeps the load-bearing context sharp while the noise gets compressed. Run it at a natural checkpoint, not when the window is already drowning.

4. Turn off the token leaks

Two settings quietly drain usage in the background:

  • Effort level. Drop to low or medium effort for routine work. High effort on a rename is pure waste; save the deep-reasoning budget for the problems that need it.
  • Web search and MCP servers you're not using. Every connected tool ships its definitions into context, and the model can decide to call them mid-task. If a server isn't part of today's work, disable it. Same for web search when the task is purely local.
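A rough way to size the tool-definition leak, as a sketch only: the server names and per-server token counts below are invented for illustration, and the arithmetic assumes the definitions ride along with every message at full cost. Your client may cache or trim them, so treat the result as an upper-bound intuition rather than a bill.

```python
# Hypothetical MCP servers and the size of their tool definitions in tokens.
# These numbers are made up; check your own setup for real ones.
TOOL_DEFINITION_TOKENS = {
    "github": 3_000,
    "database": 1_500,
    "browser": 2_500,
}


def definition_overhead(enabled_servers, messages):
    """Tokens spent on tool definitions alone over a session,
    assuming they are included with every message."""
    per_message = sum(TOOL_DEFINITION_TOKENS[name] for name in enabled_servers)
    return per_message * messages


everything_on = definition_overhead(["github", "database", "browser"], messages=30)
only_what_i_need = definition_overhead(["github"], messages=30)

print(everything_on)     # 210000
print(only_what_i_need)  # 90000
```

With these assumed sizes, leaving two unused servers connected costs 120,000 tokens over a 30-message session before the model has done any work at all.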

5. One-shot the hard part

When you do call Fable, make the call count. One complete brief beats ten vague turns, because every extra round-trip re-processes the whole conversation at Fable prices. Give it:

  • The outcome you want, stated plainly
  • The constraints: what it can't touch, what has to stay true
  • The definition of done: how you'll both know it worked

Then let it plan the path. Fewer turns. Better results. Less waste.
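One way to make the three-part brief a habit is to template it. The sketch below is a hypothetical helper, not part of Claude Code: the `Brief` class, its field names, and the example task are all assumptions made for illustration. It just renders the outcome, constraints, and definition of done into one message you can paste in.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """A one-shot brief: outcome + constraints + definition of done."""
    outcome: str
    constraints: list[str] = field(default_factory=list)
    definition_of_done: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Refuse to render a brief with no stated outcome: that is the
        # vague turn this section is trying to avoid.
        if not self.outcome.strip():
            raise ValueError("state the outcome before sending the brief")
        lines = ["Outcome: " + self.outcome, "", "Constraints:"]
        lines += ["- " + item for item in self.constraints]
        lines += ["", "Definition of done:"]
        lines += ["- " + item for item in self.definition_of_done]
        return "\n".join(lines)


# Example content is invented for illustration.
brief = Brief(
    outcome="Find the root cause of the intermittent checkout timeout.",
    constraints=[
        "Do not change the public API.",
        "Existing tests must keep passing.",
    ],
    definition_of_done=[
        "A failing test that reproduces the timeout.",
        "A fix that makes that test pass.",
    ],
)
print(brief.render())
```

If you can't fill in all three fields, that is usually a sign the task still belongs with Sonnet for more exploration.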

The 1-page checklist

FABLE 5 TOKEN ROUTING - nicnonac.com

MODEL ROUTER (80/20)
[ ] Sonnet: exploration, scaffolding, routine edits
[ ] Fable: architecture, root causes, final verification
[ ] Switch with /model. Never default to Fable.

CONTEXT HYGIENE
[ ] /clear at the start of every new task
[ ] /compact Keep decisions, errors, changed files and next steps.
[ ] Compact at checkpoints, before the window bloats

KILL THE LEAKS
[ ] Low/medium effort for routine work
[ ] Disable web search when the task is local
[ ] Disable MCP servers you're not using today

FABLE CALLS
[ ] One-shot brief: outcome + constraints + definition of done
[ ] Let it plan the path. Don't drip-feed.

RULE: use the expensive model at the expensive moment.

Honest limitations

The router is a default, not a law. Some exploration is genuinely hard (untangling a legacy codebase, say) and deserves Fable from the first message. If Sonnet has circled the same problem twice, stop paying for a third lap and escalate.

And /clear is destructive. If the old conversation holds decisions you need, compact first or note them down. Cleared context is gone.
