Hacks·4 min read·Apr 28, 2026

5 Ways to Save Tokens in Claude Code

Stop bleeding context. Five low-friction habits that compound across every coding session, every project, every day.

Most Claude Code users blow through their context budget without realizing where it's going. Each tip below is a single habit. Apply all five and you'll get 3-5x more useful conversation out of the same budget.

1. Use .md, not PDF or screenshots

The pain: every PDF you upload burns about 3,000 tokens just to parse. Every screenshot burns about 1,200. That's context you'll never get back, and it stacks across every message.

The fix: open the PDF in Google Docs, then File, Download, Markdown. Drop the .md instead. For screenshots of code, paste the code as plain text.

The stat: 5 PDFs in one session is 15,000 tokens before message one. Same content as .md files is often under 5,000 combined.
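
To sanity-check numbers like these on your own files, here is a rough sketch. The per-attachment costs are the figures quoted in this article, not measurements, and the 4-characters-per-token ratio is only a common rule of thumb (an assumption; real tokenizers vary). The function names are made up for illustration.

```python
# Figures quoted in this article, not measured values.
PDF_PARSE_TOKENS = 3_000
SCREENSHOT_TOKENS = 1_200

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_text_tokens(text: str) -> int:
    """Very rough token estimate for plain text or Markdown."""
    return len(text) // CHARS_PER_TOKEN


def attachment_overhead(pdfs: int, screenshots: int) -> int:
    """Overhead implied by the article's per-attachment figures."""
    return pdfs * PDF_PARSE_TOKENS + screenshots * SCREENSHOT_TOKENS
```

Calling attachment_overhead(pdfs=5, screenshots=0) reproduces the 15,000 figure, and estimate_text_tokens on the contents of a .md file gives a ballpark for the Markdown alternative.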

2. Add a .claudeignore to every repo

The pain: Claude Code scans your file tree on session start. Without an ignore file, it pulls in node_modules/, .next/, dist/, vendored libraries. Thousands of tokens of pure noise.

The fix: drop a .claudeignore file at your repo root. Same syntax as .gitignore. Tell it to skip build artifacts, dependencies, and anything auto-generated.

node_modules/
.next/
dist/
build/
*.log
.env
vendor/

The stat: a typical Next.js repo without .claudeignore burns 10-20k tokens on first scan. With one in place, often under 2k.
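
If you're not sure which directories deserve a line in the ignore file, a quick size survey helps. The sketch below totals an approximate token weight for each top-level directory from file sizes alone. It assumes about 4 bytes per token, which is only a rough heuristic, and weight_by_top_level is a hypothetical helper name, not part of any tool.

```python
import os
from collections import Counter

# Assumption: roughly 4 bytes per token; treat results as ballpark only.
BYTES_PER_TOKEN = 4


def weight_by_top_level(root: str) -> Counter:
    """Approximate token weight of each top-level entry under root."""
    totals: Counter = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files directly in the root are grouped under ".".
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # broken symlink or unreadable file
            totals[top] += size // BYTES_PER_TOKEN
    return totals
```

Printing weight_by_top_level(".").most_common(10) lists the heaviest directories first; anything generated or vendored near the top is a candidate for the ignore file.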

3. Use /compact before context runs hot

The pain: most people blow through 200k of context because they never compact. The agent starts forgetting what you discussed an hour ago.

The fix: when you hit 60-70% context usage (Claude Code shows the bar at the bottom of the terminal), run /compact. It summarizes the conversation in-place, freeing budget without losing the through-line.

The stat: a 4-hour coding session without /compact often loses 30-40% of relevant context to history. Regular /compact keeps you at 90%+ usable budget.
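
The threshold rule is simple enough to write down. This is a minimal sketch of the decision only: the 200k window and the 60% trigger come from the text above, while should_compact is a hypothetical helper and not something Claude Code exposes.

```python
def should_compact(
    used_tokens: int,
    window_tokens: int = 200_000,  # window size mentioned in the text
    threshold_pct: int = 60,       # lower end of the 60-70% range
) -> bool:
    """Return True once usage reaches the chosen share of the window."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    # Integer comparison avoids floating-point edge cases at the boundary.
    return used_tokens * 100 >= window_tokens * threshold_pct
```

With the defaults, the trigger point is 120,000 tokens used.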

4. Tell the agent to grep before reading

The pain: you ask Claude to "find where we handle auth" and it reads 15 files front-to-back looking for it. Each file read burns hundreds to thousands of tokens. Most of those files were irrelevant.

The fix: tell it to search first, read second. "Grep for authenticateUser across the repo, then read only the files that match." One grep costs almost nothing. A targeted file read after that costs a fraction of the blind scan.

The stat: a blind "find and read" pass across a medium repo can burn 30-50k tokens. A grep-then-read pass on the same task typically lands under 5k. That's a 6-10x reduction on every investigation.
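
Here is what "search first" amounts to mechanically, as a sketch rather than a description of Claude Code's own search tool. It walks the repo, skips the usual noise directories, and returns only the files and line numbers that match. The grep_repo name, the skip list, and the file suffixes are assumptions for illustration.

```python
import os
import re

# Assumption: directories that are never worth searching.
SKIP_DIRS = {"node_modules", ".next", "dist", "build", "vendor", ".git"}


def grep_repo(root, pattern, suffixes=(".py", ".ts", ".tsx", ".js")):
    """Return {relative_path: [line numbers]} for files matching pattern."""
    regex = re.compile(pattern)
    hits = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    lines = [n for n, line in enumerate(fh, start=1)
                             if regex.search(line)]
            except OSError:
                continue  # unreadable file
            if lines:
                rel = os.path.relpath(path, root).replace(os.sep, "/")
                hits[rel] = lines
    return hits
```

The result is a short map of paths to line numbers, which is all the agent needs in order to decide which two or three files are worth opening in full.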

5. Spawn a subagent for investigation

The pain: you ask Claude to investigate something ("why is this test failing?" or "what's the data flow for this feature?") and it reads file after file in your main conversation. Every file it reads stays in your context window permanently. By the time it finds the answer, you've burned 40-60k tokens of context on files you'll never reference again.

The fix: tell Claude to spawn a subagent. "Use a subagent to investigate why testCheckout is failing. Report back with the root cause and the file + line number." The subagent runs in its own context window. It can read 50 files, grep the whole repo, trace the call chain. When it's done, only the summary comes back to your main conversation. All that investigation context gets thrown away instead of clogging your window.

The stat: a typical investigation reads 10-20 files at 2-5k tokens each. That's 20-100k tokens burned in your main context. With a subagent, you get the answer in 200-500 tokens. Even pairing the smallest investigation (20k) with the longest summary (500 tokens), that's a 40x reduction in context cost, and larger investigations save far more.
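
The figures above are ranges, so the size of the saving depends on which ends you pair up. A small sketch for working out the bounds with your own numbers follows; reduction_range is a hypothetical helper, and the inputs in the note below are simply the ranges quoted in this section.

```python
def reduction_range(before, after):
    """(min, max) reduction factor when `before` tokens shrink to `after`.

    Both arguments are (low, high) token ranges.
    """
    lo_before, hi_before = before
    lo_after, hi_after = after
    if lo_after <= 0 or hi_after <= 0:
        raise ValueError("token counts must be positive")
    # Worst case: smallest investigation, longest summary.
    # Best case: largest investigation, shortest summary.
    return lo_before / hi_after, hi_before / lo_after
```

For example, reduction_range((10 * 2_000, 20 * 5_000), (200, 500)) takes 10-20 files at 2-5k tokens each against a 200-500 token summary.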
