How to Manage Context in Claude Code

Context is the single biggest reason Claude Code sessions go sideways. Not the model, not the prompt, not the tools. The conversation window fills up, the agent silently degrades, and you spend the next hour arguing with something that has quietly forgotten why you made half the decisions you made.

/compact slash command in the Claude Code picker

This is the playbook. What to do, what to stop doing, and why /compactisn't the answer you think it is.

The problem nobody talks about

Context isn't a meter at the bottom of your terminal. It's the agent's working memory. The model attends across every token in the window when picking its next move. As the window fills, three things happen, and none of them announce themselves:

Attention dilutes. The signal-to-noise of the prompt drops. Earlier reasoning gets statistically drowned out by later chatter.
Tool results pile up. Every file read, every grep, every test run leaves a transcript. By message 40 you're carrying 30k tokens of stale output the agent can't un-see.
The agent stops admitting it. Claude doesn't crash when context degrades. It doesn't warn you. It just gets worse at the same task it was nailing an hour ago, and you start wondering if the model got nerfed.

Most of this guide is about not getting into that state in the first place. The rest is about what to do once you're already there.

What /compact actually does

Before we trash it, here's the honest mechanism. /compact runs a summarisation pass over the older turns of your conversation, replaces them with a condensed summary block, and keeps the recent messages verbatim. The freed tokens are real. You can keep going past where the window would have filled.

Claude Code explaining what the /compact command does

So /compactis doing exactly what it advertises. The argument here isn't that the command is broken. It's that what it advertises, a free reset button, isn't what you want for continuing a real project.

/compact is a trap for project work

The summary is lossy in exactly the places that matter for project continuity:

The why behind decisions gets compressed out. The summary keeps "we picked option B." It usually drops the hour you spent debugging why option A failed. Two messages later the agent suggests option A again.
File paths and module structure collapse into prose. "Explored the codebase" replaces the actual map the agent had built. Next message it greps the same files again from scratch.
Tool call results disappear. The exact error text, the failing test output, the line numbers all become a vague reference in the summary. If you need to refer back, it's gone.
Compacting a compacted session compounds the drift. Summary-of-summary is where projects quietly derail. Each pass loses more of the original reasoning.
Dead-ends come back as fresh suggestions. The counter-evidence the agent already considered gets pruned. It rediscovers and re-proposes ideas you've already ruled out.

The summary preserves the surface of the conversation. The reasoning is what gets thrown away. And reasoning is the part you came back to the session for.

The fix: handover document, then /clear

The replacement for /compact is a 3-step flow. The current agent (which still has all the context) writes its own handover. You clear. You paste. The next agent starts at 0% context with a document tailored to exactly what you were doing.

Step 1: ask the current agent to write a handover

Before the session degrades further, paste this prompt into your existing session. The agent uses everything still in its window to produce a handover doc and a copy-paste prompt for the next agent.

Handover prompt

Create a handover document for another Claude Code agent. Include: all relevant docs and links, current goals, what we've achieved during this context window, decisions made and why (not just the pick), files touched with a one-line purpose each, and any active blockers.

Then write a copy-paste prompt I can give to the next agent so they start fully oriented. The prompt should reference the handover doc and tell the next agent exactly what to do first.

If you're unsure about anything, ask me questions first. Don't make assumptions.

The agent will usually ask 1-3 clarifying questions before writing. Answer them. The resulting handover doc is dense and specific because the agent generating it still has the full context to draw on.

Step 2: /clear the session

Once the handover doc is saved in the repo (or copied to the clipboard), run /clear. This starts a new session with empty context. The previous session stays on disk and is resumable with /resume if you ever need to refer back.

/clear slash command in the Claude Code picker

This is the move /compact should have been. A real reset, not a summary that drifts.

Step 3: paste the prompt the agent gave you

Paste in the copy-paste prompt the previous session wrote for you. The new agent reads the handover doc, re-orients in one or two messages, and you're back at full speed. Starting context: ~2% instead of 70%.

Why this beats /compact: the handover is written by the agent that still has the reasoning intact, structured for a fresh agent to consume, and lives in a file you can re-read later. /compactwrites a generic summary, leaves it inline in a context window that's still half-full of stale tool output, and gives you no artefact to revisit.

Side-by-side in practice: a post-/compact session typically burns 3-5 messages re-asking what files matter and second-guessing decisions the summary mangled. A handover + /clear lands the next step in 1-2 messages.

Spawn subagents for investigation

The other half of the context fix is preventing it from filling up in the first place. The biggest accidental context burner is investigation work, the kind of task that reads 10+ files to answer one question.

When you ask Claude to investigate something in your main conversation ("why is this test failing?", "trace the data flow for this feature"), every file it reads stays in your context window permanently. By the time the answer comes back you've burned 40-60k tokens of context on files you'll never reference again.

The fix: push it into a subagent.

Use a subagent to investigate why testCheckout is failing. Report back with the root cause, the file and line number, and a 2-line proposed fix.

The subagent runs in its own context window. It can grep the repo, read 50 files, trace the call chain, run the test. When it's done, only the summary report comes back to your main conversation. Everything else gets thrown away with the subagent's context.

Rule of thumb:if a task reads more than 5 files and the file contents aren't the deliverable, it belongs in a subagent. A typical investigation is 10-20 file reads at 2-5k tokens each. That's 20-100k tokens of bloat avoided per task.

Grep before read, always

The cheapest line of investigation discipline. When you (or the agent) need to find something in the repo, grep first, read second.

A blind "find where we handle auth" that opens 15 files looking for it burns thousands of tokens per file. The same task as "grep for authenticateUseracross the repo, then read only the files that match" lands in a fraction of that. One grep call costs almost nothing. A targeted read after the grep costs a fraction of the blind scan.

Add this as a standing instruction in your prompt or in CLAUDE.md:

Before reading multiple files to find a symbol or pattern, run grep across the repo first. Only read the files that match.

Typical reduction: 6-10x on every investigation task.

Plan files are your real memory

The conversation is not where state should live. The conversation is volatile. Files are not.

For any project longer than one session, keep a PLAN.md or NOTES.md in the repo. Update it as you go. The agent writes to it when decisions are made, you write to it when context shifts. When you start a fresh session, the first thing you do is read it.

Why this beats relying on the conversation:

Survives /compact. Survives crashes. Survives starting a fresh session.
Stays diffable. You can git log the plan and see how thinking evolved.
The agent can re-read it on demand. It's the cheapest form of long-term memory available.
Forces you to externalise the through-line, which is the part summaries always lose.

Treat the conversation as scratchpad. Treat the plan file as memory.

.claudeignore and what enters context

The simplest context win: stop letting noise enter the window in the first place.

Claude Code scans your file tree on session start. Without an ignore file, it pulls in node_modules/, .next/, dist/, build artefacts, vendored libraries. Thousands of tokens of pure noise before you've typed a single message.

Drop a .claudeignore at your repo root. Same syntax as .gitignore:

node_modules/
.next/
dist/
build/
*.log
.env
vendor/
*.min.js
coverage/

A typical Next.js repo without .claudeignoreburns 10-20k tokens on first scan. With one in place, often under 2k. That's 10-20k tokens of headroom you keep for the work that matters.

When /compact is actually fine

Not every session is a project. /compacthas a real fit, it's just not the one most people use it for.

Exploratory chats. You're feeling out an idea, asking questions, learning a library. There's no through-line to preserve because you haven't built one yet. Compact freely.
Short throwaway tasks. Quick one-shot scripts, single-file refactors, tasks where the deliverable is the diff and the conversation is disposable.
Mid-task when you're definitely continuing in this same session. If you're 80% through something and just need 30% more budget to finish, compact, finish, close the session. Don't compact, finish, then compact again, then continue tomorrow.

The rule: compact is fine when the conversation itself is disposable. The moment the conversation becomes the project's reasoning record, treat it like one and hand it off properly.

Quick reference

The context-discipline checklist, in order of impact:

Plan file in the repo. PLAN.md or NOTES.md. State lives here, not in the conversation.
Subagents for investigation. Any task that reads 5+ files and isn't about the file contents.
Grep before read. Always. Standing instruction in CLAUDE.md.
.claudeignore on every repo. Kill noise before it enters the window.
Skip /compact for project work. Have the agent write a handover doc, then /clear and paste the prompt it generated.
Compact freely when the conversation is disposable. Exploration, one-shots, short tasks.

Apply these and the question of "is my session degrading?" mostly stops coming up. You won't get there because your front door is closed, your investigations don't pollute the room, and your memory lives in a file the agent can re-read on demand.