The Token Budget — What Costs What
Your context window is a finite resource that starts draining before you type a word. Understanding what loads at startup, what accumulates per turn, and how the 1M window changes the math is the foundation of every context engineering decision.
Quick Reference
- System prompt consumes ~4,200 tokens before anything else
- Each MCP server injects all its tool schemas upfront, so disable irrelevant ones
- CLAUDE.md files load at full size on every session start
- Reading a 2,000-line file costs ~15,000–20,000 tokens
- 1M context is available on Opus 4.7, Opus 4.6, and Sonnet 4.6, at standard pricing beyond 200K
- Max/Team/Enterprise: Opus 1M is automatic. Pro: requires opt-in
- Context quality matters more at 1M: the extra room dilutes signal rather than preserving it
- Use CLAUDE_CODE_DISABLE_1M_CONTEXT=1 for predictable budgets
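To make the file-read figure concrete, here is a rough sketch of the arithmetic. It assumes the 2,000-line estimate above scales linearly (about 7.5 to 10 tokens per line), which is only an approximation since real token counts depend on line length and content. The function names are illustrative, not part of any tool.

```python
# Per-line token range implied by "2,000 lines costs ~15,000-20,000 tokens".
# Assumption: cost scales linearly with line count.
TOKENS_PER_LINE_LOW = 15_000 / 2_000   # 7.5
TOKENS_PER_LINE_HIGH = 20_000 / 2_000  # 10.0


def read_cost(lines: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for reading a file of `lines` lines."""
    return round(lines * TOKENS_PER_LINE_LOW), round(lines * TOKENS_PER_LINE_HIGH)


def window_share(tokens: int, window: int) -> float:
    """Fraction of a context window that `tokens` would occupy."""
    return tokens / window


low, high = read_cost(2_000)
for window in (200_000, 1_000_000):
    print(
        f"{window:>9,}-token window: "
        f"{window_share(low, window):.1%} to {window_share(high, window):.1%}"
    )
```

Under these assumptions, a single 2,000-line read takes roughly 7.5% to 10% of a 200K window but only 1.5% to 2% of a 1M window, which is why the larger window changes the budgeting math.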
The Startup Cost: Before You Type a Word
Every Claude Code session opens with a context window that is already partially filled. These fixed-cost components load automatically as part of the session initialization sequence — you pay for them regardless of what your task is.
| Component | Approx. Token Cost | Notes |
|---|---|---|
| System prompt | ~4,200 tokens | Fixed — cannot be reduced or skipped |
| Environment info | ~280 tokens | Working dir, OS, git status, current date |
| CLAUDE.md (project root) | Full file size | Re-loaded from disk each session |
| CLAUDE.md (subdirectories) | Full file size each | Loaded when Claude enters that directory |
| MCP tool schemas | Variable — can be thousands | Every enabled server injects all its tool definitions |
| Auto memory (MEMORY.md index) | Proportional to index size | Individual memory files load when referenced |
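As a rough illustration, the table can be turned into a back-of-the-envelope baseline calculation. The two fixed figures come from the table; the CLAUDE.md, MCP, and memory values in the example are hypothetical placeholders that you would replace with your own measurements.

```python
from dataclasses import dataclass

SYSTEM_PROMPT_TOKENS = 4_200  # approximate, from the table
ENVIRONMENT_TOKENS = 280      # approximate, from the table


@dataclass
class Baseline:
    """Variable startup costs; every default here is a placeholder."""
    claude_md_tokens: int = 0     # sum of all CLAUDE.md files loaded
    mcp_schema_tokens: int = 0    # sum across all enabled MCP servers
    memory_index_tokens: int = 0  # MEMORY.md index

    def total(self) -> int:
        """Tokens consumed before the first user message."""
        return (
            SYSTEM_PROMPT_TOKENS
            + ENVIRONMENT_TOKENS
            + self.claude_md_tokens
            + self.mcp_schema_tokens
            + self.memory_index_tokens
        )

    def remaining(self, window: int) -> int:
        """Tokens left in a window of the given size."""
        return window - self.total()


# Hypothetical project: values below are made up for illustration.
baseline = Baseline(claude_md_tokens=1_500, mcp_schema_tokens=6_000, memory_index_tokens=300)
print(f"startup cost: {baseline.total():,} tokens")              # 12,280
print(f"left in 200K: {baseline.remaining(200_000):,} tokens")   # 187,720
```

The point of the sketch is that the two fixed rows are small; the variable rows are where a project's baseline is won or lost.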
Each enabled MCP server loads its entire tool manifest at session start. A server with 20 tools, each with a name, description, and parameter spec, can easily consume 5,000–10,000 tokens. Three such servers add 15,000–30,000 tokens, likely more than your entire CLAUDE.md. Use ENABLE_TOOL_SEARCH=1 to enable lazy loading, so that tools are fetched only when Claude needs them.
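The MCP figures can be sketched the same way. This assumes the 20-tool example generalizes to roughly 250 to 500 tokens per tool, which is an extrapolation: actual schema sizes vary widely with description length and parameter complexity. The function name is illustrative.

```python
# Per-tool range implied by "20 tools can consume 5,000-10,000 tokens".
# Assumption: every tool costs about the same, which real servers will not match exactly.
TOKENS_PER_TOOL_LOW = 5_000 / 20    # 250
TOKENS_PER_TOOL_HIGH = 10_000 / 20  # 500


def mcp_schema_cost(tools_per_server: list[int]) -> tuple[int, int]:
    """Return a (low, high) token estimate for all enabled servers combined."""
    total_tools = sum(tools_per_server)
    return (
        round(total_tools * TOKENS_PER_TOOL_LOW),
        round(total_tools * TOKENS_PER_TOOL_HIGH),
    )


# Three servers with 20 tools each, as in the paragraph above.
low, high = mcp_schema_cost([20, 20, 20])
print(f"estimated schema cost: {low:,} to {high:,} tokens")
```

Treat the result as a prompt to go measure, not as a measurement; the per-component breakdown in a live session is the authoritative number.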
Open a fresh session and immediately run /context. The breakdown shows what is loaded and how much context each component is consuming. Do this once per project to understand your baseline before any work begins.