Claude Code Cost Control — How to Avoid Surprise Bills
Claude Code charges per token — every message, file read, and command output counts. Without awareness of what drives costs, a single afternoon can generate a surprisingly large bill. This guide explains exactly how pricing works, what makes costs spike, and the concrete strategies to keep your usage efficient.
Quick Reference
- Use /cost at any time to see your current session and monthly usage
- Every message includes your CLAUDE.md, so keep it concise to reduce per-message cost
- Use /compact to compress conversation context when it grows too large
- Start new conversations for new tasks: in a long conversation, total cost grows much faster than the number of messages
- Reference specific files instead of asking Claude to explore the whole project
- Use /fast mode for simple tasks that do not need extended thinking
- Set spending limits in your Anthropic account dashboard
- Large file reads are the biggest hidden cost driver
How Claude Code Pricing Works
Claude Code uses the Anthropic API under the hood, which charges based on tokens — the basic units of text that the model processes. Every interaction has two token counts: input tokens (what you send to the model) and output tokens (what the model generates back). Output tokens are more expensive than input tokens.
| Token Type | What It Includes | Relative Cost |
|---|---|---|
| Input tokens | Your prompt, CLAUDE.md, conversation history, file contents, command outputs | Base rate |
| Output tokens | Claude's response text, code it generates, commands it writes | ~5x input rate |
| Extended thinking | Internal reasoning tokens when Claude thinks through complex problems | Same as output rate |
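The table above can be turned into a rough per-message estimate. The sketch below is illustrative only: the `Rates` class, the `message_cost` function, and the 3.00 USD per million input tokens figure are assumptions made for the example, not published prices. Check the current pricing page for the model you actually use.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    # USD per million input tokens. Placeholder value chosen by the caller.
    input_per_mtok: float
    # Output rate relative to input, using the ~5x ratio from the table.
    output_multiplier: float = 5.0


def message_cost(input_tokens: int, output_tokens: int,
                 thinking_tokens: int, rates: Rates) -> float:
    """Estimate the cost in USD of a single message."""
    output_rate = rates.input_per_mtok * rates.output_multiplier
    # Extended thinking tokens are billed at the output rate.
    billed_output = output_tokens + thinking_tokens
    total = input_tokens * rates.input_per_mtok + billed_output * output_rate
    return total / 1_000_000


# Hypothetical message: 50K input, 1K output, 2K thinking tokens.
example_rates = Rates(input_per_mtok=3.0)
print(f"{message_cost(50_000, 1_000, 2_000, example_rates):.3f}")  # 0.195
```

Note that although output tokens cost more per token, the 50K of input dominates the total in this example, which is why context size matters so much.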
Claude Code does not treat each message as an independent API call; it maintains conversation context. Each new message resends the entire conversation history as input tokens. Message 1 might cost 1K input tokens, but message 20 of the same conversation might cost 50K input tokens because it includes everything before it.
This is why long conversations are disproportionately more expensive than short ones. Each additional message carries the weight of every previous message, so the total input billed over a conversation grows roughly with the square of the number of messages rather than linearly. A 50-message conversation does not cost 50 times the first message; it costs much more because of the accumulated context.
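A simplified model makes the effect of accumulated context concrete. It assumes every turn adds the same number of tokens to the history (a hypothetical 1,000 here) and ignores caching, compaction, and file reads, so real sessions will differ. The function name `cumulative_input_tokens` is made up for this sketch.

```python
def cumulative_input_tokens(num_messages: int, tokens_added_per_turn: int) -> int:
    """Total input tokens billed across a conversation in which
    every message resends the full history so far."""
    total = 0
    context = 0
    for _ in range(num_messages):
        context += tokens_added_per_turn  # history grows by one turn
        total += context                  # the whole history is sent as input
    return total


one_long = cumulative_input_tokens(50, 1_000)
two_short = 2 * cumulative_input_tokens(25, 1_000)

print(one_long)   # 1275000
print(two_short)  # 650000
```

Under these assumptions, one 50-message conversation sends about 25 times the input of 50 independent first messages (1,275,000 versus 50,000 tokens), and splitting the same work into two 25-message conversations roughly halves the total.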