Cost Management at Team Scale
Claude Code costs compound across a team. npx ccusage, right-sizing models and effort levels, disabling unused MCP servers, and the --max-budget-usd flag are the levers you have.
Quick Reference
- →npx ccusage: daily/monthly/session usage dashboard — reads local JSONL logs, no data leaves your machine
- →Low cache hit rate = expensive sessions — repeated context (CLAUDE.md, system prompt) should cache
- →Disable unused MCP servers: each active server adds tool definitions to every request
- →--max-budget-usd: cap spending for a non-interactive session
- →Right-sizing: Haiku for subagents that don't need Opus reasoning
- →Three cost drivers: model choice, effort level, context size
npx ccusage — Your Usage Dashboard
ccusage is a third-party CLI that reads Claude Code's local JSONL session logs and produces a usage dashboard. It runs entirely on your machine — no data is sent anywhere.
The output shows per-model cost breakdown, total tokens, and cache hit rate. The cache hit rate metric is one of the most actionable numbers in the report.
Claude Code caches repeated context — CLAUDE.md files, system prompts, large file reads that recur across messages. A high cache hit rate means that context costs much less on subsequent requests. A low cache hit rate means your session context is not stable — you are paying full price for the same content repeatedly.