ultrathink — Allocating Extended Reasoning Budget
Extended thinking gives Claude a scratchpad to reason before responding. The thinking budget ladder goes from 'think' to 'ultrathink' — and knowing where on that ladder your problem lives is the difference between a great answer and an expensive one.
Quick Reference
- think: ~4,000 thinking tokens (initial reasoning level)
- think hard / megathink: ~10,000 tokens (deeper consideration)
- think harder / ultrathink: ~31,999 tokens (maximum reasoning budget)
- Invoke by writing the keyword anywhere in your prompt
- Thinking tokens are not shown in the response but are billed as output tokens
- Opus 4.7 uses adaptive thinking: the model determines depth automatically based on complexity
- Use ultrathink for architecture decisions, complex debugging, security analysis, novel algorithms
- Never for routine tasks: ultrathink on a simple refactor is expensive waste
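The keyword ladder above can be pictured as a simple lookup. The sketch below is only an illustration of that idea, not how the tool actually parses prompts: the names THINKING_BUDGETS and budget_for_prompt are hypothetical, and the token figures are the approximate ones from the list.

```python
import re

# Approximate budgets taken from the quick reference list.
# Hypothetical mapping for illustration; the real keyword handling may differ.
THINKING_BUDGETS = {
    "ultrathink": 31_999,
    "think harder": 31_999,
    "megathink": 10_000,
    "think hard": 10_000,
    "think": 4_000,
}


def budget_for_prompt(prompt: str) -> int:
    """Return the largest budget whose keyword appears in the prompt, else 0."""
    text = prompt.lower()
    best = 0
    for keyword, budget in THINKING_BUDGETS.items():
        # Word boundaries keep "think" from matching inside "rethinking".
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            best = max(best, budget)
    return best


if __name__ == "__main__":
    for p in ("ultrathink about the schema migration", "rename this variable"):
        print(p, "->", budget_for_prompt(p))
```

Taking the maximum means a prompt that mentions several keywords gets the highest matching level, which is one reasonable reading of "anywhere in your prompt".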
What the Thinking Budget Is
Extended thinking gives Claude a private scratchpad — a token budget it spends on internal reasoning before producing a response. This thinking is not shown in the chat, but it fundamentally changes what Claude can do on hard problems: it can explore approaches, check its own logic, identify contradictions, and backtrack before committing to an answer.
Without extended thinking, Claude starts writing its answer immediately, with no separate reasoning phase. With extended thinking, it reasons iteratively first, the equivalent of working through a problem on paper before presenting the solution. On complex problems, this can produce materially better answers.
Extended thinking tokens are not free: they are billed at output token rates, which are higher than input token rates, and significantly so on Opus 4.7. Ultrathink at roughly 32K thinking tokens (the ~31,999 budget) is expensive. Use it for problems where the extra reasoning is likely to change the outcome.
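To make the cost argument concrete, here is a minimal arithmetic sketch. The price used is a placeholder, not a real rate: ASSUMED_OUTPUT_PRICE_PER_MTOK and thinking_cost_usd are hypothetical names, and you should substitute the current output price for your model. The sketch also assumes the full budget is spent, which is an upper bound rather than a typical case.

```python
# Placeholder price in USD per million output tokens. This is an assumption
# for illustration only; check current pricing before relying on it.
ASSUMED_OUTPUT_PRICE_PER_MTOK = 20.0


def thinking_cost_usd(thinking_tokens: int,
                      price_per_mtok: float = ASSUMED_OUTPUT_PRICE_PER_MTOK) -> float:
    """Upper-bound cost of the thinking portion, billed at the output rate."""
    return thinking_tokens / 1_000_000 * price_per_mtok


if __name__ == "__main__":
    think = thinking_cost_usd(4_000)
    ultra = thinking_cost_usd(31_999)
    print(f"think:      ${think:.4f} per request (at the assumed price)")
    print(f"ultrathink: ${ultra:.4f} per request (at the assumed price)")
    print(f"ratio:      {ultra / think:.1f}x")
```

Whatever the actual price, the ratio is fixed by the budgets: a fully spent ultrathink budget costs roughly eight times a fully spent think budget, before counting the visible response.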