ultrathink — Allocating Extended Reasoning Budget

Extended thinking gives Claude a scratchpad to reason before responding. The thinking budget ladder goes from 'think' to 'ultrathink' — and knowing where on that ladder your problem lives is the difference between a great answer and an expensive one.

Quick Reference

→ think: ~4,000 thinking tokens (initial reasoning level)
→ think hard / megathink: ~10,000 tokens (deeper consideration)
→ think harder / ultrathink: ~31,999 tokens (maximum reasoning budget)
→ Invoke by writing the keyword anywhere in your prompt
→ Thinking tokens are not shown in the response but are billed as output tokens
→ Opus 4.7 uses adaptive thinking: the model auto-determines depth based on complexity
→ Use ultrathink for architecture decisions, complex debugging, security analysis, novel algorithms
→ Never for routine tasks: ultrathink on a simple refactor is expensive waste

What the Thinking Budget Is

Extended thinking gives Claude a private scratchpad — a token budget it spends on internal reasoning before producing a response. This thinking is not shown in the chat, but it fundamentally changes what Claude can do on hard problems: it can explore approaches, check its own logic, identify contradictions, and backtrack before committing to an answer.

Without extended thinking, Claude starts writing its answer immediately, with no separate reasoning phase. With extended thinking, it reasons iteratively first, the equivalent of working through a problem on paper before presenting the solution. On complex problems, this tends to produce materially better answers.

Thinking Tokens Are Billed as Output Tokens

Extended thinking tokens are not free: they are billed at output token rates, which are higher than input token rates. An ultrathink request that uses its full budget of roughly 32K thinking tokens is therefore expensive. Use it only for problems where the extra reasoning is likely to change the outcome.
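To make the billing point concrete, here is a minimal sketch of the arithmetic. The price constant is a hypothetical placeholder, not a published rate, and the function name is ours; substitute the current output-token price for the model you use. The result is an upper bound, since a request may use less than its full budget.

```python
def thinking_cost_usd(thinking_tokens: int, output_price_per_mtok: float) -> float:
    """Cost of thinking tokens when billed at the output-token rate."""
    return thinking_tokens / 1_000_000 * output_price_per_mtok


# HYPOTHETICAL price for illustration only; check current pricing.
ASSUMED_OUTPUT_PRICE = 50.0  # USD per million output tokens

# Budgets are the approximate figures from the quick reference.
for label, tokens in [("think", 4_000), ("think hard", 10_000), ("ultrathink", 31_999)]:
    cost = thinking_cost_usd(tokens, ASSUMED_OUTPUT_PRICE)
    print(f"{label:>10}: up to ${cost:.2f} in thinking tokens")
```

At the assumed rate, a full ultrathink budget costs about eight times as much as a full think budget, before counting the visible response itself.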

The Reasoning Budget Ladder

The thinking budget escalates through a keyword ladder. Each level allocates more reasoning tokens, which means more exploration but also more cost and latency.
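The ladder can be pictured as a lookup from keyword to budget. The sketch below is only an illustration of that idea: the token figures are the approximate ones from the quick reference, and the matching rule (case-insensitive, whole words, highest matching level wins) is an assumption, not a description of how Claude actually parses prompts.

```python
import re

# Approximate budgets from the quick reference; illustrative, not guaranteed.
THINKING_BUDGETS = {
    "think": 4_000,
    "think hard": 10_000,
    "megathink": 10_000,
    "think harder": 31_999,
    "ultrathink": 31_999,
}


def resolve_budget(prompt: str) -> int:
    """Return the largest budget whose keyword appears in the prompt, or 0."""
    text = prompt.lower()
    best = 0
    for keyword, budget in THINKING_BUDGETS.items():
        # Word boundaries keep "think" from matching inside "rethinking".
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            best = max(best, budget)
    return best


print(resolve_budget("ultrathink about the schema migration"))
print(resolve_budget("Rename this variable"))
```

Taking the maximum means a prompt containing both "think" and "think harder" resolves to the higher level, which matches the intuition that the strongest keyword should win.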

Adaptive Thinking in Opus 4.7

Opus 4.7 introduced adaptive thinking — the model auto-determines reasoning depth based on the complexity of the question. Instead of allocating a fixed budget based on the keyword, the model evaluates the problem and decides how much thinking to invest. This does not make the keywords obsolete — they still signal intent and set a ceiling — but the model may think less than the budget allows if the problem doesn't warrant it.
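The "ceiling" behaviour can be summarised with a toy model. This is a sketch of the relationship described above, not of the model's internals: `estimated_need` is a hypothetical stand-in for however much reasoning the model decides a problem deserves.

```python
def tokens_spent(ceiling: int, estimated_need: int) -> int:
    """Toy model: spend what the problem needs, but never exceed the ceiling."""
    return min(ceiling, max(estimated_need, 0))


# A simple question under an ultrathink ceiling uses only what it needs.
print(tokens_spent(ceiling=31_999, estimated_need=2_500))

# A hard problem under a small ceiling is capped by the keyword's budget.
print(tokens_spent(ceiling=4_000, estimated_need=20_000))
```

Under this reading, choosing a keyword is about setting the cap: a generous cap costs nothing extra on an easy problem, while a cap that is too low limits reasoning on a hard one.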
