Provider Extras & Advanced Tools
The extras attribute gives you access to provider-specific capabilities — extended thinking, prompt caching, strict schemas — without breaking your portable LangChain code. This article covers when to use extras, what they actually cost, and what breaks when you do.
Quick Reference
- →extras are namespaced by provider — anthropic={...}, openai={...}, gemini={...} — and silently ignored by other providers
- →Anthropic: cache_control (opt-in, 90% off on read), tool_choice='any', extended thinking (billed as output tokens)
- →OpenAI: strict=True on bind_tools() (schema compilation on first call), parallel_tool_calls=False, seed (best-effort determinism)
- →Gemini 3+: thinking_level='low|medium|high' — NOT thinking_budget (that's Gemini 2.5 legacy)
- →Gemini/OpenAI apply prompt caching automatically — you don't opt in; just put stable content first
- →Thinking tokens bill as output tokens — 5 000 thinking tokens at Opus 4.7 rates costs $0.125 per call
- →Break-even on Anthropic 5-min cache write: 2 repeat calls. Below 2 calls, caching costs more than it saves.
When NOT to Use Provider Extras
Every extra you add makes your code dependent on one provider's API staying stable. If Anthropic changes cache_control semantics, or OpenAI deprecates strict mode, your chain breaks. Before reaching for an extra, ask: does this solve a problem I can't solve without it?
| Use extras when... | Skip extras when... |
|---|---|
| You're already committed to one provider for this use case | You're comparing providers or may switch |
| The feature materially changes cost or accuracy (thinking, caching) | The feature is cosmetic or marginal |
| You're caching large, stable tool schemas (>200 tokens per call) | Your tool schemas are small or change often |
| You need strict schema guarantees for payment or safety-critical flows | Standard bind_tools() already produces correct output |
| You're in a security/medical domain blocked by Gemini's default safety thresholds | The default safety settings don't block your use case |
Three gates: provider lock-in → material gain → fallback tested
If you do use extras, wrap them in a config layer: provider='anthropic' → apply cache_control, provider='openai' → apply strict=True. This makes extras a runtime decision, not a hardcoded one, and lets you test the code path without the extra active.