Server-Side Tools
Server-side tools execute on the provider's infrastructure — your code binds them like local tools, but the provider runs them, bills per call, and injects results directly into the model's context. Knowing the cost model, result block types, and budget controls is what separates a demo from production.
Quick Reference
- Anthropic: bind_tools([{'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 5}])
- OpenAI: ChatOpenAI(model='gpt-5', use_responses_api=True).bind_tools([{'type': 'web_search'}])
- LangChain content_blocks types: web_search_call (query) → web_search_result (URLs + content)
- Cost: $10/1,000 searches ($0.01 each) on both providers, plus token costs for search content
- Always set max_uses: an agent loop without it can exhaust a daily search budget in one session
- Domain filtering: allowed_domains / blocked_domains on Anthropic web_search_20250305+
- Multi-turn: pass the full AIMessage; stripped content breaks encrypted_content for citations
- New: web_search_20260209 adds dynamic filtering via code execution, which cuts irrelevant token costs
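The Anthropic binding, max_uses cap, and domain filters from the list above can be combined in one small helper. This is a minimal sketch, not the library's own API: the function names build_anthropic_web_search_tool and bind_web_search are hypothetical, the model name is left to the caller, and the rule that the two domain filters are mutually exclusive is an assumption to check against the provider's documentation.

```python
from typing import Optional, Sequence


def build_anthropic_web_search_tool(
    max_uses: int = 5,
    allowed_domains: Optional[Sequence[str]] = None,
    blocked_domains: Optional[Sequence[str]] = None,
) -> dict:
    """Build the server-side tool spec dict that is passed to bind_tools."""
    if max_uses < 1:
        raise ValueError("max_uses must be at least 1")
    if allowed_domains and blocked_domains:
        # Assumption: only one of the two filters is accepted per request.
        raise ValueError("use allowed_domains or blocked_domains, not both")

    tool = {"type": "web_search_20250305", "name": "web_search", "max_uses": max_uses}
    if allowed_domains:
        tool["allowed_domains"] = list(allowed_domains)
    if blocked_domains:
        tool["blocked_domains"] = list(blocked_domains)
    return tool


def bind_web_search(model_name: str, **tool_kwargs):
    """Bind the tool to a ChatAnthropic model.

    Requires the langchain-anthropic package and an API key; the import is
    lazy so the pure-dict builder above works without either.
    """
    from langchain_anthropic import ChatAnthropic

    return ChatAnthropic(model=model_name).bind_tools(
        [build_anthropic_web_search_tool(**tool_kwargs)]
    )


# Building the spec needs no network access or credentials.
tool = build_anthropic_web_search_tool(max_uses=3, allowed_domains=["example.com"])
print(tool)
```

Keeping the spec in a plain function means the cap and filters can be unit-tested and reviewed without calling the provider.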
When NOT to Use Server-Side Tools
Server-side tools solve a real problem — real-time data without infrastructure. But they come with constraints that make them wrong for several production scenarios. Know these before you bind anything.
| Scenario | Use Server-Side? | Why |
|---|---|---|
| Latest news, prices, weather | Yes | Managed, no scraping infra needed |
| Latency-critical path (<500ms SLA) | No | Each search adds 2–5s of wait |
| Querying private internal data | No | Results flow through provider servers |
| Predictable, repeatable queries | No | Cache a local call instead ($0 marginal) |
| Heavy agentic loops (50+ turns) | With care | Set max_uses or costs compound fast |
| Regulated data (HIPAA, GDPR) | Verify first | Check provider's data retention policy |
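To make the "costs compound fast" row concrete, the sketch below computes an upper bound on search fees for an agent loop and adds a client-side counter that stops the loop once an allowance is spent. It assumes the $0.01-per-search price quoted in the quick reference, assumes max_uses applies per request, and ignores token costs for search content. The names worst_case_search_cost_usd and SearchBudget are hypothetical, and counting the searches in each response is left to the caller because it depends on the provider's usage metadata.

```python
def worst_case_search_cost_usd(
    turns: int, max_uses_per_turn: int, price_cents_per_search: int = 1
) -> float:
    """Upper bound on search fees if every turn spends its full max_uses."""
    if turns < 0 or max_uses_per_turn < 0:
        raise ValueError("turns and max_uses_per_turn must be non-negative")
    # Integer cents avoid floating-point drift before the final division.
    return turns * max_uses_per_turn * price_cents_per_search / 100


class SearchBudget:
    """Client-side counter that halts a loop once the allowance is exceeded."""

    def __init__(self, max_searches: int) -> None:
        self.max_searches = max_searches
        self.used = 0

    @property
    def remaining(self) -> int:
        return max(self.max_searches - self.used, 0)

    def record(self, searches_used: int) -> None:
        """Add the searches from one response; raise if over budget."""
        self.used += searches_used
        if self.used > self.max_searches:
            raise RuntimeError(
                f"search budget exceeded: {self.used}/{self.max_searches}"
            )


# A 50-turn loop with max_uses=5 can reach 250 searches.
print(worst_case_search_cost_usd(turns=50, max_uses_per_turn=5))  # 2.5

budget = SearchBudget(max_searches=20)
budget.record(5)
print(budget.remaining)  # 15
```

A session-level counter like this complements max_uses: the provider-side cap bounds each request, while the counter bounds the whole loop.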
A single web search call adds 2–5 seconds of provider-side wait before your model continues. In a streaming UI this shows as a silent gap. In a synchronous API it means your SLA math changes. Factor this into every design that touches search.
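The SLA arithmetic can be sketched as a simple worst-case check. This is an illustration under stated assumptions: searches run one after another, each takes the upper end of the 2–5 second range quoted above, and the 300 ms base model latency in the example is a made-up figure. The function names are hypothetical.

```python
def worst_case_latency_ms(
    base_model_ms: int, searches: int, per_search_ms: int = 5000
) -> int:
    """Base model latency plus the slowest assumed wait for each search."""
    return base_model_ms + searches * per_search_ms


def fits_sla(
    sla_ms: int, base_model_ms: int, searches: int, per_search_ms: int = 5000
) -> bool:
    """True if the worst-case latency stays within the SLA."""
    return worst_case_latency_ms(base_model_ms, searches, per_search_ms) <= sla_ms


# Hypothetical 300 ms model call against a 500 ms SLA.
print(fits_sla(sla_ms=500, base_model_ms=300, searches=0))  # True
print(fits_sla(sla_ms=500, base_model_ms=300, searches=1))  # False
print(worst_case_latency_ms(base_model_ms=300, searches=1))  # 5300
```

Even with the optimistic 2-second figure passed as per_search_ms, a single search exceeds a 500 ms budget, which is why the table marks latency-critical paths as a "No".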