Intermediate12 min

Server-Side Tools

Server-side tools execute on the provider's infrastructure — your code binds them like local tools, but the provider runs them, bills per call, and injects results directly into the model's context. Knowing the cost model, result block types, and budget controls is what separates a demo from production.

Quick Reference

→Anthropic: bind_tools([{'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 5}])
→OpenAI: ChatOpenAI(model='gpt-5', use_responses_api=True).bind_tools([{'type': 'web_search'}])
→LangChain content_blocks types: web_search_call (query) → web_search_result (URLs + content)
→Cost: $10/1,000 searches ($0.01 each) on both providers, plus token costs for search content
→Always set max_uses — an agent loop without it can exhaust daily search budget in one session
→Domain filtering: allowed_domains / blocked_domains on Anthropic web_search_20250305+
→Multi-turn: pass the full AIMessage — stripped content breaks encrypted_content for citations
→New: web_search_20260209 adds dynamic filtering via code execution — cuts irrelevant token costs

When NOT to Use Server-Side Tools

Server-side tools solve a real problem — real-time data without infrastructure. But they come with constraints that make them wrong for several production scenarios. Know these before you bind anything.

Scenario	Use Server-Side?	Why
Latest news, prices, weather	Yes	Managed, no scraping infra needed
Latency-critical path (<500ms SLA)	No	Each search adds 2–5s of wait
Querying private internal data	No	Results flow through provider servers
Predictable, repeatable queries	No	Cache a local call instead ($0 marginal)
Heavy agentic loops (50+ turns)	With care	Set max_uses or costs compound fast
Regulated data (HIPAA, GDPR)	Verify first	Check provider's data retention policy

Search adds latency you can't control

A single web search call adds 2–5 seconds of provider-side wait before your model continues. In a streaming UI this shows as a silent gap. In a synchronous API it means your SLA math changes. Factor this into every design that touches search.

How Server-Side Tools Work

With client-side tools, your code executes the tool and returns the result. With server-side tools, you declare the tool in bind_tools(), and when the model decides to use it, the provider's infrastructure executes it — you never see the round-trip. Results are injected into the model's context automatically, and the final response arrives with citations already embedded.

Anthropic Web Search

Anthropic offers two web search tool versions. Use web_search_20250305 for standard search. Use web_search_20260209 (with the code execution tool) for dynamic filtering — Claude writes and runs code to post-process search results before they reach the context window, which reduces token waste on irrelevant HTML. The 20250305 version works with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6.

Sign in to read this article

This is a premium article. Sign in with your Google account to continue.