Intermediate · 12 min

Server-Side Tools

Server-side tools execute on the provider's infrastructure — your code binds them like local tools, but the provider runs them, bills per call, and injects results directly into the model's context. Knowing the cost model, result block types, and budget controls is what separates a demo from production.

Quick Reference

  • Anthropic: bind_tools([{'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 5}])
  • OpenAI: ChatOpenAI(model='gpt-5', use_responses_api=True).bind_tools([{'type': 'web_search'}])
  • LangChain content_blocks types: web_search_call (query) → web_search_result (URLs + content)
  • Cost: $10/1,000 searches ($0.01 each) on both providers, plus token costs for search content
  • Always set max_uses: an agent loop without it can exhaust a daily search budget in a single session
  • Domain filtering: allowed_domains / blocked_domains on Anthropic web_search_20250305+
  • Multi-turn: pass the full AIMessage back in the history; stripping its content drops the encrypted_content that citations depend on
  • New: web_search_20260209 adds dynamic filtering via code execution, which drops irrelevant results and so reduces token costs
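
As a rough sketch of how the budget controls above fit together, the helper below builds the Anthropic-style tool dict from the Quick Reference and estimates a worst-case search bill. The helper names (build_web_search_tool, worst_case_search_cost) are hypothetical, the price is the $10 per 1,000 searches figure quoted above, and the rule that only one of allowed_domains or blocked_domains may be set per request is an assumption to check against the provider's documentation. Token costs for returned search content are not included.

```python
from typing import Optional

PRICE_PER_SEARCH_USD = 10 / 1000  # $10 per 1,000 searches, as quoted above


def build_web_search_tool(
    max_uses: int,
    allowed_domains: Optional[list[str]] = None,
    blocked_domains: Optional[list[str]] = None,
) -> dict:
    """Return a tool dict in the shape shown in the Quick Reference (hypothetical helper)."""
    if max_uses < 1:
        raise ValueError("max_uses must be at least 1")
    # Assumption: the provider accepts only one of the two domain filters per request.
    if allowed_domains and blocked_domains:
        raise ValueError("set allowed_domains or blocked_domains, not both")
    tool = {"type": "web_search_20250305", "name": "web_search", "max_uses": max_uses}
    if allowed_domains:
        tool["allowed_domains"] = list(allowed_domains)
    if blocked_domains:
        tool["blocked_domains"] = list(blocked_domains)
    return tool


def worst_case_search_cost(max_uses: int, requests: int) -> float:
    """Upper bound on search fees if every request uses its full max_uses."""
    return max_uses * requests * PRICE_PER_SEARCH_USD


tool = build_web_search_tool(max_uses=5, allowed_domains=["docs.python.org"])
print(tool)
# 200 model requests in an agent loop, each allowed up to 5 searches
print(f"${worst_case_search_cost(max_uses=5, requests=200):.2f}")  # $10.00
```

The resulting dict can be passed to bind_tools([tool]) as in the Anthropic line above. Because max_uses applies per request, the total bill still grows with the number of turns in the loop.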

When NOT to Use Server-Side Tools

Server-side tools solve a real problem — real-time data without infrastructure. But they come with constraints that make them wrong for several production scenarios. Know these before you bind anything.

Scenario                              Use Server-Side?   Why
Latest news, prices, weather          Yes                Managed, no scraping infra needed
Latency-critical path (<500ms SLA)    No                 Each search adds 2–5s of wait
Querying private internal data        No                 Results flow through provider servers
Predictable, repeatable queries       No                 Cache a local call instead ($0 marginal)
Heavy agentic loops (50+ turns)       With care          Set max_uses or costs compound fast
Regulated data (HIPAA, GDPR)          Verify first       Check provider's data retention policy

Search adds latency you can't control

A single web search call adds 2–5 seconds of provider-side wait before your model continues. In a streaming UI this shows as a silent gap. In a synchronous API it means your SLA math changes. Factor this into every design that touches search.
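
To make the SLA math concrete, here is a minimal sketch that checks whether a response budget can absorb search latency, using the 2–5 second range quoted above. The function names and the example model latencies are illustrative assumptions, not measured figures, and the sketch assumes searches run one after another.

```python
SEARCH_LATENCY_S = (2.0, 5.0)  # per-search wait range quoted above


def added_latency_s(searches: int) -> tuple[float, float]:
    """Best-case and worst-case seconds added by `searches` sequential searches."""
    low, high = SEARCH_LATENCY_S
    return searches * low, searches * high


def fits_sla(sla_s: float, model_latency_s: float, searches: int) -> bool:
    """True only if the worst case still lands inside the SLA."""
    _, worst = added_latency_s(searches)
    return model_latency_s + worst <= sla_s


print(added_latency_s(3))        # (6.0, 15.0)
print(fits_sla(0.5, 0.3, 1))     # False: one search already exceeds a 500 ms SLA
print(fits_sla(30.0, 0.8, 5))    # True: 0.8 s + 25 s worst case fits in 30 s
```

Checking against the worst case rather than the average is a deliberate choice here: the wait happens on the provider's side, so the caller cannot shorten it.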