Intermediate · 20 min

OpenTelemetry for Agent Tracing

How to instrument LangGraph agents with OpenTelemetry: the Collector architecture you actually need in production, updated GenAI semantic conventions, cost math for sampling decisions, and the failure modes that will bite you before you notice.

Quick Reference

  • OTel is the CNCF standard for vendor-neutral distributed tracing — instrument once, export to Datadog, Grafana Tempo, Jaeger, or any OTLP-compatible backend
  • Production OTel requires the OpenTelemetry Collector — direct app→backend export bypasses tail-based sampling and lacks the buffering you need when a backend goes down
  • Instrument each LangGraph node as a span using gen_ai.operation.name, gen_ai.provider.name, and token-count attributes; wrap tool calls with execute_tool spans
  • The GenAI semantic conventions (2026) define 4 span types: inference, embeddings, retrieval, and execute_tool — each with distinct required and recommended attributes
  • Head-based sampling lives in your Python code (ParentBasedTraceIdRatio); tail-based sampling (always keep errors, sample N% of success) runs in the Collector's tail_sampling processor (the tailsamplingprocessor component from collector-contrib)
  • 20 spans/invocation × 1,000 req/hr × 24 hr = 480K spans/day. Run this math before picking a sampling ratio; skipping it means surprise bills
  • LangSmith now accepts OTel traces natively via LANGSMITH_OTEL_ENABLED=true — configure the Collector to export to both LangSmith and your infra backend in a single pipeline
  • Never use conversation_id, user_id, or prompt content as span attributes in high-traffic systems — cardinality explosion will crash your tracing backend's index
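The span-volume arithmetic from the list above can be sketched as a small helper. This is a minimal illustration, not part of any OTel API: the function names `spans_per_day` and `ratio_for_budget` and the 48,000-span daily budget are hypothetical, and the result is an expected value, since head-based sampling keeps or drops whole traces at random.

```python
def spans_per_day(spans_per_invocation: int,
                  requests_per_hour: int,
                  sample_ratio: float = 1.0) -> int:
    """Expected spans exported per day at a given head-based sampling ratio."""
    if not 0.0 <= sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be between 0.0 and 1.0")
    # spans/invocation x invocations/hour x 24 hours, scaled by the kept fraction
    return round(spans_per_invocation * requests_per_hour * 24 * sample_ratio)


def ratio_for_budget(spans_per_invocation: int,
                     requests_per_hour: int,
                     daily_span_budget: int) -> float:
    """Largest sampling ratio that keeps expected volume within the budget."""
    unsampled = spans_per_day(spans_per_invocation, requests_per_hour)
    if unsampled == 0:
        return 1.0  # no traffic, nothing to sample away
    return min(1.0, daily_span_budget / unsampled)


if __name__ == "__main__":
    # The figures from the text: 20 spans per invocation, 1,000 requests per hour.
    print(spans_per_day(20, 1000))
    # Hypothetical budget of 48,000 spans/day (an assumption for illustration).
    print(ratio_for_budget(20, 1000, 48_000))
```

The ratio this produces is the kind of value you would pass to a head-based sampler such as ParentBasedTraceIdRatio. Tail-based rules in the Collector (for example, always keeping errors) add spans on top of it, so treat the result as a starting point rather than a guaranteed ceiling.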

Should You Use OTel?

Before writing a single span, decide whether OTel is the right tool for this problem. The answer depends on what you already have, not on which standard is more vendor-neutral.

Choose your observability strategy

OTel Only
  • When: already have Datadog / Grafana; multi-service infra traces needed; LLM calls are one step of many
  • Cost: low overhead
  • Trade-off: no prompt inspection or eval UI

LangSmith Only
  • When: LLM debugging is the priority; prompt inspection + evals needed; no existing observability stack
  • Cost: LangSmith pricing
  • Trade-off: no infra-wide distributed traces

Both (Recommended)
  • When: use the OTel Collector to fan out; LANGSMITH_OTEL_ENABLED=true; one pipeline, two backends
  • Cost: Collector + LangSmith
  • Trade-off: slightly more infra to run

LangSmith now accepts OTel traces natively — "both" costs one Collector config, not two codebases

| Signal | Choose OTel | Choose LangSmith | Choose Both |
| --- | --- | --- | --- |
| Existing observability stack | Datadog, Grafana, or Jaeger already deployed | No tracing infrastructure yet | Have both or plan to add LLM debugging to existing infra |
| What you're debugging | Latency across services, database calls, queue processing | Prompt quality, token costs, eval regressions | Infrastructure latency AND prompt/eval issues |
| Team context | SRE team manages observability; agent is one microservice of many | ML team owns the agent end-to-end | Cross-functional team; both SRE and ML visibility matter |
| Budget | Pay per span volume (backend pricing) | Pay per trace in LangSmith | Both costs, but the Collector fan-out means one instrumentation layer |

LangSmith and OTel converged in 2025

LangSmith now accepts OpenTelemetry traces natively via LANGSMITH_OTEL_ENABLED=true. You no longer have to choose one or instrument twice. Configure the OTel Collector to export to both your infra backend and LangSmith's OTLP endpoint, and you get infrastructure-wide traces plus LLM-specific debugging in the same pipeline.

  • Skip OTel if your agent is a standalone CLI or batch job with no latency SLAs — LangSmith alone is cheaper to set up
  • Skip OTel if the agent has no external service calls — a single-service agent with no databases or APIs won't benefit from distributed context propagation
  • Use OTel if the agent calls other agents, hits external APIs, or reads from databases you already instrument
  • Use OTel if your SRE team needs to see the agent's spans in the same dashboard as the rest of your backend
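The rules above can be read as a small decision function. The sketch below is one possible encoding, not an official rubric: `AgentContext`, its field names, and `choose_strategy` are hypothetical, and falling back to LangSmith when no OTel signal is present is an assumption drawn from the "Skip OTel" bullets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContext:
    # Agent calls other agents, external APIs, or already-instrumented databases
    calls_external_services: bool
    # SRE team needs agent spans in the same dashboard as the rest of the backend
    sre_needs_shared_dashboard: bool
    # Team needs prompt inspection, token-cost tracking, or evals
    needs_prompt_debugging: bool


def choose_strategy(ctx: AgentContext) -> str:
    """Return 'otel', 'langsmith', or 'both' for the given context."""
    wants_otel = ctx.calls_external_services or ctx.sre_needs_shared_dashboard
    if wants_otel and ctx.needs_prompt_debugging:
        return "both"  # Collector fan-out: one pipeline, two backends
    if wants_otel:
        return "otel"
    # Standalone or single-service agents: LangSmith alone is cheaper to set up
    return "langsmith"


if __name__ == "__main__":
    batch_job = AgentContext(False, False, True)
    microservice = AgentContext(True, True, False)
    full_stack = AgentContext(True, False, True)
    for ctx in (batch_job, microservice, full_stack):
        print(ctx, "->", choose_strategy(ctx))
```

Real decisions also weigh budget and team ownership, as the comparison table notes; the function only makes the precedence between the bullets explicit.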