Intermediate · 18 min

LangSmith: Production Observability for Agents

LangSmith gives you full observability into every LLM call, tool invocation, and state transition in your agent. For LangChain and LangGraph agents this happens automatically, with no code changes beyond setting environment variables. This article covers how to use it, how much it costs, how to protect PII, and how to turn production traces into evaluation datasets.

Quick Reference

→Use LANGSMITH_TRACING=true and LANGSMITH_API_KEY (the older LANGCHAIN_* names still work for backward compatibility but are no longer the documented ones)
→Every LLM call, tool invocation, and LangGraph node is captured as a hierarchical span with tokens, latency, and exact I/O
→Tag runs with metadata (user_id, agent_version, feature_flag) to filter and aggregate traces in dashboards
→LangSmith free tier: 5K traces/month; Plus: $39/seat + $2.50–$5.00 per 1K traces — budget before you ship to prod
→Scrub PII from traces using hide_inputs/hide_outputs on the LangSmith Client, or process_inputs/process_outputs on the @traceable decorator
→Production traces are your best source of eval data — annotate, export to datasets, and gate deploys on eval scores
→LangSmith now supports OpenTelemetry export (pip install langsmith[otel]) — vendor-neutral escape hatch if you need it
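
As a concrete illustration of the metadata bullet above, here is a minimal sketch of a config dict carrying metadata and tags. It assumes you invoke your agent through a LangChain runnable or compiled LangGraph, which accept a config with "metadata" and "tags" keys. The helper name build_run_config and the example values are hypothetical.

```python
from typing import Any


def build_run_config(user_id: str, agent_version: str, feature_flag: str) -> dict[str, Any]:
    """Return a RunnableConfig-style dict carrying trace metadata and tags."""
    return {
        # Key/value pairs attached to the run, usable as trace filters.
        "metadata": {
            "user_id": user_id,
            "agent_version": agent_version,
            "feature_flag": feature_flag,
        },
        # Free-form string labels for coarse grouping.
        "tags": [f"agent:{agent_version}", f"flag:{feature_flag}"],
    }


config = build_run_config("user-123", "v2.4.1", "new-planner")

# Hypothetical usage (graph is your own compiled LangGraph or runnable):
# result = graph.invoke(inputs, config=config)
```

Building the dict in one place keeps key names consistent across call sites, which matters because filters match on exact keys.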

Should You Use LangSmith?

LangSmith is the default for LangChain/LangGraph — for other stacks, start with Langfuse

LangSmith is the right default if you are using LangChain or LangGraph. Its auto-instrumentation is the tightest available for those frameworks — every node, edge, and state transition in a LangGraph is automatically captured as a span. The UI is built around the mental model of a trace tree, which maps directly to how LangGraph execution actually works.

When to use something else

If your agent does not use LangChain or LangGraph (say you are calling the Anthropic API directly or using Pydantic AI), Langfuse is a stronger default. It is open source (MIT), framework-agnostic, self-hostable, and at 1M traces/month costs roughly a third of LangSmith Cloud. Arize Phoenix is another self-hostable option if data residency is a hard requirement.
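
To put the 1M traces/month figure in context, here is a rough estimate of the LangSmith side using only the Plus rates from the Quick Reference ($39/seat plus $2.50 to $5.00 per 1K traces). The seat count is hypothetical. The included_traces parameter is an assumption: the rates quoted here do not say whether Plus includes a trace allowance, so it defaults to zero.

```python
def estimate_monthly_cost(
    seats: int,
    traces: int,
    included_traces: int = 0,   # assumption: no free allowance unless you set one
    seat_price: float = 39.0,   # Plus, per seat
    per_1k_low: float = 2.50,   # lower per-1K-trace rate
    per_1k_high: float = 5.00,  # upper per-1K-trace rate
) -> tuple[float, float]:
    """Return a (low, high) monthly cost range in dollars."""
    billable = max(traces - included_traces, 0)
    base = seats * seat_price
    low = base + billable / 1000 * per_1k_low
    high = base + billable / 1000 * per_1k_high
    return low, high


low, high = estimate_monthly_cost(seats=3, traces=1_000_000)
print(f"${low:,.2f} to ${high:,.2f} per month")
# prints "$2,617.00 to $5,117.00 per month"
```

At this volume the per-trace charge dominates the seat charge, so sampling or trace volume is the lever to look at first.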

If you need vendor-neutral tracing that works across tools, both LangSmith and Langfuse now support OpenTelemetry. Enable LangSmith's OTEL exporter with pip install "langsmith[otel]" (the quotes keep shells such as zsh from expanding the brackets) and configure your OTEL endpoint. Because the instrumentation is standard OpenTelemetry, you can migrate between backends without re-instrumenting your code.

Setup & Your First Trace

Two env vars, full observability

As of the langsmith Python SDK v0.3+, the canonical variable names are LANGSMITH_TRACING, LANGSMITH_API_KEY, and LANGSMITH_PROJECT. The older LANGCHAIN_TRACING_V2 and LANGCHAIN_API_KEY still work for backward compatibility but are no longer documented in official quickstarts.
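
A minimal sketch of setting these variables from Python, using only the standard library. The helper name configure_langsmith, the placeholder key, and the project name are hypothetical. In practice you would more likely set these in your shell or deployment environment. The sketch treats LANGSMITH_PROJECT as optional.

```python
import os
import warnings
from typing import Optional

# Older names mapped to their current equivalents, per the paragraph above.
LEGACY_TO_CURRENT = {
    "LANGCHAIN_TRACING_V2": "LANGSMITH_TRACING",
    "LANGCHAIN_API_KEY": "LANGSMITH_API_KEY",
}


def configure_langsmith(api_key: str, project: Optional[str] = None) -> None:
    """Set the LangSmith environment variables for this process."""
    for legacy, current in LEGACY_TO_CURRENT.items():
        if legacy in os.environ:
            warnings.warn(f"{legacy} is set; prefer {current}")
    os.environ["LANGSMITH_TRACING"] = "true"
    os.environ["LANGSMITH_API_KEY"] = api_key
    if project is not None:
        os.environ["LANGSMITH_PROJECT"] = project


# Call this before running any LangChain or LangGraph code.
configure_langsmith(api_key="<your-api-key>", project="my-agent-dev")
```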

Reading a Trace: The Debugging Workflow

The trace tree is more useful than a log file because it preserves causality. Each span has a parent_run_id that links it to its parent, so you can see exactly which LLM call triggered which tool call, and which state the agent was in at that moment. Here is the workflow for debugging a failing agent run:
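
To make the parent_run_id linkage concrete, the sketch below rebuilds a tree from a flat list of runs. It illustrates the data model only and is not the debugging workflow itself. The run records are hypothetical sample data, and the helpers build_tree and render are illustrative names.

```python
from collections import defaultdict

# Hypothetical flat list of runs; the root has no parent.
runs = [
    {"id": "r1", "parent_run_id": None, "name": "agent", "run_type": "chain"},
    {"id": "r2", "parent_run_id": "r1", "name": "plan", "run_type": "llm"},
    {"id": "r3", "parent_run_id": "r1", "name": "search_docs", "run_type": "tool"},
    {"id": "r4", "parent_run_id": "r3", "name": "summarize", "run_type": "llm"},
]


def build_tree(runs):
    """Group runs by parent id so each parent's children are one lookup away."""
    children = defaultdict(list)
    for run in runs:
        children[run["parent_run_id"]].append(run)
    return children


def render(children, parent_id=None, depth=0):
    """Return indented lines, one per run, in depth-first order."""
    lines = []
    for run in children.get(parent_id, []):
        lines.append("  " * depth + f'{run["name"]} ({run["run_type"]})')
        lines.extend(render(children, run["id"], depth + 1))
    return lines


print("\n".join(render(build_tree(runs))))
# agent (chain)
#   plan (llm)
#   search_docs (tool)
#     summarize (llm)
```

The indentation shows what a flat log loses: the summarize LLM call happened inside the search_docs tool call, not alongside it.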
