Integrations

LangSmith for observability, OpenTelemetry for tracing, MCP for connecting external tools, voice and multimodal agents, and real-time streaming patterns.

LangSmith: Production Observability for Agents

LangSmith gives you full observability into every LLM call, tool invocation, and state transition in your agent — automatically, with no code changes. This article covers how to use it, how much it costs, how to protect PII, and how to turn production traces into evaluation datasets.

intermediate · 18 min
LangSmith Automation Rules

Event-driven triggers for production traces — filter by error, latency, metadata, or feedback scores, and route matches to annotation queues, datasets, webhooks, or online evaluators. Everything is configured in the LangSmith UI.

advanced · 12 min
LangSmith Datasets & Experiments

Build versioned eval datasets from production traces, write evaluators that actually measure correctness, run experiments to prove prompt changes work, and gate deploys on regression — the full LangSmith evaluation workflow.

advanced · 14 min
Distributed Tracing

Cross-service trace propagation for multi-service agents — decide if you need it, choose between LangSmith-native and OTel approaches, link traces across HTTP boundaries, and control costs with sampling.

advanced · 12 min
LangSmith MCP Server

The LangSmith MCP Server exposes your entire observability workspace as callable tools via Model Context Protocol — query traces with FQL, manage datasets, push prompts, and check billing from Claude Desktop, Cursor, or any custom agent without leaving your editor.

advanced · 12 min
OpenTelemetry for Agent Tracing

How to instrument LangGraph agents with OpenTelemetry: the Collector architecture you actually need in production, updated GenAI semantic conventions, cost math for sampling decisions, and the failure modes that will bite you before you notice.

intermediate · 20 min
Logging, Metrics & Alerting

Learn when to build custom observability versus use a managed platform, then build it right: structured logging with correlation IDs, Prometheus metrics with cardinality discipline, and rate-of-change alerts that catch regressions before your users do.

intermediate · 13 min
RAG: When to Use It, How to Build It, and How It Breaks

The first RAG decision is whether to use RAG at all — with 200K+ token context windows, it's a choice, not a given. This article covers the RAG-vs-long-context decision framework with cost math, building an indexing and retrieval pipeline, evaluation with concrete thresholds, production failure modes, monitoring, and a production-shaped LangGraph reference implementation.

intermediate · 22 min
MCP: Connect Any Tool

When MCP earns its overhead over inline tools, how to connect local and remote servers in LangChain, how to build your own server with FastMCP, and the four failure modes that trip up production deployments.

intermediate · 14 min
MCP Interceptors

MCP interceptors are async middleware for tool calls — wrapping every MCP invocation with auth, retry, logging, and access control. This article covers the real API (MCPToolCallRequest, handler, override), when to use interceptors vs alternatives, core patterns with correct imports, multi-server routing via server_name, failure modes when interceptors break, and testing strategies.

advanced · 15 min
MCP Resources & Prompts

Resources, Prompts, and Elicitation are the three MCP primitives engineers most often skip. Here's what they're actually for, when to reach for each, and what breaks in production when you ignore them.

intermediate · 12 min
MCP Authentication

OAuth 2.1 + PKCE is the MCP spec requirement for HTTP servers — not a suggestion. Learn the discovery flow, per-user delegated auth via interceptors, what the spec forbids (token passthrough, audience-skipping), and when you need auth at all.

advanced · 12 min
Building Production MCP Servers

Build, version, deploy, and monitor production MCP servers with both the TypeScript SDK and FastMCP. Covers the build-vs-buy decision, schema versioning, deployment cost math, the gateway pattern for multi-server architectures, and three-tier health monitoring — because only 9% of remote MCP endpoints are fully healthy in the wild.

advanced · 18 min
Voice Agents

Voice agents cost 10-50x more per interaction than text agents and introduce failure modes that don't exist in chat. This article helps you decide whether voice is worth the complexity, choose the right architecture, understand the real costs, anticipate production failures, and evaluate whether your voice agent actually works.

advanced · 14 min
Multimodal Agents (Vision & Files)

When to use vision models vs. dedicated parsers, real cost math using Anthropic's actual token formula, how vision fails on financial docs, model tiering for 50–90% cost savings, image generation with gpt-image-1.5, and a 30-day deployment runbook.

advanced · 20 min
Voice Agents Deep Dive

Deep dive into production voice agent pipelines in 2026. Covers the pipeline-vs-Realtime-API architecture decision, updated STT and TTS provider choices (Deepgram Nova-3, ElevenLabs Flash v2.5, Cartesia Sonic 3), production-grade barge-in handling, cost modeling, and when to use LiveKit Agents Framework instead of rolling your own pipeline.

advanced · 15 min
WebSocket & SSE for Agents

How to choose between Server-Sent Events and WebSocket for AI agent communication, with production-ready FastAPI code using the native EventSourceResponse API, authentication patterns, backpressure handling, and scaling strategies.

intermediate · 18 min
Multimodal Pipelines

Multimodal pipelines add genuine value when layout, speaker identity, or visual content cannot be captured by text extraction alone — and add cost and hallucination risk when they can. This article covers the coordinator pattern, how to compute real costs, and how to defend against the specific failures that take these systems down in production.

advanced · 14 min