LangChain

v1.2

The developer interface for building with LLMs. One API for every model, composable chains, tools, memory, and structured output.

What is LangChain?

LangChain is an open-source framework that gives you a unified API over every LLM provider, a composable pipeline system (LCEL), and a middleware layer for agents. This article explains what it is, when it helps, when to skip it, and how the ecosystem fits together.

beginner · 12 min
LangChain v1 Migration Guide

LangChain v1 replaced hooks with middleware, InjectedState with ToolRuntime, and create_react_agent with create_agent. This guide covers the migration order, what breaks silently, and how to test each step — not just what changed.

intermediate · 14 min
LangChain vs. LangGraph vs. Deep Agents

Three layers of the same stack — not competing frameworks. Here is when each layer earns its place, what it costs you, and how to migrate down when your requirements outgrow it.

beginner · 14 min
Chat Models & Providers

How to choose a provider, wire it up, configure it, and handle the failures that happen in production. LangChain's ChatModel interface gives you one API for all providers — but the decisions around which provider, what configuration, and how to recover from errors are yours to make.

beginner · 14 min
Message Types

A field-by-field reference for every message class in LangChain — HumanMessage, AIMessage, SystemMessage, ToolMessage, AIMessageChunk, RemoveMessage, and legacy types. Know what each field does, what breaks when you get it wrong, and how providers differ.

beginner · 12 min
Messages

Messages are LangChain's transport layer — every model call is a list of typed messages in, one message out. This article covers what each message type carries, how content blocks handle multimodal I/O, and how to manage message growth before it blows your context budget.

intermediate · 12 min
Standard Content Blocks

LangChain v1 normalizes every provider's output into standard content_blocks — one API for text, reasoning, citations, tool calls, and images across Anthropic, OpenAI, and Google.

intermediate · 12 min
LCEL: The Pipe Operator

LCEL composes Runnables into chains with |. Understand when to use it, how streaming actually works through each step, and the type contracts that break in production.

beginner · 12 min
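
To make the pipe idea concrete, here is a minimal pure-Python sketch of how `|` can compose steps. The `Step` class, `format_prompt`, `fake_model`, and `parse` are hypothetical stand-ins invented for illustration; this is not LangChain's Runnable implementation, only the general operator-overloading pattern it builds on.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        # Run this step on a single input.
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical steps: a prompt formatter, a fake "model", and a parser.
format_prompt = Step(lambda topic: f"Tell me about {topic}")
fake_model = Step(lambda prompt: prompt.upper())
parse = Step(lambda text: text.strip())

chain = format_prompt | fake_model | parse
result = chain.invoke("pipes")
print(result)
```

Each step's output type has to match the next step's input type, which is the kind of type contract the article refers to.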
LCEL: Advanced Runnables

RunnableParallel, RunnableBranch, RunnableLambda, and RunnableConfig form the production toolkit for LCEL pipelines. Know when each earns its place, how they fail, and how to compose them without regrets at 3am.

intermediate · 14 min
Callbacks & Event Hooks

LangChain's callback system hooks into every stage of chain execution — but most teams reach for it when LangSmith, astream_events, or @traceable would serve them better. This article teaches you which mechanism to reach for, how to write production-grade handlers, and how callbacks fail in ways that bring down your whole chain.

intermediate · 12 min
Model Configuration

The six parameters that separate a production LLM call from a fragile prototype: temperature, max_tokens, timeout, max_retries, rate limiting, and usage tracking. Each has a failure mode that won't surface until you're in production.

intermediate · 14 min
Batch Processing

batch() parallelizes LLM calls client-side — all requests fire concurrently, results return together. When you need 50%+ cost savings and can tolerate ~1h latency, use your provider's async batch API instead. This article shows you which to pick, how to handle partial failures, and what a production pipeline looks like.

intermediate · 12 min
Configurable Models

init_chat_model() lets you define a chain once and swap the underlying model at runtime via config, with no redeploy needed. This article covers when that's worth the complexity, how config resolution actually works, what configurable_fields='any' silently exposes to attackers, and how to gate cost before your bill grows 50× overnight.

intermediate · 14 min
Multimodal Input

Pass images, audio, PDFs, and video to multimodal models using LangChain's standard content blocks. LangChain v1 introduced a provider-agnostic format that works across GPT-4o, Claude, and Gemini — this article covers both the old provider-native format and the new standard, a capability matrix across providers, and a production router that sends each modality to the right model. For deciding when multimodal is the right tool and what it costs, see Multimodal Models.

intermediate · 12 min
Reasoning Models

Reasoning models (Claude adaptive thinking, OpenAI o3/o4-mini, Gemini 3 Deep Think) spend internal tokens on a scratchpad before producing the final answer. They cost 5-10x more per call than standard models and add latency. This article covers when that tradeoff is worth it, how each provider's API works in 2026, and how LangChain gives you a single parsing interface across all of them.

intermediate · 12 min
Server-Side Tools

Server-side tools execute on the provider's infrastructure — your code binds them like local tools, but the provider runs them, bills per call, and injects results directly into the model's context. Knowing the cost model, result block types, and budget controls is what separates a demo from production.

intermediate · 12 min
Local Models

Running models locally with Ollama solves three real problems: data that can't leave your machine, usage patterns that make cloud APIs expensive at scale, and offline operation. This article walks through the decision, the current model landscape, hardware requirements, and the LangChain integration — including reasoning models and structured output.

intermediate · 12 min
Tool Execution Loop

Tool calling follows a three-step cycle — invoke the model, execute its tool calls, pass results back as ToolMessages, repeat. Understanding this loop is the foundation of every tool-using agent, but shipping it to production requires error handling, token budget awareness, and a clear decision on when to use a manual loop versus create_agent.

intermediate · 14 min
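
The cycle described above can be sketched in plain Python. Everything here is an assumption made for illustration: `fake_model` is a hypothetical stand-in for a chat model, messages are plain dicts rather than LangChain message classes, and `max_steps` is an invented guard against runaway loops.

```python
from typing import Callable


def add(a: int, b: int) -> int:
    return a + b


# Hypothetical tool registry keyed by tool name.
TOOLS: dict[str, Callable[..., int]] = {"add": add}


def fake_model(messages: list[dict]) -> dict:
    """Stand-in for a chat model: requests one tool call, then answers."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        call = {"id": "call_1", "name": "add", "args": {"a": 2, "b": 3}}
        return {"role": "ai", "content": "", "tool_calls": [call]}
    final = f"The sum is {tool_results[-1]['content']}"
    return {"role": "ai", "content": final, "tool_calls": []}


def run_loop(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "human", "content": question}]
    for _ in range(max_steps):
        reply = fake_model(messages)  # 1. invoke the model
        messages.append(reply)
        if not reply["tool_calls"]:
            return reply["content"]  # no tool calls means a final answer
        for call in reply["tool_calls"]:
            output = TOOLS[call["name"]](**call["args"])  # 2. execute the tool
            messages.append(  # 3. pass the result back, tied to the call id
                {"role": "tool", "tool_call_id": call["id"], "content": str(output)}
            )
    raise RuntimeError("step budget exhausted before a final answer")


answer = run_loop("What is 2 + 3?")
print(answer)
```

A real loop would also need the error handling and token budgeting the article mentions; the step cap here only prevents the loop from running forever.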
Prompts That Work

Every production LLM bug traces back to a prompt decision made without thinking about tokens, injection, or testability. This article covers when to use prompt templates, how to budget few-shot examples, how to defend against injection, and how to build a prompt you can actually evaluate.

intermediate · 14 min
Structured Output

How to choose between function_calling, json_schema, and native structured output. Schema design, validation layers, failure modes, and the cost math for each strategy.

intermediate · 16 min
Output Parsers

LangChain has four ways to get structured output — and three of them are the wrong choice most of the time. This article maps the decision: when to use with_structured_output(), when JsonOutputParser is still the right tool, and what to do with the legacy parsers you inherited.

intermediate · 12 min
Tools: Give Your LLM Arms

Tool calling is how LLMs act on the world. This article covers the full stack: how LangChain converts your functions to JSON schemas, how the model decides which tool to call and when, the complete bind_tools → tool_calls → ToolMessage cycle, production patterns like InjectedToolArg and tool artifacts, cost math for tool-heavy agents, and the failure modes that will hit you in production.

intermediate · 18 min
Provider Extras & Advanced Tools

The extras attribute gives you access to provider-specific capabilities — extended thinking, prompt caching, strict schemas — without breaking your portable LangChain code. This article covers when to use extras, what they actually cost, and what breaks when you do.

intermediate · 14 min
Dynamic Tools

Static tool registries break in multi-tenant agents — an admin tool visible to a free user is an auth bug. Learn when dynamic filtering is worth the middleware complexity, which of the four strategies to pick, and how LangChain's built-in LLMToolSelectorMiddleware handles the hardest case automatically.

intermediate · 12 min
Tool Error Handling

LangChain gives you three mechanisms for tool error handling: ToolNode's built-in handle_tool_errors for LangGraph workflows, ToolException for per-tool control, and @wrap_tool_call middleware for cross-cutting production concerns. Knowing which to use — and how to write error messages the model can act on — is what separates a demo from a production agent.

intermediate · 12 min
Tool Design Patterns

How tool names, descriptions, schemas, and examples influence model selection accuracy. Covers the full production toolkit: namespacing, tool use examples, schema design, error surfaces, scaling with Tool RAG, token cost math, and a concrete eval methodology.

advanced · 18 min
Parallel Tool Calling

Parallel tool calling lets a model request multiple independent tools in one response instead of one at a time. This article covers when it saves you tokens and latency, when it causes race conditions, how to configure it across providers, and what production failure looks like.

advanced · 12 min
ToolNode & ToolRuntime

ToolNode is the prebuilt LangGraph node that handles tool execution — parallelism, error routing, and ToolMessage creation included. ToolRuntime is the parameter that gives any tool inside that node access to the agent's state, context, and store without those values leaking into the schema the model sees.

intermediate · 14 min
Conversation Memory

RunnableWithMessageHistory wraps any LCEL chain with per-session conversation history — but before you use it, you need to know when LangGraph is the better choice, what unbounded history costs you per request, and how to defend against the two failure modes that kill most production chatbots.

intermediate · 14 min
Managing Message History

Message history is your biggest uncontrolled cost in production agents. This article covers the decision between transient and persistent trimming, when summarization beats deletion, and the four failure modes that produce wrong answers without throwing exceptions.

intermediate · 14 min
Memory Storage Backends

BaseChatMessageHistory is the interface every storage backend implements, but picking the wrong backend — or ignoring LangGraph checkpointers entirely — will cost you in production. This article covers how to decide, configure, and operate each option.

intermediate · 12 min
Middleware (v1.0+)

Middleware are hooks that intercept every model and tool call in your agent — without touching your agent's core logic. This article teaches when to use middleware vs. callbacks or graph nodes, how execution order works, and how to stack middleware for production agents.

intermediate · 10 min
Prebuilt Middleware Catalog

LangChain ships 14 production-ready middleware classes and Deep Agents adds 2 more. This article is organized around decisions: which ones your agent needs, how to order them, and what breaks when you get it wrong.

intermediate · 14 min
Custom Middleware

Build production middleware that intercepts model calls, gates tool execution, injects dynamic context, and writes state — using node-style hooks for sequential logic and wrap-style hooks when you need control over whether and how many times an operation runs.

advanced · 16 min
create_agent

create_agent compiles a full agent runtime on LangGraph. Give it a model and tools — it handles the reasoning loop, tool dispatch, middleware, checkpointing, and stopping conditions. This article covers when to use it, what each parameter does, how the loop costs money, and how it fails.

intermediate · 12 min
System Prompt

System prompts anchor your agent's behavior across every invocation — but only add one when it earns its tokens. This article covers when to omit, how to structure for production, how to cache large prompts at ~90% cost reduction, and where dynamic prompts open injection vectors you must close.

intermediate · 12 min
Custom State

Production agents carry more than messages. Learn when to extend AgentState, how to design schemas that don't blow up your token budget, and how to avoid the serialization and state-explosion bugs that only show up after you deploy.

intermediate · 14 min
Dynamic Model Selection

Route agent turns to cheaper models when the task is simple and powerful models when it's complex — using @wrap_model_call to intercept every LLM request and swap the model based on conversation state, user tier, or cost targets. This article starts with whether you should route at all, walks through real cost math, covers the five ways routing silently fails in production, and ends with a 30-day rollout runbook.

advanced · 18 min
Streaming

Decide whether to stream, pick the right mode for your UI, ship it over HTTP with async streaming, and handle the failures that only appear in production. All patterns use create_agent with version='v2'.

advanced · 15 min
Agent Structured Output

Before wiring up ProviderStrategy or ToolStrategy, you need to know when structured output will hurt you — streaming breaks, retries compound cost, and over-constrained schemas hallucinate values. This article covers the decision, the cost math, two failure modes most tutorials skip, and the schema patterns that cut retry rates.

intermediate · 14 min
Document Loaders

How to get text out of PDFs, web pages, Notion, and 200+ other sources — and into your RAG pipeline. Covers loader selection, memory-safe loading, metadata strategy, failure modes, and the production pipeline pattern.

beginner · 12 min
Text Splitters

LangChain's text splitter API: when to split, which splitter to choose, token-based production splitting, metadata propagation, and the three failure modes that destroy RAG quality.

intermediate · 14 min
Embedding Models

Embedding models convert text into vectors for semantic search and RAG. This article covers the 2026 model landscape, cost math at scale, production patterns, and the hidden traps — especially the re-embedding trap when you switch models.

intermediate · 16 min
Vector Stores

How to choose, configure, and operate a vector store for production RAG — covering index types, cost math, failure modes, multi-tenancy, and migration strategy across FAISS, Chroma, pgvector, Qdrant, and Pinecone.

intermediate · 18 min
Retrievers

Retrievers wrap vector stores in LangChain's Runnable interface — but choosing the wrong one costs latency and money. Decision framework, cost math, and evaluation code for MultiQuery, Parent, SelfQuery, Compression, Ensemble, and custom retrievers.

intermediate · 16 min
Context Engineering in Agents

Context engineering is the discipline of curating the smallest set of high-signal tokens that maximize the probability of a good outcome. In LangChain, this means deciding what goes into every model call, how tools read and write state, and what happens between steps — using State, Store, and Runtime Context as your three levers.

advanced · 16 min
Advanced Guardrails

When to add guardrails, how to architect a cost-aware stack across all five middleware hooks, and how to know they work. Covers before_agent input filters, wrap_tool_call for tool-level security, after_agent output safety, false positive management, and guardrail evaluation.

advanced · 13 min
Runtime & Context Injection

LangChain's Runtime object is a dependency injection system for tools and middleware. Instead of reaching for globals or thread-locals, you pass per-invocation config (user ID, tenant, feature flags) through context_schema and read it anywhere via runtime.context — without exposing it to the model.

intermediate · 12 min
MCP in Production: LangChain Integration Patterns

When to use MCP vs direct tools, multi-server orchestration, interceptor composition for production, failure handling, and testing patterns for LangChain agents.

advanced · 14 min
ToolRuntime: Choosing State, Store, and Context

ToolRuntime bundles state, store, context, and streaming into a single typed parameter for tools. This article is about when to use it, which data goes where, and what breaks in production when you choose wrong.

advanced · 14 min