LangGraph

v1.1

The execution engine for agents. State machines, persistence, human-in-the-loop, streaming, and graph-based control flow.

★ What is LangGraph?

LangGraph is a state-machine runtime for building agents with branching, cycles, persistence, and human-in-the-loop. This article helps you decide whether LangGraph fits your use case, explains the core model, and maps the 6-chapter learning path.

beginner · 8 min

State: The Agent's Brain

State is the contract between your nodes. Get the shape wrong — unbounded lists, unserializable objects, missing reducers — and checkpointing, streaming, and debugging all break at once.

intermediate · 14 min

Nodes

Nodes are the units of computation in LangGraph. This article covers every node type — function, ToolNode, Command, deferred — plus node-level caching, pre/post model hooks, failure modes, and how to test nodes in isolation.

intermediate · 18 min

Edges & Routing

LangGraph has four edge primitives — add_edge, add_conditional_edges, Command, and Send. Picking the wrong one means lost state updates, infinite loops, or fan-out patterns that silently misbehave.

intermediate · 14 min

create_react_agent → create_agent Migration

create_react_agent is deprecated in LangGraph v1. This guide walks through migrating to create_agent — what the parameter rename means, which hooks map to which middleware, and what can break in production if you don't test the migration carefully.

intermediate · 12 min

Persistence: Never Lose State

LangGraph checkpointers auto-save state after every super-step, enabling resume after crashes, human-in-the-loop pauses, and time travel. This article explains when to use persistence, which backend to choose, the write-amplification trap that breaks production, and how to monitor it before it pages you.

intermediate · 14 min

Threads: Isolation, Lifecycle & Cleanup

A LangGraph thread is an isolated state timeline keyed by a string. One compiled graph serves thousands of users simultaneously, each with their own checkpoint chain. This article covers ID design for multi-tenant apps, the full thread lifecycle from create to TTL expiry, concurrency gotchas, and production cleanup strategies.

intermediate · 14 min

Long-term Memory with Store

LangGraph Store is a cross-thread key-value database that persists facts across every conversation a user has. This article covers when to use it, how the full API works including TTL and semantic search, the hard problem of memory extraction, a production-grade load-reason-save pattern, and the five failure modes that break agents in production.

intermediate · 18 min

Time Travel

Time travel gives you a flight recorder for your agent: load any past checkpoint, replay from it, or fork with corrected state. The API is three methods — the hard part is knowing when to use them and how not to cause production incidents in the process.

advanced · 14 min

Durable Execution

LangGraph checkpoints state after every super-step, enabling crash recovery without data loss. Three durability modes — exit, async, sync — let you trade off write overhead against recovery guarantees.

intermediate · 14 min

Runtime & Context: Dependency Injection

Runtime[Context] is LangGraph's typed dependency injection system — giving every node and tool access to per-run configuration, cross-thread storage, execution metadata, and transient state through a single parameter. This article covers the full data placement decision (including UntrackedValue for non-serializable objects), the corrected subgraph propagation behavior, and the production failure modes that trip teams on first deployment.

intermediate · 16 min

Security & Lifecycle: Encryption, TTL, and Backend Selection

Decide whether you need checkpoint encryption or TTL, configure them correctly (including the actual JSON field names), harden serialization against CVE-2025-64439, choose from the real set of checkpointer backends, and plan for the failure modes that matter: key loss, silent misconfiguration, and deserialization RCE.

advanced · 14 min

Human-in-the-Loop

Use interrupt() to pause agent execution for human approval, collect input mid-run, and resume with Command. Covers the node restart trap, production checkpointers, and the five anti-patterns that silently break interrupts.

intermediate · 15 min

LangGraph Streaming v2

The StreamPart TypedDict protocol, all 7 stream modes, subgraph streaming, and the production FastAPI/SSE pattern — with the correct API, not the broken dot-notation examples most tutorials ship.

intermediate · 16 min

Pre/Post Model Hooks

The right place to intercept every LLM call is not inside your nodes. LangGraph's pre_model_hook and post_model_hook, and LangChain 1.0's AgentMiddleware, give you a composable layer for context trimming, guardrails, cost tracking, and output validation — without polluting business logic.

intermediate · 14 min

Branching & Conditional Routing

LangGraph gives you four routing primitives — add_edge, add_conditional_edges, Command, and Send — each for a distinct scenario. This article teaches you when to use which one, how to build LLM-powered routers with real cost math, and what failure modes to design against before you ship.

intermediate · 14 min

RetryPolicy, Error Taxonomy & CachePolicy

Most LangGraph error handling fails not from missing retry logic, but from misclassifying errors. This article covers how LangGraph handles errors by default, a 4-category taxonomy for routing errors to the right handler, and the three production tools — RetryPolicy, self-healing feedback loops, and CachePolicy — with their failure modes and multi-layer retry pitfalls.

advanced · 14 min

Subgraphs

Subgraphs compose graphs from smaller graphs with isolated state. Before you use one, understand the checkpoint multiplication cost, failure propagation behavior, and state boundary rules — because a flat graph is almost always simpler.

advanced · 14 min

Send API & Map-Reduce

Send() creates dynamic parallel branches at runtime. Each branch receives its own state payload rather than the full graph state. This article covers when to choose Send() over simpler alternatives, how to pair it with defer=True for correct map-reduce, the modern Command(goto=[Send(...)]) pattern, and the failure modes that appear in production before they appear in tutorials.

advanced · 14 min

Command API

Command lets nodes decide routing at runtime — atomic state update plus goto in one return value. This guide covers all four parameters (goto, update, resume, graph), type annotations that preserve graph visualization, fan-out patterns, multi-agent handoff with loop guards, and the interrupt/resume pattern for human-in-the-loop workflows.

advanced · 18 min

Deferred Nodes

Most parallel-graph bugs happen at the reduce step: the node fires after the first branch completes, not all of them. defer=True on add_node() turns any node into a synchronization barrier — it stays queued until every pending task in the run has finished, so it always sees the complete accumulated state.

advanced · 11 min

Node Caching

Cache expensive node results with CachePolicy on add_node(). This article covers cache backends for dev vs. production, custom key_func, and the pickle deserialization CVE you need to patch.

advanced · 10 min

★ Functional API

The Functional API is LangGraph's decorator-based alternative to StateGraph. It gives you the same checkpointing, streaming, and human-in-the-loop guarantees with plain Python functions — but only works for linear and branching workflows. This article covers when to use it, how it fails in production, and the one deployment gotcha that silently breaks task caching.

intermediate · 15 min

Graph API vs Functional API

LangGraph has two APIs: the Graph API (StateGraph) and the Functional API (@entrypoint + @task). Both compile to the same runtime. The choice is about debugging visibility, testability, and team collaboration — not capability. This article helps you pick the right one and avoid the failure modes that catch engineers off guard.

intermediate · 14 min

LangGraph Studio

LangGraph Studio is a browser-based IDE for debugging LangGraph agents — it visualizes your graph, lets you inspect state at every checkpoint, set breakpoints, and replay production traces locally. This article covers when to reach for it (and when not to), how to set it up in under five minutes, and the Studio v2 workflow for reproducing production failures.

intermediate · 12 min