Designing Agent Workflows

When to use a graph instead of a chain, how to choose the right topology, how to design nodes and state for testability, and how to add human-in-the-loop gates with the current interrupt() API — a decision-first guide to LangGraph workflow design.

Quick Reference

  • Ask 'do I need a graph?' first — if no conditional logic, no cycles, no HITL gates, a chain or function is simpler
  • Map each distinct step to a graph node — keep nodes single-responsibility (one LLM call, one tool, or one routing decision)
  • Use conditional edges for decision points; use Command(goto=) when the node itself knows the next destination
  • Define your state schema upfront with TypedDict — what flows between nodes, what persists, what is ephemeral
  • Place HITL gates before irreversible actions using interrupt() inside nodes — resume with Command(resume=value)
  • Every additional LLM-calling node adds cost; checkpoint writes add 5-20ms per node — measure before adding complexity
  • Validate at three levels: compile() catches wiring errors, unit tests verify node logic, LangSmith traces verify end-to-end flow
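
As a rough illustration of the state-schema and unit-test points above, here is a minimal sketch of a single-responsibility routing node written as a plain function over a TypedDict state. The schema, field names, and keyword rules are hypothetical stand-ins (a real node would likely call an LLM); the point is only that a node which returns a partial state update can be tested without building a graph.

```python
from typing import TypedDict


class TicketState(TypedDict, total=False):
    # Hypothetical schema for illustration; field names are assumptions.
    message: str  # input, persists for the whole run
    intent: str   # written by the routing node
    reply: str    # written by a handler node


def classify_intent(state: TicketState) -> dict:
    """Routing node: reads `message`, writes only `intent`."""
    # Keyword matching stands in for an LLM classification call.
    text = state["message"].lower()
    if "refund" in text or "invoice" in text:
        return {"intent": "billing"}
    if "error" in text or "crash" in text:
        return {"intent": "tech"}
    return {"intent": "general"}


def test_classify_intent() -> None:
    # No graph, checkpointer, or LLM is needed: the node is just a function.
    assert classify_intent({"message": "Where is my refund?"}) == {"intent": "billing"}
    assert classify_intent({"message": "App crash on login"}) == {"intent": "tech"}
    assert classify_intent({"message": "Hello there"}) == {"intent": "general"}


test_classify_intent()
```

Because the node returns only the keys it changes, the same function can later be registered as a graph node unchanged, and the test keeps guarding its logic.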

Do You Even Need a Graph?

The first design decision is whether to reach for a StateGraph at all. Graphs add checkpoint overhead, state serialization cost, and debugging complexity. If your workflow has no conditional logic, no cycles, and no human-in-the-loop gates, a chain or a plain function is simpler, faster, and easier to test.
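
To make the "plain function" option concrete, the sketch below shows a linear fetch, summarize, format pipeline and a single-branch handler with no framework at all. Every function here is a hypothetical stand-in (the summarizer just keeps the first sentence instead of calling an LLM); it is meant to show the shape of the simpler alternative, not a recommended implementation.

```python
def fetch(doc_id: str) -> str:
    # Stand-in for a real data source (hypothetical).
    return f"Raw text of document {doc_id}. It has several sentences. Only the first matters here."


def summarize(text: str) -> str:
    # Stand-in for an LLM call: keep only the first sentence.
    return text.split(". ")[0] + "."


def format_output(summary: str) -> str:
    return f"Summary: {summary}"


def run_pipeline(doc_id: str) -> str:
    # Linear steps, no branching: ordinary function composition is enough.
    return format_output(summarize(fetch(doc_id)))


def handle(intent: str, message: str) -> str:
    # One conditional branch: an if/else covers it.
    if intent == "billing":
        return f"[billing] {message}"
    return f"[general] {message}"


print(run_pipeline("42"))
print(handle("billing", "I was charged twice"))
```

Nothing here needs a checkpointer or a state schema, and each step can be tested by calling it directly.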

| Your workflow has... | Use | Why | Example |
|---|---|---|---|
| Linear steps, no branching | LCEL chain | No checkpoint overhead, simpler tracing | fetch -> summarize -> format |
| One conditional branch | if/else in a function | A graph buys nothing over a 3-line conditional | if intent == 'billing': ... else: ... |
| 2+ conditional branches | StateGraph with conditional edges | Graph makes routing explicit and testable | classify -> route -> billing / tech / general |
| Cycles (agent loops) | StateGraph with recursion_limit | Cycles require explicit exit conditions | reason -> act -> observe -> (loop or exit) |
| Human approval gates | StateGraph + interrupt() + checkpointer | State must persist between invocations across the approval wait | draft -> interrupt() -> human approves -> send |
| Parallel branches that rejoin | StateGraph with fan-out/fan-in | Reducers merge concurrent state updates correctly | search docs AND check inventory simultaneously |

Resist the framework reflex

A graph earns its checkpoint overhead, state serialization cost, and debugging complexity only when branching, cycles, or persistence add real value. If a 20-line function with an if/else solves the problem, use that instead.

Real project

A team built a 7-node StateGraph for a summarization pipeline that had zero conditional logic — every node always ran in the same order. Replacing it with an LCEL chain cut code from 120 lines to 35, removed the checkpointer dependency, and reduced latency by ~40% by eliminating checkpoint writes between nodes.
