Advanced14 min
Send API & Map-Reduce
Send() creates dynamic parallel branches at runtime — each gets an isolated state copy, not the full graph. This article covers when to choose Send() over simpler alternatives, how to pair it with @defer for correct map-reduce, the modern Command(goto=[Send(...)]) pattern, and the failure modes that appear in production before they appear in tutorials.
Quick Reference
- →Send(node_name, arg): spawn an isolated execution of node_name with arg as its entire input state — arg is NOT merged into the parent graph state
- →Return a list of Send() objects from a conditional edge function to fan out to N parallel branches at runtime
- →Each Send() branch gets only its arg dict — branches cannot read each other's state or parent state keys during execution
- →Reducer (e.g., Annotated[list, operator.add]) on the parent state key collects each branch's return value back into the parent
- →@defer on the reduce node guarantees it waits for ALL branches to complete — without it, the reducer may fire after the first branch finishes
- →Command(goto=[Send(...)]) is the modern edgeless pattern — dispatch Send from inside a node, no conditional edge function needed
- →Batch: group N items into M groups, Send() M times — controls parallelism without overwhelming your API rate limit
- →Cost math: N Send() branches × cost-per-LLM-call. Wall-clock time = max(branch durations), not sum.
Should I Use Send()?
Choose Send() when branch count is dynamic and per-branch checkpointing or retry matters
| Approach | When to use | Trade-off |
|---|---|---|
| for loop (sequential) | < 5 items, or order matters between iterations | Simple. No parallelism. Wall-clock time = sum of all calls. |
| asyncio.gather inside a node | Parallel LLM calls where checkpointing is not needed | Fast, low overhead. No per-branch retry or LangSmith branch tracing. |
| Static parallel edges | Fixed number of branches known at graph build time | Explicit, visible in graph visualization. Branch count cannot vary at runtime. |
| Send() | Branch count determined at runtime; each branch needs isolated state; per-branch checkpointing or retry matters | Full graph integration: superstep-checkpointed, retryable, observable in LangSmith. Higher scheduling overhead than asyncio.gather. |
Start with asyncio.gather — graduate to Send() when you need graph-level guarantees
asyncio.gather inside a single node is simpler and faster for parallel LLM calls that don't need per-branch checkpointing or retry. Upgrade to Send() when the branch count varies at runtime, you need each branch's result independently checkpointed, or you want LangSmith branch-level tracing.