Multi-Agent Systems
When and why to split work across multiple agents — with cost math, a pattern-selection decision tree, and the production guardrails most overviews skip.
Quick Reference
- Multi-agent = multiple specialized agents collaborating on a task, each with its own tools and prompt
- Four patterns: supervisor (centralized), swarm (peer handoffs), async subagents (background tasks), A2A (cross-service protocol)
- In LangGraph, agents communicate through shared state; the state schema is the API contract between agents
- Iteration limits: set recursion_limit on .compile() or in the invoke config; max_rounds does not exist on create_supervisor
- pre_model_hook trims context before each supervisor LLM call; post_model_hook adds guardrails after
- parallel_tool_calls=True lets the supervisor delegate to multiple agents simultaneously
- Multi-agent multiplies token cost because the supervisor re-reads the full history every round. Run the numbers before splitting.
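To make "run the numbers" concrete, here is a minimal sketch of the supervisor's input-token bill. It assumes a deliberately simplified model that does not come from any library: the supervisor starts with a fixed prompt, every round appends a fixed number of tokens to the history, and the full history is re-read each round. The function name and the token figures are hypothetical; substitute measurements from your own traces.

```python
def supervisor_input_tokens(base_tokens: int, tokens_added_per_round: int, rounds: int) -> int:
    """Total input tokens read by a supervisor that re-reads the full history each round.

    Assumed model (illustrative only): the history starts at `base_tokens`
    and grows by `tokens_added_per_round` after every round.
    """
    total = 0
    history = base_tokens
    for _ in range(rounds):
        total += history                   # the supervisor reads everything so far
        history += tokens_added_per_round  # worker output is appended to the history
    return total


# Hypothetical figures: 2,000-token starting prompt, 1,500 tokens added per round.
five_rounds = supervisor_input_tokens(2000, 1500, 5)
ten_rounds = supervisor_input_tokens(2000, 1500, 10)

print(five_rounds)               # 25000
print(ten_rounds)                # 87500
print(ten_rounds / five_rounds)  # 3.5
```

Under these assumptions, doubling the number of rounds makes the supervisor's input cost 3.5 times larger, not 2 times, because the total grows with the square of the round count. Worker-agent tokens and output tokens come on top of this figure.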
Do You Actually Need Multi-Agent?
Multi-agent adds latency, cost, and debugging difficulty. Start with a single well-prompted agent. Only split when you can measure what the split improves.
Multi-agent is a scaling strategy, not a default. Before splitting, establish a baseline: run your single agent on a representative eval set, measure the task success rate, and profile whether each failure is caused by tool confusion, prompt scope, or model capability. If failures cluster around two genuinely different skill domains, splitting makes sense. If they cluster around prompt quality or tool descriptions, fix those first.
| Signal | What it looks like | Action |
|---|---|---|
| Tool selection errors | Agent picks the wrong tool as the tool list grows | Split by domain — each agent gets only the tools it needs |
| Prompt covers too many roles | System prompt trying to handle research, writing, and review simultaneously | Each agent gets a focused system prompt for one role |
| Independent parallel subtasks | Subtasks don't depend on each other and could run concurrently | Use the Send API for parallel fan-out within a single agent, or split into multiple agents |
| Different trust / access levels | Some tools need DB write access, others should be sandboxed | Isolate sensitive tools to a restricted agent with tighter guardrails |
| Independent deployment cycles | Different teams own different parts of the workflow | Separate agents allow independent versioning and deployment |
You'll read advice like 'split when you have more than 10-15 tools.' That number rarely comes with a source. The real threshold is the point at which your eval set shows routing or selection failures that a focused prompt can't fix. Measure first.
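One way to "measure first" is to tally the baseline eval run by failure cause. The sketch below assumes each eval result has already been labeled, by hand or by a grader, with a cause; the field names `passed` and `failure_cause`, the cause labels, and the sample data are all hypothetical and not part of any framework.

```python
from collections import Counter


def profile_failures(results):
    """Return (success_rate, Counter of failure causes) for a labeled eval run.

    `results` is a list of dicts with a boolean `passed` and, for failures,
    a `failure_cause` label. Both field names are illustrative assumptions.
    """
    total = len(results)
    failures = [r for r in results if not r["passed"]]
    success_rate = (total - len(failures)) / total if total else 0.0
    causes = Counter(r["failure_cause"] for r in failures)
    return success_rate, causes


# Hypothetical baseline run of a single agent on a 10-task eval set.
results = (
    [{"passed": True, "failure_cause": None}] * 6
    + [{"passed": False, "failure_cause": "tool_selection"}] * 3
    + [{"passed": False, "failure_cause": "prompt_scope"}]
)

success_rate, causes = profile_failures(results)
print(success_rate)           # 0.6
print(causes.most_common(1))  # [('tool_selection', 3)]
```

In this made-up run, tool selection dominates the failures, which is the kind of signal that justifies trying a split by domain. Had the failures clustered around prompt scope, tightening the prompt would be the cheaper first step.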