★ Overview · Advanced · 11 min

Multi-Agent Systems

When and why to split work across multiple agents — with cost math, a pattern-selection decision tree, and the production guardrails most overviews skip.

Quick Reference

→Multi-agent = multiple specialized agents collaborating on a task, each with its own tools and prompt
→Four patterns: supervisor (centralized), swarm (peer handoffs), async subagents (background tasks), A2A (cross-service protocol)
→Agents communicate through shared state in LangGraph — the state schema is the API contract between agents
→Iteration limits: set recursion_limit in the run config, e.g. invoke(inputs, config={"recursion_limit": N}); max_rounds does not exist on create_supervisor
→pre_model_hook trims context before each supervisor LLM call; post_model_hook adds guardrails after
→parallel_tool_calls=True lets the supervisor delegate to multiple agents simultaneously
→Multi-agent multiplies token cost — the supervisor re-reads full history every round. Run the numbers before splitting.
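
To illustrate the "state schema is the API contract" point, here is a minimal sketch in plain Python. It deliberately does not use LangGraph itself. The names `TeamState`, `research_agent`, `writer_agent`, and `run_pipeline` are hypothetical, and the agents are stand-ins for LLM calls. It only shows the idea that each agent reads and writes declared fields of a shared state and returns a partial update.

```python
from typing import TypedDict


class TeamState(TypedDict, total=False):
    """Shared state: the only channel the two agents use to communicate."""
    query: str                 # written by the caller
    research_notes: list[str]  # written by the researcher, read by the writer
    draft: str                 # written by the writer


def research_agent(state: TeamState) -> TeamState:
    # Stand-in for an LLM call: reads `query`, returns a partial update.
    return {"research_notes": [f"note about {state['query']}"]}


def writer_agent(state: TeamState) -> TeamState:
    # Depends only on fields declared in the schema, never on the
    # researcher's internals.
    notes = state.get("research_notes", [])
    return {"draft": f"{state['query']}: " + "; ".join(notes)}


def run_pipeline(query: str) -> TeamState:
    state: TeamState = {"query": query}
    for agent in (research_agent, writer_agent):
        # Merge each agent's partial update into the shared state.
        state = {**state, **agent(state)}
    return state


result = run_pipeline("vector databases")
print(result["draft"])
```

Because the writer only touches `research_notes` and `query`, the researcher can be rewritten or swapped out freely as long as it keeps filling the same field. That is the sense in which the schema acts as a contract.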

Do You Actually Need Multi-Agent?

Earn the complexity first

Multi-agent adds latency, cost, and debugging difficulty. Start with a single well-prompted agent. Only split when you can measure what the split improves.

Multi-agent is a scaling strategy, not a default. Before splitting, establish a baseline: run your single agent on a representative eval set, measure the task success rate, and profile which failures are caused by tool confusion, which by prompt scope, and which by model capability. If failures cluster around two genuinely different skill domains, splitting makes sense. If they cluster around prompt quality or tool descriptions, fix those first.
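
A baseline profile of this kind can be as simple as a tally. The sketch below assumes a hypothetical eval log in which each failed task has been hand-labeled with a `cause` and a `domain`. The record shape, the labels, and the sample data are all illustrative assumptions, not output from a real run.

```python
from collections import Counter

# Hypothetical eval log: one record per task, failures labeled by hand.
EVAL_RESULTS = [
    {"task": "t01", "passed": True,  "cause": None,             "domain": "sql"},
    {"task": "t02", "passed": True,  "cause": None,             "domain": "writing"},
    {"task": "t03", "passed": False, "cause": "tool_confusion", "domain": "sql"},
    {"task": "t04", "passed": True,  "cause": None,             "domain": "writing"},
    {"task": "t05", "passed": False, "cause": "tool_confusion", "domain": "sql"},
    {"task": "t06", "passed": True,  "cause": None,             "domain": "sql"},
    {"task": "t07", "passed": False, "cause": "prompt_scope",   "domain": "writing"},
    {"task": "t08", "passed": True,  "cause": None,             "domain": "writing"},
    {"task": "t09", "passed": False, "cause": "tool_confusion", "domain": "sql"},
    {"task": "t10", "passed": True,  "cause": None,             "domain": "sql"},
]


def profile_failures(results):
    """Return (success_rate, failures by cause, failures by domain)."""
    if not results:
        raise ValueError("eval set is empty")
    passed = sum(1 for r in results if r["passed"])
    failures = [r for r in results if not r["passed"]]
    by_cause = Counter(r["cause"] for r in failures)
    by_domain = Counter(r["domain"] for r in failures)
    return passed / len(results), by_cause, by_domain


success_rate, by_cause, by_domain = profile_failures(EVAL_RESULTS)
print(f"success rate: {success_rate:.0%}")
print("failures by cause:", by_cause.most_common())
print("failures by domain:", by_domain.most_common())
```

The tally does not make the decision for you. It gives you the two views the paragraph above asks for: whether failures share a cause you can fix in the prompt or tool descriptions, and whether they concentrate in one skill domain.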

Signal	What it looks like	Action
Tool selection errors	Agent picks the wrong tool as the tool list grows	Split by domain — each agent gets only the tools it needs
Prompt covers too many roles	System prompt trying to handle research, writing, and review simultaneously	Each agent gets a focused system prompt for one role
Independent parallel subtasks	Subtasks don't depend on each other and could run concurrently	Use the Send API for parallel fan-out within a single agent, or split into multiple agents
Different trust / access levels	Some tools need DB write access, others should be sandboxed	Isolate sensitive tools to a restricted agent with tighter guardrails
Independent deployment cycles	Different teams own different parts of the workflow	Separate agents allow independent versioning and deployment

The threshold is your eval set, not a tool count

You'll read 'split when you have more than 10-15 tools.' That number isn't cited anywhere. The real threshold is when your eval set shows routing or selection failures that a focused prompt can't fix. Measure first.

The Cost of Coordination

Multi-agent doesn't just add latency; it multiplies token cost. In a supervisor loop, the supervisor re-reads the full message history before every routing decision. By its fourth routing decision, the supervisor's input is the original query plus three sets of agent outputs, and the first of those sets has already been read three times. Each round costs more than the last.
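
To make the growth concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model: every agent output has the same size, and the supervisor's input at each routing decision is the original query plus all agent outputs so far. System prompts, the supervisor's own replies, and the agents' own token usage are ignored, and the token counts (200 and 1,000) are made-up illustrative numbers.

```python
def supervisor_input_per_decision(query_tokens, agent_output_tokens, decisions):
    """Input tokens the supervisor reads at each routing decision.

    Decision 1 sees only the query; decision k sees the query plus
    the k-1 agent outputs produced so far.
    """
    return [query_tokens + k * agent_output_tokens for k in range(decisions)]


def cumulative_supervisor_input(query_tokens, agent_output_tokens, decisions):
    """Total input tokens across all routing decisions."""
    return sum(
        supervisor_input_per_decision(query_tokens, agent_output_tokens, decisions)
    )


# Illustrative numbers only: a 200-token query, 1,000 tokens per agent output.
per_decision = supervisor_input_per_decision(200, 1000, 4)
total = cumulative_supervisor_input(200, 1000, 4)
print(per_decision)  # [200, 1200, 2200, 3200]
print(total)         # 6800
```

Under these assumptions the total is n·Q + A·n(n−1)/2 for n decisions, query size Q, and output size A, so supervisor input grows quadratically with the number of rounds. Here the three agent outputs total 3,000 tokens, but the supervisor pays for 6,000 tokens of agent output across its four decisions.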

Choosing a Pattern

There are four multi-agent patterns in the LangGraph ecosystem. The decision tree below maps each pattern to the scenario that warrants it. Default to supervisor — it gives you a single audit trail, predictable routing, and the simplest debugging path.
