Router: Classify and Dispatch
The Router pattern classifies input once and dispatches to specialized handlers — fast and cheap when categories are clearly separable. Most articles teach the mechanics; this one starts with whether you should build one at all, then walks through cost math, per-class eval, drift detection, the three places routing actually fails in production, and a 30-day shipping runbook.
Quick Reference
- A router earns its keep only when handlers are meaningfully different AND no single class exceeds ~60% of traffic
- Top-line accuracy hides per-class collapse: measure recall and precision per class, not overall
- The cheapest optimization is a 100-example golden set you actually run, not a better prompt
- Use Haiku for the classifier, Sonnet for the handlers: model tiering cuts router cost ~3x
- Self-reported confidence is imperfect but beats no threshold; set per-class min recall as a CI gate
- Drift is a taxonomy failure, not a classifier failure: alert on route distribution, not just accuracy
- Hybrid (rules → LLM fallback) + Anthropic prompt caching is the production default
Should I Use a Router at All?
Classify once → dispatch to the right handler → no iteration, no coordination
Most articles open with how to build a router. Start one step earlier: should you build one at all? A router adds a hop, a classification call, and a permanent maintenance surface: the category taxonomy, the eval, the drift monitor, and the fallback. Often that overhead doesn't earn its keep.
The Router pattern uses an LLM (or deterministic logic) to classify an input into one of several categories, then dispatches it to a specialized handler. Unlike a Supervisor, the Router makes a single routing decision — it does not iterate, coordinate, or synthesize results across handlers.
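The classify-then-dispatch shape described above can be sketched in a few lines. This is a minimal illustration, not a production design: the category names (`billing`, `technical`, `general`), the keyword rules, and the handler functions are all hypothetical, and the deterministic `classify` function stands in for what could equally be a single LLM call.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def classify(text: str) -> str:
    """Hypothetical stand-in classifier using keyword rules.

    A real router might make one LLM call here instead; either way
    the decision is made exactly once per input.
    """
    lowered = text.lower()
    if "refund" in lowered or "invoice" in lowered:
        return "billing"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    return "general"


# Hypothetical handlers; each would wrap its own prompt, tools, or model tier.
def handle_billing(text: str) -> str:
    return f"[billing] {text}"


def handle_technical(text: str) -> str:
    return f"[technical] {text}"


def handle_general(text: str) -> str:
    return f"[general] {text}"


HANDLERS: Dict[str, Handler] = {
    "billing": handle_billing,
    "technical": handle_technical,
    "general": handle_general,
}


def route(text: str) -> str:
    """Classify once, dispatch once, return the handler's result directly."""
    label = classify(text)
    # Unknown labels fall back to the general handler rather than failing.
    handler = HANDLERS.get(label, handle_general)
    return handler(text)
```

Note what `route` does not do: it never retries, never calls a second handler, and never combines results. Once the code needs any of those, it has stopped being a Router.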
| When the router is mostly tax | Why it fails to earn its keep | Do this instead |
|---|---|---|
| One handler takes >70% of traffic | Most calls pay routing cost for no real choice | Send everything to that handler; let it escalate the rest |
| Handlers use the same model + similar prompts | You're choosing between near-identical paths | Use a single handler with a `mode` parameter in its system prompt |
| Categories overlap or are fuzzy | Misclassification is common; fallback fires constantly | Use Parallelization (run several handlers, pick best) or Multi-label routing |
| Tasks need iteration or cross-handler coordination | Router is one-shot; you'll bolt on retry/replan logic | Use a Supervisor pattern from day one |
| Aspect | Router | Supervisor | Orchestrator-Worker |
|---|---|---|---|
| LLM calls for routing | 1 | 1+ per iteration | 1+ (planning) |
| Coordination | None | Iterative | Plan-based |
| Result synthesis | None (handler returns directly) | Supervisor combines | Orchestrator combines |
| Latency | Low (single hop) | Higher (multi-turn) | Higher (plan + execute) |
| Failure mode | Silent misroute as categories drift | Loops, infinite delegation | Partial-failure aggregation |
| Best for | Clearly separable tasks | Tasks needing judgment | Complex multi-step tasks |
Router cost per query is small, usually under 25% of total inference cost. The real cost is *complexity*: owning a taxonomy, an eval harness, a fallback, and a drift monitor for as long as the router lives. Don't pay that cost without a reason. The Router earns its keep when handlers are meaningfully different (different tools, model tiers, or SLAs) AND traffic is meaningfully split (no single class above ~60%).
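The two-part test above (different handlers AND split traffic) is easy to check against a sample of labeled traffic before writing any routing code. The sketch below assumes you already have a list of category labels for recent requests. The function names and the `handlers_differ` flag are hypothetical, and the 0.60 default mirrors the rough threshold in the text rather than a hard rule.

```python
from collections import Counter
from typing import Iterable


def max_class_share(labels: Iterable[str]) -> float:
    """Return the fraction of traffic taken by the most common class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one labeled request")
    return max(counts.values()) / total


def router_earns_keep(
    labels: Iterable[str],
    handlers_differ: bool,
    max_share: float = 0.60,  # assumption: the ~60% rule of thumb from the text
) -> bool:
    """Apply both conditions: handlers differ AND no class dominates traffic."""
    return handlers_differ and max_class_share(labels) <= max_share


# Hypothetical traffic samples.
balanced = ["billing"] * 50 + ["technical"] * 30 + ["general"] * 20
dominated = ["general"] * 80 + ["billing"] * 20
```

With these samples, `balanced` passes the traffic test (largest class is half of traffic) while `dominated` does not (one class takes 80%), which is the case where sending everything to the dominant handler and letting it escalate is likely the simpler choice.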