
Classification & Routing Patterns

Intent-based routing, semantic routing, multi-level classification, and fallback strategies for directing queries to specialized handlers with confidence thresholds.

Quick Reference

  • Intent classification: classify the user query into a category, then route it to that category's specialized handler
  • Semantic routing: embed the query and compare to route embeddings — the closest match wins
  • Multi-level classification: coarse category first (e.g., 'support'), then fine-grained (e.g., 'billing/refund/technical')
  • Confidence thresholds: only route when classification confidence exceeds a threshold — fall back to a general handler otherwise
  • LLM-based routing is more flexible; embedding-based routing is faster and cheaper — use both in a tiered approach
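
The semantic-routing and confidence-threshold bullets above can be sketched roughly as follows. This is a minimal illustration, not a production router: the route names, example utterances, the `general` fallback label, and the 0.5 threshold are all hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model so the example runs without external dependencies.

```python
import math
from collections import Counter

# Hypothetical routes, each described by a few example utterances.
ROUTES = {
    "billing": ["refund my payment", "charge on my invoice", "update billing card"],
    "technical": ["app crashes on login", "error when uploading file", "reset my password"],
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


# Route embeddings are computed once, up front, not per query.
ROUTE_VECTORS = {
    name: [embed(utterance) for utterance in utterances]
    for name, utterances in ROUTES.items()
}


def route(query: str, threshold: float = 0.5) -> tuple[str, float]:
    """Return (route_name, score); fall back to 'general' below the threshold."""
    q = embed(query)
    best_name, best_score = "general", 0.0
    for name, vectors in ROUTE_VECTORS.items():
        # A route's score is its closest example utterance.
        score = max(cosine(q, v) for v in vectors)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "general", best_score
    return best_name, best_score
```

With these toy vectors, `route("I want a refund")` scores below 0.5 and falls back to `general`, while lowering the threshold to 0.25 sends it to `billing`. With a real embedding model the scores would differ, so the threshold would need to be tuned on labeled traffic rather than copied from this sketch.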

Why Classification and Routing Matter

Not every query should hit the same agent or model. A simple FAQ lookup doesn't need a $0.10 Sonnet call with 5 tools; a cached response or a Haiku call is cheaper and faster. Routing classifies each incoming query and sends it to the right handler, so cost, latency, and accuracy are balanced per query type instead of being fixed by a single pipeline.

Routing Strategy              | Speed       | Cost              | Accuracy                   | Best For
Keyword matching              | < 1 ms      | Free              | Low (brittle rules)        | Known exact patterns (commands, codes)
Embedding similarity          | 5-20 ms     | ~$0.0001/query    | Medium (semantic matching) | Routing to templates, FAQ matching
LLM classification            | 500-2000 ms | $0.001-0.01/query | High (understands nuance)  | Complex routing, ambiguous queries
Multi-level (embedding → LLM) | 20-2100 ms  | Varies            | Highest                    | Production systems with diverse traffic
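
One way to picture the tiers in the table is a router that tries the cheapest strategy first and escalates only when that tier is not confident. The sketch below is illustrative only: `KEYWORD_ROUTES`, `EXAMPLES`, `llm_classify_stub`, and the 0.5 threshold are hypothetical names and values, the embedding is a toy bag-of-words vector, and the LLM tier is a placeholder function, not a real model call.

```python
import math
from collections import Counter
from typing import Callable

# Hypothetical exact-match commands (tier 1) and one example utterance per route (tier 2).
KEYWORD_ROUTES = {"/help": "help", "/status": "status"}
EXAMPLES = {
    "billing": "refund a charge on my invoice",
    "technical": "the app crashes with an error",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def llm_classify_stub(query: str) -> str:
    """Placeholder for a real LLM classification call; always answers 'general'."""
    return "general"


def tiered_route(
    query: str,
    llm_classify: Callable[[str], str] = llm_classify_stub,
    threshold: float = 0.5,
) -> tuple[str, str]:
    """Return (route_name, tier_used), escalating keyword -> embedding -> LLM."""
    # Tier 1: exact keyword match, effectively free.
    command = query.strip().lower()
    if command in KEYWORD_ROUTES:
        return KEYWORD_ROUTES[command], "keyword"

    # Tier 2: embedding similarity, accepted only above the confidence threshold.
    q = embed(query)
    name, score = max(
        ((route, cosine(q, embed(example))) for route, example in EXAMPLES.items()),
        key=lambda pair: pair[1],
    )
    if score >= threshold:
        return name, "embedding"

    # Tier 3: slow, expensive LLM classification for whatever remains.
    return llm_classify(query), "llm"
```

Returning the tier alongside the route makes it easy to log how much traffic each tier absorbs. In this sketch, `tiered_route("/help")` never leaves the keyword tier, and only queries that miss both cheaper tiers pay for the `llm_classify` call.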