Overview · Intermediate · 20 min

How to Design an Agent System

A decision framework for choosing between chains, single agents, and multi-agent systems. Covers when not to build an agent at all, cost estimation before you write code, the six failure modes every production agent hits, model tiering strategy, and a production-shaped LangGraph reference implementation.

Quick Reference

  • If a human can write a fixed checklist for the task, use a chain — not an agent
  • Start with a chain; promote to an agent only when the LLM must choose tools at runtime
  • Keep tools under 8-10 per agent — selection accuracy degrades sharply beyond that
  • Estimate per-query cost before building: (input_tokens × input_price + output_tokens × output_price) × avg_calls
  • Prototype with Claude Opus 4.7 to establish the quality ceiling; ship with Sonnet 4.6
  • Set max_iterations (5-15) and a cost ceiling to prevent runaway loops
  • Build 50-100 hand-labeled eval cases before your second prompt iteration
  • Instrument token usage, iteration count, error rate, and latency p95 from day one
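
The cost formula and the loop limits in the list above can be sketched together. This is a minimal illustration, not a reference implementation: `Pricing`, `RunBudget`, and `BudgetExceeded` are hypothetical names, and the prices and the $0.50 ceiling are placeholders to replace with your provider's rates and your own budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-token prices in USD; substitute your provider's rates."""
    input_per_token: float
    output_per_token: float


def estimate_query_cost(input_tokens: int, output_tokens: int,
                        avg_calls: float, pricing: Pricing) -> float:
    """(input_tokens x input price + output_tokens x output price) x avg_calls."""
    per_call = (input_tokens * pricing.input_per_token
                + output_tokens * pricing.output_per_token)
    return per_call * avg_calls


class BudgetExceeded(RuntimeError):
    """Raised when a run passes its iteration or cost limit."""


@dataclass
class RunBudget:
    max_iterations: int = 10         # the list above suggests 5-15
    cost_ceiling_usd: float = 0.50   # placeholder ceiling, not a recommendation
    iterations: int = 0
    spent_usd: float = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               pricing: Pricing) -> None:
        """Record one LLM call and raise if either limit is now exceeded."""
        self.iterations += 1
        self.spent_usd += estimate_query_cost(
            input_tokens, output_tokens, 1, pricing)
        if self.iterations > self.max_iterations:
            raise BudgetExceeded(f"exceeded {self.max_iterations} iterations")
        if self.spent_usd > self.cost_ceiling_usd:
            raise BudgetExceeded(f"exceeded ${self.cost_ceiling_usd:.2f} ceiling")


if __name__ == "__main__":
    # Assumed example rates: $3 per million input tokens, $15 per million output.
    pricing = Pricing(input_per_token=3e-6, output_per_token=15e-6)
    print(f"${estimate_query_cost(2000, 500, 4, pricing):.4f} per query")
```

Calling `budget.charge(...)` after every LLM call inside the agent loop turns a runaway loop into an explicit exception that can be logged and counted.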

When NOT to Build an Agent

The most important design decision is the one you don't make. Most tasks that feel like they need an agent can be solved with a chain, a single LLM call, or no LLM at all. An agent adds latency (2-8 LLM calls vs 1), cost (5-15x a chain), and debugging surface. That tax must be justified by genuinely dynamic behavior — not because agents feel more impressive.

| Task shape | Example | Right tool | Why not an agent |
|---|---|---|---|
| Extract structured data from text | Parse name, email, company from a business card | Single LLM call with structured output | The steps are always the same — one extraction call |
| Fixed pipeline with known stages | Translate → summarize → format → post | Chain | Every input follows the same path; no runtime branching needed |
| Classify into one of N categories | Route a support ticket to billing / technical / general | Router (one LLM call) | Classification is a single structured output, not a tool-calling loop |
| Retrieval + answer (RAG) | Answer a question from your documentation | Chain (retrieve → generate) | The steps are fixed; the LLM doesn't decide which tools to call |
| Dynamic tool selection with judgment | Research a company and write a personalized sales email | Single agent | The LLM genuinely needs to decide which searches to run and in what order |
| Multi-domain coordination | Route billing AND engineering issues, each needing domain expertise | Multi-agent (supervisor pattern) | Two distinct context sets that don't fit cleanly in one agent's system prompt |
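
The table above reduces to a few questions asked in order. The sketch below encodes that ordering as a rough first-pass filter; `Architecture` and `choose_architecture` are hypothetical names, and the three inputs are a simplification of the table, not a complete decision procedure.

```python
from enum import Enum


class Architecture(Enum):
    SINGLE_CALL = "single LLM call"
    CHAIN = "chain"
    SINGLE_AGENT = "single agent"
    MULTI_AGENT = "multi-agent (supervisor pattern)"


def choose_architecture(steps_fixed: bool, num_steps: int,
                        distinct_domains: int) -> Architecture:
    """Pick the simplest option that fits the task shape.

    steps_fixed: a human could write a fixed checklist for the task.
    num_steps: number of stages in that checklist (ignored if not fixed).
    distinct_domains: context sets that do not fit in one system prompt.
    """
    if steps_fixed:
        # Extraction and routing are one call; longer fixed paths are chains.
        return Architecture.SINGLE_CALL if num_steps <= 1 else Architecture.CHAIN
    if distinct_domains > 1:
        return Architecture.MULTI_AGENT
    return Architecture.SINGLE_AGENT


if __name__ == "__main__":
    # RAG: retrieve, then generate. Two fixed steps, one domain.
    print(choose_architecture(steps_fixed=True, num_steps=2, distinct_domains=1))
```

The fixed-steps check comes first on purpose: a task that passes it never reaches the agent branches.
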
Real project

A payments team built a multi-agent system to process vendor invoices: an extraction agent, a validation agent, and an approval agent. After two weeks of debugging coordination failures, they realized every invoice followed the same 3-step path — extract fields, validate against PO, write to ledger. A deterministic chain handled all of it in 300ms at $0.003/invoice. The multi-agent system averaged 4 seconds and $0.18. The dynamic behavior they thought they needed was two if/else branches.
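
A deterministic chain of that shape might look like the sketch below. The field names, the two rejection branches, and the in-memory `ledger` list are assumptions made for illustration; the source does not say what the team's actual branches or storage were, and `extract_fields` stands in for whatever extraction call a real pipeline would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    vendor: str
    po_number: str
    amount: float


def extract_fields(raw: dict) -> Invoice:
    """Step 1: extract fields (placeholder for a structured-output LLM call)."""
    return Invoice(raw["vendor"], raw["po_number"], float(raw["amount"]))


def process_invoice(raw: dict, purchase_orders: dict[str, float],
                    ledger: list[Invoice]) -> str:
    """Run the fixed path: extract, validate against the PO, write to ledger."""
    invoice = extract_fields(raw)

    # Step 2: validate against the purchase order (two hypothetical branches).
    expected = purchase_orders.get(invoice.po_number)
    if expected is None:
        return "rejected: unknown PO"
    if abs(expected - invoice.amount) > 0.005:
        return "rejected: amount mismatch"

    # Step 3: write to the ledger (an in-memory list in this sketch).
    ledger.append(invoice)
    return "posted"


if __name__ == "__main__":
    ledger: list[Invoice] = []
    orders = {"PO-1001": 250.00}
    raw = {"vendor": "Acme", "po_number": "PO-1001", "amount": "250.00"}
    print(process_invoice(raw, orders, ledger))
```

Every input takes the same path, so there is no tool-selection loop to debug and no coordination between agents to fail.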

Learn this in → prompt-chaining

The agent tax is real

At 10K queries/day, the difference between a chain ($0.003/query) and a single agent ($0.05/query) is $30/day vs $500/day, or roughly $11K vs $183K annually. That gap must be justified by the business value the dynamic behavior delivers. If it can't be, use the chain.
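
The projection is simple enough to rerun with your own numbers. The sketch below assumes a flat per-query cost and a 365-day year; `project_costs` is a hypothetical helper, and the inputs are the per-query figures and volume quoted above.

```python
def project_costs(cost_per_query: float, queries_per_day: int,
                  days_per_year: int = 365) -> tuple[float, float]:
    """Return (daily_cost, annual_cost) in USD for a flat per-query cost."""
    daily = cost_per_query * queries_per_day
    return daily, daily * days_per_year


if __name__ == "__main__":
    volume = 10_000  # queries per day
    for label, per_query in (("chain", 0.003), ("single agent", 0.05)):
        daily, annual = project_costs(per_query, volume)
        print(f"{label}: ${daily:,.0f}/day, ${annual:,.0f}/year")
```

Swapping in a measured per-query cost from a prototype gives a more honest number than any estimate made before the first traces exist.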