Agent Architecture

All agent patterns in one place: single-agent (ReAct, Reflection), multi-agent (Supervisor, Swarm, A2A), workflow (Router, Orchestrator-Worker), plus system design, memory, and frontend.

How to Design an Agent System

A decision framework for choosing between chains, single agents, and multi-agent systems. Covers when not to build an agent at all, cost estimation before you write code, the six failure modes every production agent hits, model tiering strategy, and a production-shaped LangGraph reference implementation.

intermediate · 20 min
ReAct: The Core Pattern

The foundational agent loop: decide when ReAct earns its cost over simpler patterns, understand the token math behind each iteration, learn the five ways it fails in production, and ship it with create_agent, cost controls, and eval.

intermediate · 14 min
Reflection & Self-Critique

When to add a self-critique loop, what it costs, where it fails, and how to measure whether it's earning its latency. Includes conditional-edge and Command-based LangGraph implementations, production-shaped code with regression guards, and a clear comparison with Evaluator-Optimizer.

intermediate · 16 min
Plan-and-Execute: When to Plan Upfront

Plan-and-Execute (P&E) separates reasoning from acting: one LLM call decomposes the task into ordered steps, then an executor runs each step sequentially. Most tutorials stop at the mechanics. This article starts with whether you should use P&E at all, walks through the cost tradeoff (P&E is ~50% more expensive than ReAct but ~7% more accurate on complex tasks), covers the replan problem that dominates production failures, adds PEV quality gates to catch silent step drift, and ends with a model-tiered reference implementation that cuts cost ~85% versus a naive Sonnet-everywhere setup.

advanced · 18 min
Prompt Chaining: Sequential LLM Pipelines

Prompt chaining sequences focused LLM calls — each step's output becomes the next step's sole input, with gate functions between steps acting as circuit breakers. This article covers the decision framework for when to use it, the cost and latency math, what fails in production, and how to evaluate and debug chains.

intermediate · 14 min
Evaluator-Optimizer: Self-Improving Loops

Two LLMs, two roles: one generates, one judges. The Evaluator-Optimizer pattern runs a structured feedback loop until output clears a quality threshold — or a cost budget runs out. Before using it, you need to answer two questions: does your task have measurable quality criteria, and does it actually benefit from more than one attempt?

advanced · 14 min
Router: Classify and Dispatch

The Router pattern classifies input once and dispatches to specialized handlers — fast and cheap when categories are clearly separable. Most articles teach the mechanics; this one starts with whether you should build one at all, then walks through cost math, per-class eval, drift detection, the three places routing actually fails in production, and a 30-day shipping runbook.

advanced · 18 min
Orchestrator-Worker: Plan, Delegate, Synthesize

The Orchestrator-Worker pattern parallelizes complex tasks by having one LLM plan the decomposition, multiple workers execute in parallel, and the orchestrator synthesize the results. This article covers when it actually earns its 4× cost premium, how to validate plan quality before spending on workers, production hardening, and how to run an A/B eval against a single-agent baseline.

advanced · 20 min
Parallelization: Concurrent LLM Execution

Parallelization runs multiple LLMs concurrently to gain confidence (voting: same input, N opinions) or speed (sectioning: split input, parallel workers). This article starts with whether you should parallelize at all, walks through the actual cost math — 3× Haiku can be cheaper than 1× Sonnet with caching — and covers the two production failure modes most articles skip: superstep atomicity and correlated errors.

advanced · 16 min
Skills: On-Demand Specialization

Skills inject domain expertise into an agent on demand via progressive disclosure — keeping the context window lean while giving the agent access to deep knowledge across many domains. This article covers when to use skills over subagents and tools, the full SKILL.md specification, derivable context budget math, how skills fail in production, and how to evaluate skill activation.

advanced · 18 min
Multi-Agent Systems

When and why to split work across multiple agents — with cost math, a pattern-selection decision tree, and the production guardrails most overviews skip.

advanced · 11 min
Supervisor Pattern

The supervisor pattern coordinates specialized worker agents by calling them as tools. Learn when it earns its cost, how to build it with LangChain 1.0's current API, and how to evaluate whether it actually outperforms a single agent.

advanced · 18 min
Swarm & Handoffs

When peer-to-peer agent handoffs earn their complexity, what they cost per handoff depth, how they fail in production, and how to defend against each failure mode. Includes production-grade LangGraph code with checkpointer, context management strategies with cost math, and a sharpened comparison with the supervisor pattern.

advanced · 14 min
Async Subagents: Background Task Delegation

Async subagents (Deep Agents v0.5) let a supervisor delegate long-running tasks to background agents while continuing to chat with the user. This article covers the decision criteria for when async is worth the complexity, token cost math, production error handling, five concrete failure modes and their defenses, three orchestration patterns with code, and the five metrics you need to monitor before something breaks.

advanced · 16 min
A2A Protocol

Google's Agent-to-Agent (A2A) protocol — now at v1.2 with 150+ organizations in production — standardizes how agents discover, authenticate, and communicate across service boundaries using JSON-RPC over HTTP. Covers the Protocol Triangle (A2A, MCP, AG-UI), the message-based API, signed Agent Cards, LangGraph integration, and production failure modes.

advanced · 14 min
Agent Memory Systems

When to add memory to your agent, how the two-layer architecture works, what it costs in tokens and money, and the six ways it fails silently in production.

intermediate · 15 min
LangMem SDK & Store API

LangMem is an LLM-powered extraction layer that automatically identifies and persists structured facts from conversations. This article covers when to use it (and when not to), all three APIs with correct signatures, cost analysis, memory quality evaluation, failure modes, and GDPR deletion.

advanced · 20 min
Agent Prompt Design

Agent system prompts are operating contracts, not personality descriptions. This article covers how to structure them with XML tags, write tool usage rules that actually enforce behavior, defend against prompt injection, use adaptive thinking correctly, and build a prompt evaluation harness that gates every change.

intermediate · 18 min
Writing Effective Tool Descriptions

Tool descriptions are serialized into every request as the model's only guide for tool selection. This article covers the full anatomy of a production-grade description, the token cost math, disambiguation patterns, schema enforcement with strict mode, and how to measure and debug tool selection quality.

intermediate · 16 min
Few-Shot Examples for Agent Tasks

When to use few-shot examples for agents, how to build trajectory examples that teach reasoning patterns, static vs dynamic selection with real cost math, and measuring whether examples actually help.

intermediate · 15 min
Prompt Versioning & A/B Testing

A decision-first guide to managing prompts in production: when to build a registry, how to choose between LangSmith, Langfuse, and LaunchDarkly, how to gate promotions with an eval suite, and how to run statistically rigorous A/B tests instead of guessing.

advanced · 18 min
Designing Agent Workflows

When to use a graph instead of a chain, how to choose the right topology, how to design nodes and state for testability, and how to add human-in-the-loop gates with the current interrupt() API — a decision-first guide to LangGraph workflow design.

intermediate · 18 min
State Design Patterns

How to structure LangGraph state for parallel execution, API safety, and long-running workflows — including schema separation, reducers, the Command API, state explosion mitigation, checkpointing, and debugging with time travel.

advanced · 18 min
Designing for Failure

How to architect agents that don't crash: classify errors before handling them, design timeout budgets at every level, validate state at node boundaries, and verify failure paths with fault injection.

advanced · 17 min
Context Window Management

When to manage context, how context rot degrades agents before you hit any limit, and the full strategy stack — server-side compaction, context editing, trimming, and summarization — with cost math and production failure modes.

advanced · 18 min
Application Structure

How to decide when to structure a LangGraph project, how to separate graph topology from node logic from tools, how to navigate the monorepo vs polyrepo decision, what breaks when you get structure wrong, and how the layout evolves as your agent count grows.

intermediate · 14 min
Classification & Routing Patterns

Query routing is the highest-leverage optimization in an agent system — it determines which model, which tools, and how much context each query gets. This article covers the three routing strategies (keyword, embedding, LLM), how to cascade them in production, how to evaluate and monitor router accuracy, and how to defend against the failure modes that will bite you.

advanced · 18 min
Graceful Degradation

How to design AI agents that always return something useful — even when LLM APIs fail, rate limits hit, or traffic spikes. Covers fallback chains, circuit breakers, semantic degradation detection, and progressive load shedding.

advanced · 18 min
Generative UI

Three paradigms for rendering agent output as interactive UI: Static component registries, Declarative agent-described interfaces (A2UI), and Open-Ended agent-generated surfaces. Covers Vercel AI SDK and LangGraph integration, security trust boundaries, error recovery, and testing strategies.

advanced · 16 min
useStream: Streaming Agent State to React

The useStream React hook connects your UI to a LangGraph agent with real-time streaming — messages, tool progress, interrupts, branching, subagent output, and reconnection. Works with any LangGraph backend via apiUrl or custom transport.

intermediate · 14 min
OpenAI Function Calling & Responses API

OpenAI's native function calling lets the model invoke your code with structured arguments — no framework required. This article covers when to drop the abstraction layer, how to use the Responses API (the recommended path for new projects), and what breaks in production when tool loops run unchecked.

intermediate · 14 min
Anthropic Tool Use & Adaptive Thinking

How to build production tool-using agents with the Anthropic SDK: tool definitions with strict mode and input examples, the five tool_choice modes and their interaction with adaptive thinking, server tools, model tier selection with cost math, and the checklist that prevents the most common agent failures.

intermediate · 20 min
Vercel AI SDK: From Chat UI to Agent Loop

Building production AI applications with Vercel AI SDK 6: the streaming architecture from React hooks to API routes, ToolLoopAgent for agentic workflows with cost controls, structured output with generateObject, provider switching with model tiering, and the failure modes you need to handle before shipping.

intermediate · 18 min
Agent Framework Comparison

A decision guide for choosing among LangGraph, CrewAI, AG2, OpenAI Agents SDK, Google ADK, Mastra, Vercel AI SDK, and direct API calls, structured around cost, lock-in risk, failure modes, and concrete trade-offs rather than feature lists.

intermediate · 18 min
The No-Framework Agent

How to build a production-ready agent with nothing but the Anthropic SDK and a while loop — and when that's still the right choice. Covers the full manual agent loop, token cost math, streaming, failure modes, testing, and the graduation path from manual loop to tool_runner to Agent SDK to LangGraph.

advanced · 18 min
Long-Running Agents

When to build an agent that runs for hours instead of seconds — which orchestration framework to choose, how to compute real costs, the five ways long-running agents fail in production, and a reference implementation with checkpointing, error classification, idempotency, and budget enforcement.

advanced · 18 min
Browser Agents

Building AI agents that navigate and interact with websites: Playwright + LLM for web tasks, page understanding strategies, action spaces, and error recovery patterns.

advanced · 11 min
Computer Use Agents

When to use computer use versus API automation, the screenshot-analyze-act loop with the current computer_20251124 tool, real cost math that shows context growth dominates price, Docker and ephemeral VM sandboxing with prompt injection defense, verification and stuck detection, production failure modes, and a reference implementation using the latest Anthropic API.

advanced · 18 min
Code Execution Agents

Build agents that generate, execute, and iterate on code safely. Covers managed sandboxes (Claude's native code execution tool, E2B), self-hosted Docker, the security gap between 'code ran' and 'answer is correct', and cost math for each option.

advanced · 18 min
Agent Supervision & Safety

Building supervision layers for autonomous agents: kill switches, permission systems, human approval gates, monitoring dashboards, and complete audit logging for post-mortem analysis.

advanced · 10 min
Project: Build a Research Agent

End-to-end walkthrough: build a multi-source research agent with planning, parallel web search, subagent delegation, filesystem persistence, and report synthesis using Deep Agents.

advanced · 15 min
Project: Build a Customer Support Agent

End-to-end walkthrough: build a customer support agent with query routing, RAG knowledge base, tool-calling for account actions, human-in-the-loop escalation, and multi-tenant auth.

advanced · 14 min
Project: Build a RAG Q&A System

End-to-end walkthrough: build a production RAG system with ingestion pipeline, hybrid search, self-corrective retrieval, answer validation, and continuous evaluation.

advanced · 14 min
Project: Build a Code Review Agent

End-to-end walkthrough: build a code review agent with Deep Agents + sandbox for safe code analysis, project-specific skills, parallel file review, and GitHub integration.

advanced · 13 min
Agents for Legal

How to build production agents for contract analysis, compliance checking, legal research, and document review — with the guardrails that regulated environments demand.

advanced · 14 min
Agents for Finance

Building production agents for financial research, risk assessment, portfolio analysis, and report generation — with the numerical accuracy, audit trails, and regulatory compliance that finance demands.

advanced · 15 min
Agents for Healthcare

Building production agents for clinical decision support, patient documentation, and medical Q&A — with HIPAA compliance, safety guardrails, and the principle that AI assists clinicians but never replaces clinical judgment.

advanced · 14 min
Agents for Customer Support

Design patterns for production customer support agents: multi-tier routing, RAG knowledge bases, account action tools, HITL escalation, session memory, and satisfaction tracking.

intermediate · 9 min