Advanced · 14 min

Input & Output Validation

Production agents have four validation gates: input, tool args, tool results, and output. Miss any one and bad data silently crosses a trust boundary. This guide walks through the decisions in order: when to validate, what each gate checks, how to wire it correctly in LangGraph, and what breaks in production when you skip the work.

Quick Reference

→Four trust boundaries: user→agent (input), agent→tool (tool args), tool→agent (tool results), agent→user (output)
→Input validation: reject before spending tokens — length, format, token count, content policy, PII
→Output validation: schema via native structured output (constrained decoding) or with_structured_output; always validate before serving
→PII: regex covers structured PII (<5ms); NER (Presidio/spaCy) covers names and addresses (50-200ms) — choose based on compliance scope
→Business rules: Pydantic model_validator enforces domain constraints the LLM cannot know (refund caps, return windows, role gates)
→LangGraph wiring: agent→tools uses conditional edges, NOT two unconditional edges — a common graph bug that causes silent failures
→Validation overhead is real: regex <5ms, NER 50-200ms, structured output retry adds 1-2 LLM calls — budget accordingly
→False positives in PII detection block legitimate content; maintain an allow-list for patterns your domain deliberately uses
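
As an illustration of the business-rules item above, here is a minimal sketch of a Pydantic `model_validator` guarding the arguments of a refund tool. It assumes Pydantic v2. The model name, its fields, the helper `refund_errors`, and the cap and window values are all hypothetical placeholders, not values from any real policy.

```python
from pydantic import BaseModel, Field, ValidationError, model_validator

# Hypothetical policy values; real ones would come from configuration.
MAX_REFUND_USD = 500.0
RETURN_WINDOW_DAYS = 30


class RefundRequest(BaseModel):
    """Arguments the agent proposes for a hypothetical issue_refund tool."""

    order_id: str = Field(min_length=1)
    amount_usd: float = Field(gt=0)
    order_total_usd: float = Field(gt=0)
    days_since_purchase: int = Field(ge=0)

    @model_validator(mode="after")
    def check_business_rules(self) -> "RefundRequest":
        # Cross-field rules the LLM has no way of knowing on its own.
        if self.amount_usd > self.order_total_usd:
            raise ValueError("refund exceeds order total")
        if self.amount_usd > MAX_REFUND_USD:
            raise ValueError("refund exceeds cap; route to human review")
        if self.days_since_purchase > RETURN_WINDOW_DAYS:
            raise ValueError("outside return window")
        return self


def refund_errors(args: dict) -> list[str]:
    """Return validation messages for tool args; an empty list means valid."""
    try:
        RefundRequest(**args)
        return []
    except ValidationError as exc:
        return [err["msg"] for err in exc.errors()]
```

Returning messages instead of raising lets the calling node hand the errors back to the model as a tool error, so it can correct the arguments on the next turn.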

When (Not) to Validate

Validation is not free. Each gate adds latency, can produce false positives that frustrate users, and creates a maintenance burden when business rules change. The question is not whether to validate but where validation pays for itself and where it doesn't.

Context	Validate?	Reason
Internal tool with controlled inputs and no user data	Light or skip	Low blast radius; strict validation adds latency with no compliance benefit
Public-facing chatbot handling user messages	Yes — all four gates	User input is untrusted; output can expose PII from tool results
Batch pipeline with pre-validated structured data	Output only	Input provenance is known; output schema still needs enforcement
Healthcare or financial agent handling PII	Yes — with NER	Regulatory compliance (HIPAA, GDPR, PCI-DSS) requires PII detection on both input and output
Prototype or demo without real user data	Skip or stub	Premature validation adds friction; wire the gate but no-op the logic

Over-validation breaks agents silently

Regex PII patterns flag IP addresses in infrastructure logs, phone-number-shaped IDs in order systems, and email-like patterns in template strings. A single overzealous deny-by-default rule can block 5-10% of legitimate traffic. Measure your false-positive rate before shipping a PII gate to production.
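
One way to measure that rate is to run the gate over a sample of traffic already known to be clean and count how many messages it would have blocked. The sketch below uses only the standard library. The patterns are illustrative and far from exhaustive, and the allow-list entries (`support@example.com`, the `10.x.x.x` range) are hypothetical stand-ins for whatever your domain deliberately uses.

```python
import re

# Illustrative patterns only; not exhaustive and not tuned for any locale.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Hypothetical allow-list: values this domain uses on purpose.
ALLOW_LIST = [
    re.compile(r"support@example\.com"),
    re.compile(r"10\.\d{1,3}\.\d{1,3}\.\d{1,3}"),  # private-range IPs in infra logs
]


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (label, value) pairs for matches that are not allow-listed."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group()
            if any(allowed.fullmatch(value) for allowed in ALLOW_LIST):
                continue
            hits.append((label, value))
    return hits


def false_positive_rate(clean_samples: list[str]) -> float:
    """Share of known-clean samples that the gate would have blocked."""
    if not clean_samples:
        return 0.0
    flagged = sum(1 for sample in clean_samples if find_pii(sample))
    return flagged / len(clean_samples)
```

A phone-shaped order reference such as `555-123-4567` still trips the `us_phone` pattern here, which is the kind of false positive the measurement is meant to surface before the gate ships.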

If you decide to validate, put the gate at the right boundary. The four trust boundaries in an agent system each have a different failure mode:

Boundary	Direction	Primary Risk	Failure Without Validation
API entry	User → Agent	Malformed input, excessive tokens, PII in context	Token waste, crashes, PII leakage into logs
Tool arguments	Agent → Tool	SQL injection, path traversal, parameter abuse	Data corruption, unintended writes, API overcharges
Tool results	Tool → Agent	Indirect prompt injection, PII from DB/API responses	Injected instructions hijack agent behavior
Agent response	Agent → User	Hallucinated schema, PII surfaced from context	Invalid JSON in downstream systems, GDPR violations

Four validation gates — input, tool args, tool results, output — each at a different trust boundary
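
For the agent → tool boundary, a minimal sketch of one check from the table: rejecting path traversal in the arguments of a file-reading tool. It uses only the standard library (Python 3.9+ for `Path.is_relative_to`). The sandbox root `/srv/agent-files` and the function name `resolve_tool_path` are hypothetical.

```python
from pathlib import Path

# Hypothetical sandbox root for a hypothetical read_file tool.
SANDBOX_ROOT = Path("/srv/agent-files")


def resolve_tool_path(user_path: str, root: Path = SANDBOX_ROOT) -> Path:
    """Resolve an agent-supplied path, rejecting anything that escapes root."""
    if "\x00" in user_path:
        raise ValueError("null byte in path")
    root_resolved = root.resolve()
    # Joining an absolute user_path discards root, so the containment check
    # below also catches inputs like "/etc/passwd".
    candidate = (root_resolved / user_path).resolve()
    if not candidate.is_relative_to(root_resolved):
        raise ValueError(f"path escapes sandbox: {user_path!r}")
    return candidate
```

The check runs on the resolved path rather than on the raw string, so `..` segments and symlinks are normalized before the comparison instead of being pattern-matched.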

Input Validation

Input validation runs at the API boundary, before the agent processes the message. Reject here and you spend zero tokens. Reject at the output and you've already paid for the LLM call. Use Pydantic models to define and enforce schemas declaratively — FastAPI will run them automatically before your handler function fires.

Output Schema Validation

Never pass LLM output downstream without validation

LLMs produce almost-valid JSON: trailing commas, unescaped newlines inside strings, missing closing braces. Even with structured output enabled, semantic errors (negative amounts, past dates, out-of-range confidence scores) pass the grammar check. Always parse, validate the schema, and check semantic constraints.
