Input & Output Validation
Production agents have four validation gates: input, tool arguments, tool results, and output. Miss any one and bad data silently crosses a trust boundary. This guide follows the order of the decisions you face: when to validate, what each gate checks, how to wire the gates correctly in LangGraph, and what breaks in production when you skip the work.
Quick Reference
- Four trust boundaries: user→agent (input), agent→tool (tool args), tool→agent (tool results), agent→user (output)
- Input validation: reject before spending tokens, checking length, format, token count, content policy, and PII
- Output validation: schema via native structured output (constrained decoding) or with_structured_output; always validate before serving
- PII: regex covers structured PII (<5ms); NER (Presidio/spaCy) covers names and addresses (50-200ms); choose based on compliance scope
- Business rules: Pydantic model_validator enforces domain constraints the LLM cannot know (refund caps, return windows, role gates)
- LangGraph wiring: agent→tools uses conditional edges, NOT two unconditional edges; the latter is a common graph bug that causes silent failures
- Validation overhead is real: regex <5ms, NER 50-200ms, structured output retry adds 1-2 LLM calls; budget accordingly
- False positives in PII detection block legitimate content; maintain an allow-list for patterns your domain deliberately uses
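As a rough illustration of the "reject before spending tokens" idea from the list above, the sketch below orders the cheap checks first and returns on the first failure. The limits, the two regex patterns, the `GateResult` type, and the 4-characters-per-token estimate are all assumptions made for this example, not values taken from any library; a real gate would likely use the model's own tokenizer and a fuller PII pattern set.

```python
import re
from dataclasses import dataclass

# Assumed limits for illustration only; tune them to your model and budget.
MAX_CHARS = 4_000
MAX_TOKENS_EST = 500

# Two structured-PII patterns (email, US SSN). Real deployments need more.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class GateResult:
    ok: bool
    reason: str = ""


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token), not a real tokenizer.
    return max(1, len(text) // 4)


def validate_input(text: str) -> GateResult:
    """Run the cheapest checks first and stop at the first failure."""
    if not text.strip():
        return GateResult(False, "empty")
    if len(text) > MAX_CHARS:
        return GateResult(False, "too_long")
    if estimate_tokens(text) > MAX_TOKENS_EST:
        return GateResult(False, "too_many_tokens")
    if EMAIL_RE.search(text) or SSN_RE.search(text):
        return GateResult(False, "pii_detected")
    return GateResult(True)
```

Calling `validate_input(user_message)` before the first LLM call means a rejected request costs only a few string operations. Whether a PII hit should reject the request or redact the match is a policy choice this sketch does not make for you.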
When (Not) to Validate
Validation is not free. Each gate adds latency, can produce false positives that frustrate users, and adds a maintenance burden whenever business rules change. The question is not whether to validate, but where validation pays for itself and where it doesn't.
| Context | Validate? | Reason |
|---|---|---|
| Internal tool with controlled inputs and no user data | Light or skip | Low blast radius; strict validation adds latency with no compliance benefit |
| Public-facing chatbot handling user messages | Yes — all four gates | User input is untrusted; output can expose PII from tool results |
| Batch pipeline with pre-validated structured data | Output only | Input provenance is known; output schema still needs enforcement |
| Healthcare or financial agent handling PII | Yes — with NER | Regulatory compliance (HIPAA, GDPR, PCI-DSS) requires PII detection on both sides |
| Prototype or demo without real user data | Skip or stub | Premature validation adds friction; wire the gate but no-op the logic |
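For the prototype row, "wire the gate but no-op the logic" could look something like the sketch below: every boundary gets a named hook that initially passes data through unchanged, so real checks can be swapped in later without touching the call sites. The `GATES` registry, `run_gate`, and `max_length_gate` are hypothetical names invented for this example.

```python
from typing import Callable, Dict

Gate = Callable[[str], str]


def noop_gate(payload: str) -> str:
    """Placeholder gate: passes the payload through unchanged."""
    return payload


# Hypothetical registry with one hook per trust boundary.
GATES: Dict[str, Gate] = {
    "input": noop_gate,
    "tool_args": noop_gate,
    "tool_results": noop_gate,
    "output": noop_gate,
}


def run_gate(boundary: str, payload: str) -> str:
    """Call sites use this, so they do not change when a gate is upgraded."""
    return GATES[boundary](payload)


def max_length_gate(payload: str) -> str:
    # Example of a real check replacing the stub (the limit is an assumption).
    if len(payload) > 2000:
        raise ValueError("input too long")
    return payload


# Later, when real user data arrives, upgrade one boundary at a time.
GATES["input"] = max_length_gate
```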
Regex PII patterns flag IP addresses in infrastructure logs, phone-number-shaped IDs in order systems, and email-like patterns in template strings. A single overzealous deny-by-default rule can block 5-10% of legitimate traffic. Measure your false-positive rate before shipping a PII gate to production.
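One way to measure that rate is to run the detector over a labeled sample of real traffic and count how many PII-free messages it flags. The sketch below assumes a tiny hand-labeled sample, two illustrative patterns, and a hypothetical allow-list (an `ORD-` order-ID format and a `10.x.x.x` internal IP range); none of these come from a specific library or dataset.

```python
import re
from typing import List, Tuple

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")

# Hypothetical allow-list: patterns this domain uses on purpose.
ALLOW_RES = [
    re.compile(r"\bORD-\d{3}-\d{3}-\d{4}\b"),        # phone-shaped order IDs
    re.compile(r"\b10\.(?:\d{1,3}\.){2}\d{1,3}\b"),  # internal IP range
]


def flags_pii(text: str, use_allow_list: bool = True) -> bool:
    """Return True if the detector would block this text."""
    if use_allow_list:
        for pattern in ALLOW_RES:
            text = pattern.sub(" ", text)  # mask allowed spans before scanning
    return bool(IPV4_RE.search(text) or PHONE_RE.search(text))


def false_positive_rate(samples: List[Tuple[str, bool]],
                        use_allow_list: bool = True) -> float:
    """samples holds (text, contains_real_pii) pairs labeled by a human."""
    clean = [text for text, has_pii in samples if not has_pii]
    if not clean:
        return 0.0
    flagged = sum(flags_pii(text, use_allow_list) for text in clean)
    return flagged / len(clean)


SAMPLES = [
    ("Order ORD-555-123-4567 shipped", False),
    ("Health check from 10.0.3.17 passed", False),
    ("What is your return policy?", False),
    ("Thanks, that fixed it", False),
    ("Call me at 415-555-0134", True),
]
```

On this toy sample, `false_positive_rate(SAMPLES, use_allow_list=False)` flags two of the four clean messages, while the allow-listed version flags none and still catches the real phone number. A production measurement would need a much larger sample drawn from actual traffic.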
If you decide to validate, put the gate at the right boundary. The four trust boundaries in an agent system each have a different failure mode:
| Boundary | Direction | Primary Risk | Failure Without Validation |
|---|---|---|---|
| API entry | User → Agent | Malformed input, excessive tokens, PII in context | Token waste, crashes, PII leakage into logs |
| Tool arguments | Agent → Tool | SQL injection, path traversal, parameter abuse | Data corruption, unintended writes, API overcharges |
| Tool results | Tool → Agent | Indirect prompt injection, PII from DB/API responses | Injected instructions hijack agent behavior |
| Agent response | Agent → User | Hallucinated schema, PII surfaced from context | Invalid JSON in downstream systems, GDPR violations |
Four validation gates — input, tool args, tool results, output — each at a different trust boundary
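The two middle boundaries are the ones most often skipped, so here is a minimal sketch of each: a path-argument check on the agent→tool side and a phrase screen on the tool→agent side. The sandbox root, the function names, and the single injection phrase are assumptions for illustration. Phrase matching in particular is a weak heuristic and should not be read as a complete defense against indirect prompt injection.

```python
import re
from pathlib import PurePosixPath

# Hypothetical sandbox directory the file tool is allowed to read from.
ALLOWED_ROOT = PurePosixPath("/srv/agent-files")


def validate_path_arg(user_path: str) -> PurePosixPath:
    """Agent→tool gate: reject absolute paths and '..' segments."""
    candidate = PurePosixPath(user_path)
    if candidate.is_absolute() or ".." in candidate.parts:
        raise ValueError(f"unsafe path: {user_path!r}")
    return ALLOWED_ROOT / candidate


# One illustrative phrase only; real screens need broader coverage.
INJECTION_RE = re.compile(
    r"ignore (?:all )?(?:previous|prior) instructions", re.IGNORECASE
)


def screen_tool_result(text: str) -> str:
    """Tool→agent gate: refuse results that look like injected instructions."""
    if INJECTION_RE.search(text):
        raise ValueError("possible prompt injection in tool result")
    return text
```

Raising an exception keeps the sketch short. In an agent loop you would more likely convert the failure into an error message the model can see, so it can retry with corrected arguments.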