★ OverviewAdvanced12 min
Shipping to Production
The complete checklist for taking an agent from prototype to production: infrastructure, monitoring, rollout strategies, and incident response.
Quick Reference
- →Production readiness checklist: rate limiting, auth, input validation, output guardrails, monitoring, and alerting
- →Use LangGraph Platform for managed deployment or self-host with Docker + Redis for state persistence
- →Implement canary deployments: route 5% of traffic to the new agent version, monitor, then ramp up
- →Set up alerting on key metrics: p95 latency, error rate, token usage per request, and tool failure rate
- →Keep a human escalation path for every agent — production agents must be able to say 'I need a human'
Production Readiness Checklist
Production readiness != feature complete
A production-ready agent is one that fails gracefully, not one that never fails. The checklist ensures you have handled the failure paths.
| Category | Must-Have | Priority |
|---|---|---|
| Auth & Access | API key rotation, per-user rate limits, RBAC on agent actions | P0 |
| Input Validation | Schema validation on all tool inputs, prompt injection detection | P0 |
| Output Guardrails | PII filtering, hallucination detection, response length limits | P0 |
| Monitoring | Structured logging with trace IDs, latency percentiles, token usage dashboards | P0 |
| Alerting | Error rate > 5%, p95 latency > 10s, cost per request spike | P0 |
| Kill Switch | Feature flag to disable agent instantly without redeployment | P0 |
| Fallback | Graceful degradation to static response or human handoff | P1 |
| Load Testing | Validated at 2x expected peak traffic with multi-turn conversations | P1 |
| Runbook | Documented incident response for top 5 failure scenarios | P1 |
| Compliance | Audit logging for all LLM inputs/outputs, data retention policy | P2 |
Work through the P0 items before any production traffic. P1 items should be completed before ramping past 10% of users. P2 items are required for regulated industries but recommended for all.