Error States for AI
AI features fail in ways traditional software does not — rate limits, hallucinations, timeouts, tool failures, and model outages. Learn to design error boundaries, fallback strategies, and user-facing error messages that keep users productive even when the AI breaks.
Quick Reference
- AI error states need to be specific: 'Something went wrong' is never acceptable
- Rate limits: show a countdown timer and queue the request, do not just show an error
- Hallucination: design for it with 'I am not confident' states and feedback buttons
- Model outage: fall back to a simpler model or cached response, not a blank page
- Timeout: show partial results if available, offer to retry or use a different approach
- Every error state must offer at least one action: retry, rephrase, fallback, or contact support
AI Error Taxonomy
AI features have a broader error taxonomy than traditional software. Beyond the usual HTTP errors and exceptions, you need to handle model-specific failures (rate limits, token limits, safety filters), quality failures (hallucinations, irrelevant responses), and infrastructure failures (model outages, embedding service down).
| Error Category | Examples | User Impact | Recovery Strategy |
|---|---|---|---|
| Rate limiting | 429 from OpenAI/Anthropic, quota exhaustion | Request rejected | Queue and retry with backoff, show countdown |
| Token limit exceeded | Input too long for context window | Cannot process | Summarize input, split into chunks, use larger model |
| Safety filter | Content flagged by model's safety system | Response blocked | Explain why, suggest rephrasing, offer human help |
| Hallucination | Confident but wrong response | User misled | Confidence indicators, source citations, feedback buttons |
| Tool failure | External API the agent depends on is down | Partial functionality | Skip failed tool, use cached data, inform user of limitation |
| Model outage | Provider API is completely down | Feature unavailable | Fallback to simpler model, cached responses, or manual path |
| Timeout | LLM takes too long to respond | User gives up | Show partial results, offer retry, suggest simpler query |
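The outage row in the table can be sketched as a fallback chain: try the primary model, then a simpler model, then a cached response, and only fail when every rung is exhausted. The responder functions and cache here are hypothetical stand-ins, not any particular provider's API.

```typescript
// A responder is anything that can produce an answer (model call, cache, etc.).
type Responder = () => Promise<string>;

// Walk the fallback chain, flagging degraded quality so the UI can say so.
async function respondWithFallback(
  primary: Responder,
  simplerModel: Responder,
  cache: Map<string, string>,
  cacheKey: string,
): Promise<{ text: string; degraded: boolean }> {
  try {
    return { text: await primary(), degraded: false };
  } catch {
    // Primary model is down: try the simpler model and mark the result degraded.
    try {
      return { text: await simplerModel(), degraded: true };
    } catch {
      const cached = cache.get(cacheKey);
      if (cached !== undefined) return { text: cached, degraded: true };
      // Nothing left: surface a specific, actionable error, not a blank page.
      throw new Error("AI is temporarily unavailable; use the manual workflow");
    }
  }
}
```

Returning a `degraded` flag lets the UI be honest ("answered by a backup model") instead of silently serving lower-quality output.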
The fastest way to lose user trust is a generic 'Something went wrong' error when the AI fails. Users need to understand what happened and what they can do about it. Every AI error should be specific, actionable, and honest about the limitation.
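One way to enforce "specific, actionable, and honest" is to map each error category to message-plus-action copy at a single choke point, so a generic fallback string cannot leak through. The categories and copy below are illustrative, not a fixed spec.

```typescript
// Error categories the UI knows how to explain (a subset of the taxonomy above).
type AIError = "rate_limit" | "token_limit" | "safety_filter" | "timeout" | "outage";

// Every category returns both an explanation and at least one concrete action.
function userMessage(error: AIError): { message: string; action: string } {
  switch (error) {
    case "rate_limit":
      return {
        message: "We're handling a lot of requests right now.",
        action: "Your request is queued and will run automatically.",
      };
    case "token_limit":
      return {
        message: "That input is too long for the model to read at once.",
        action: "Shorten it, or let us split it into parts.",
      };
    case "safety_filter":
      return {
        message: "The model declined this request under its safety rules.",
        action: "Try rephrasing, or contact support for review.",
      };
    case "timeout":
      return {
        message: "This is taking longer than expected.",
        action: "View partial results, retry, or simplify the question.",
      };
    case "outage":
      return {
        message: "The AI service is temporarily down.",
        action: "Use the manual workflow while we recover.",
      };
  }
}
```

Because the switch is exhaustive over the error union, adding a new category without copy becomes a compile error rather than a runtime 'Something went wrong'.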