Structured Error Responses for MCP

This article covers returning structured errors from MCP tools so that Claude can recover gracefully: the isError flag, error categories, the distinction between retryable and non-retryable errors, and how subagents should recover locally from transient failures while propagating only unresolvable errors.

Quick Reference

→ MCP tools signal errors using the isError: true flag in the tool result
→ Embed structured JSON in the content text: errorCategory, isRetryable, description, customerMessage
→ Four error categories: transient, validation, business, permission -- each requires different handling
→ Retryable errors (transient) should include retry guidance: delay, maxRetries
→ Non-retryable errors (validation, permission) should never be retried unchanged -- the same call will fail again until the input or permissions change
→ Generic 'Operation failed' errors prevent Claude from making intelligent recovery decisions
→ Subagents should implement local recovery for transient failures (retry 2-3 times internally)
→ Only propagate unresolvable errors to the orchestrator -- don't bubble up every timeout
→ Distinguish access failures from valid empty results -- empty is not an error
→ Include customerMessage for user-facing errors, separate from the technical description
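
The local-recovery points above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names (parse_error, call_with_local_recovery, make_flaky_tool), the default of three attempts, and the fake tool are all assumptions made for the example. Only isError and the JSON field names come from the list above.

```python
import json
import time


def parse_error(result):
    """Return the structured error details, or {} if the text is not JSON."""
    try:
        details = json.loads(result["content"][0]["text"])
    except (KeyError, IndexError, TypeError, ValueError):
        return {}
    return details if isinstance(details, dict) else {}


def call_with_local_recovery(call_tool, max_attempts=3, delay_seconds=0.0,
                             sleep=time.sleep):
    """Retry retryable failures locally; hand back anything unresolved."""
    result = None
    for attempt in range(1, max_attempts + 1):
        result = call_tool()
        if not result.get("isError"):
            return result  # success, including a valid empty result
        if not parse_error(result).get("isRetryable"):
            return result  # non-retryable: propagate immediately
        if attempt < max_attempts:
            sleep(delay_seconds)
    return result  # retries exhausted: propagate the last error


def make_flaky_tool(failures):
    """Hypothetical tool: fails transiently `failures` times, then succeeds."""
    state = {"calls": 0}

    def tool():
        state["calls"] += 1
        if state["calls"] <= failures:
            payload = {"errorCategory": "transient", "isRetryable": True,
                       "description": "Upstream timeout",
                       "customerMessage": "Please try again shortly."}
            return {"isError": True,
                    "content": [{"type": "text", "text": json.dumps(payload)}]}
        # An empty list is a valid result, not an error.
        return {"isError": False, "content": [{"type": "text", "text": "[]"}]}

    tool.state = state
    return tool


flaky = make_flaky_tool(failures=2)
outcome = call_with_local_recovery(flaky)
print(outcome["isError"], flaky.state["calls"])
```

In this sketch the orchestrator never sees the two timeouts. It receives only the final result, or the last error if every attempt fails or the error is marked non-retryable.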

Why Structured Errors Matter for Agents

When an MCP tool fails, what happens next depends entirely on the information in the error response. If the error says 'Operation failed,' Claude has no basis for deciding whether to retry, try a different tool, ask the user for clarification, or give up. Structured error responses give Claude the metadata it needs to make intelligent recovery decisions -- the same way a well-designed API gives calling code the information it needs to handle failures gracefully.

Exam trap: Generic error messages

The exam will present scenarios where an agent receives 'Operation failed' or 'Error occurred' and ask what the problem is. The answer is always that generic errors prevent intelligent recovery. Claude cannot distinguish a temporary timeout (retry in 2 seconds) from a permanent permission denial (ask user for credentials) when both return the same message.

In a multi-agent system, error handling becomes even more critical. An orchestrator agent that receives an error from a subagent's tool call needs to decide: should it retry the subagent, route to a different subagent, escalate to the user, or proceed without that data? Structured errors make this decision possible.

The MCP isError Flag Pattern

MCP tool results include an isError boolean flag. When set to true, it signals to Claude that the tool call did not succeed. The actual error details go in the content array as a text block. The key insight is that this text block should contain structured JSON, not a plain string.
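
As a rough sketch of this pattern, the snippet below builds a tool result as a plain Python dictionary with isError set to true and a JSON string in the text block. The field names come from the Quick Reference above. The helper name make_error_result and the example messages are hypothetical, and a real server would normally build the result through its MCP SDK rather than by hand.

```python
import json


def make_error_result(category, retryable, description, customer_message):
    """Build an MCP-style tool result carrying a structured error payload."""
    payload = {
        "errorCategory": category,            # transient | validation | business | permission
        "isRetryable": retryable,
        "description": description,           # technical detail for the agent
        "customerMessage": customer_message,  # wording that is safe to show a user
    }
    return {
        "isError": True,
        # The error details travel as a JSON string inside a text content block.
        "content": [{"type": "text", "text": json.dumps(payload)}],
    }


# Hypothetical example values, for illustration only.
result = make_error_result(
    "transient",
    True,
    "Upstream inventory service timed out",
    "We're having trouble reaching our inventory system. Please try again shortly.",
)

# The receiving side can parse the text block back into structured fields.
details = json.loads(result["content"][0]["text"])
print(details["errorCategory"], details["isRetryable"])
```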

Error Categories: The Four Types

Not all errors are equal. A network timeout is fundamentally different from an invalid input, which is different from a business rule violation, which is different from a permissions failure. Each requires a different recovery strategy. Categorizing errors lets Claude choose the right response.
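
One way to picture category-based handling is a small dispatch table, sketched below. The four category names come from the text. The recovery action names and the fallback for unparseable errors are assumptions chosen for illustration, not a prescribed mapping.

```python
import json

# Hypothetical recovery actions keyed by the four categories named above.
RECOVERY_BY_CATEGORY = {
    "transient": "retry_with_backoff",
    "validation": "fix_input_and_resubmit",
    "business": "explain_rule_to_user",
    "permission": "request_access_or_escalate",
}

FALLBACK = "escalate_unknown_error"


def choose_recovery(tool_result):
    """Pick a recovery action from a tool result's structured error payload."""
    if not tool_result.get("isError"):
        return "continue"
    try:
        details = json.loads(tool_result["content"][0]["text"])
    except (KeyError, IndexError, TypeError, ValueError):
        # A plain 'Operation failed' string lands here: nothing to act on.
        return FALLBACK
    if not isinstance(details, dict):
        return FALLBACK
    category = details.get("errorCategory")
    if not isinstance(category, str):
        return FALLBACK
    return RECOVERY_BY_CATEGORY.get(category, FALLBACK)


structured = {
    "isError": True,
    "content": [{"type": "text", "text": json.dumps(
        {"errorCategory": "permission", "isRetryable": False})}],
}
generic = {"isError": True,
           "content": [{"type": "text", "text": "Operation failed"}]}

print(choose_recovery(structured))
print(choose_recovery(generic))
```

The generic error falls through to the fallback because it carries no category to act on, which is the failure mode described earlier.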
