Agent Prompt Design
Agent system prompts are operating contracts, not personality descriptions. This article covers how to structure them with XML tags, write tool usage rules that actually enforce behavior, defend against prompt injection, use adaptive thinking correctly, and build a prompt evaluation harness that gates every change.
Quick Reference
- →Use three sections wrapped in XML tags: <identity>, <tool_usage_rules>, and <boundaries>
- →Write tool rules in two halves: USE WHEN and DO NOT USE WHEN — ambiguous descriptions cause random tool selection
- →Wrap all user input in <user_input> tags and instruct the agent to treat it as data, not executable instructions
- →Claude 4.6+ uses adaptive thinking natively — set effort to 'xhigh' for multi-tool agents instead of writing manual CoT
- →Split prompts into a static base (cache_control: ephemeral) and dynamic context injected per request — static base is cached ~90% of calls
- →Prompt caching cuts repeated system prompt cost by ~72% — critical when the prompt is sent on every tool call in the ReAct loop
- →Baseline before you change: measure tool accuracy, task completion, and cost per request before any prompt edit
- →Never ship a prompt change without running the eval suite — a 'better' prompt that fails the eval is a regression
When NOT to Redesign Your Agent Prompt
Before touching the system prompt, run this diagnostic: give the agent the same input 5 times. If it selects different tools across runs, the problem is in the **tool descriptions** — not the system prompt. If it consistently picks the right tool but sends wrong parameters, it's a **tool schema** problem. Only if behavior is consistently wrong *across all tools* — wrong persona, wrong stop conditions, no escalation — is it a system prompt problem. Most 'my agent is broken' reports are tool description issues.
| Symptom | Root cause | Fix |
|---|---|---|
| Agent selects different tools on the same input | Ambiguous tool descriptions | Rewrite tool USE WHEN / DO NOT USE WHEN |
| Agent sends malformed tool parameters | Missing parameter descriptions | Add type + format + example to tool schema |
| Agent stops too early or loops forever | Missing stop conditions | Add explicit stop rules to <boundaries> |
| Agent never escalates to a human | Missing escalation triggers | Add ESCALATE WHEN clause to <tool_usage_rules> |
| Agent's tone or persona is inconsistent | Vague <identity> section | Rewrite <identity> with concrete tone examples |
Tool descriptions are sent inside every tool definition object — they're separate from the system prompt and control per-tool behavior. Rewriting the system prompt to fix tool selection issues almost never helps. Fix the tool description first.