Prompt Anatomy
The structural components of an LLM prompt: system messages, user messages, and assistant messages. How each part influences model behavior, why system prompts are privileged, and practical demonstrations of how prompt structure transforms output quality.
Quick Reference
- →System prompt: sets role, constraints, output format -- highest-priority instructions
- →User message: the actual task, question, or input data
- →Assistant message: model's previous responses, used for few-shot examples and conversation history
- →System prompts are processed with elevated priority by most models (especially Claude)
- →Few-shot examples in assistant messages are more effective than instructions alone
- →The order and structure of prompt components significantly affects output quality
In this article
The Three Message Roles
Modern LLM APIs use a chat completion format with three message roles: system, user, and assistant. Each role has a distinct function and different levels of influence on the model's behavior. Understanding these roles is the foundation of effective prompt engineering.
| Role | Purpose | Priority | Typical content |
|---|---|---|---|
| system | Define behavior, constraints, and persona | Highest (privileged) | Role definition, output format, rules, guardrails |
| user | Provide the task or question | Standard | Questions, input data, specific instructions |
| assistant | Model's previous responses | Context | Few-shot examples, conversation history |
System prompts are not just 'the first message.' They are processed with elevated priority by the model. In Claude, the system prompt is architecturally separate from the conversation and gets stronger adherence. In OpenAI models, it is a separate parameter with similar privileged treatment. Always put your most important instructions in the system prompt.
Designing Effective System Prompts
The system prompt is your primary lever for controlling model behavior. A well-structured system prompt defines who the model is, how it should respond, and what constraints it must follow.
- ▸Lead with role definition: who is the model and what is its expertise?
- ▸Specify constraints explicitly: what should the model NOT do?
- ▸Define output format: structure, length, style expectations
- ▸Include guardrails: safety boundaries and redirect behavior
- ▸Use markdown headers (##) to organize long system prompts -- models respect structure
System prompts between 200-800 tokens tend to work best. Shorter than 200 and you lack specificity. Longer than 800 and the model starts to lose track of less prominent instructions. If your system prompt exceeds 1000 tokens, consider whether some instructions belong in the user message instead.
Structuring User Messages
The user message contains the actual task. How you structure it dramatically affects output quality. The key principles are: be specific, provide context, and separate instructions from data.
Wrapping distinct sections in XML-style tags (<document>, <instructions>, <examples>) helps the model clearly distinguish between instructions and data. This is especially important when your input data might contain text that looks like instructions. Claude has particularly strong adherence to XML-delimited sections.
Assistant Messages and Few-Shot Examples
Assistant messages represent previous model outputs. They serve two critical purposes: providing conversation context in multi-turn interactions, and serving as few-shot examples that demonstrate the desired output format and quality.
For most classification and extraction tasks, 3-5 examples are sufficient. Include at least one example of each category, one typical case, and one edge case. More examples give diminishing returns and consume context. If you need more than 10 examples to get good results, consider fine-tuning instead.
- ▸Few-shot examples teach format by demonstration, which is more reliable than format instructions alone
- ▸Include diverse examples that cover different categories and edge cases
- ▸The quality of examples matters enormously -- use your best, cleanest outputs as examples
- ▸For conversation context, summarize older messages rather than including the full history
- ▸Prefilling the assistant response (starting it for the model) can steer the output format reliably
How Structure Transforms Output
The same underlying task can produce dramatically different quality outputs depending on prompt structure. Here is a real comparison showing how progressively better prompt structure improves a code review task.
A well-structured 300-token prompt consistently outperforms a poorly-structured 1000-token prompt. The model needs clarity about what you want, not volume. Invest time in structure (roles, sections, format specifications, examples) rather than writing more words.
Best Practices
Do
- ✓Put your most important instructions in the system prompt -- it has the highest priority
- ✓Use XML-style tags to separate instructions from data in user messages
- ✓Include 3-5 few-shot examples for classification and extraction tasks
- ✓Structure system prompts with clear sections: role, constraints, format, guardrails
- ✓Test your prompt with minimal, moderate, and complex inputs to verify consistent behavior
Don’t
- ✗Don't rely on user messages for critical behavioral constraints -- use the system prompt
- ✗Don't write unstructured wall-of-text prompts -- use headers, bullets, and sections
- ✗Don't include contradictory instructions (e.g., 'be concise' and 'be thorough')
- ✗Don't assume the model will infer your desired output format -- specify it explicitly
- ✗Don't put instructions inside the data section where they might be confused with input
Key Takeaways
- ✓System prompts are architecturally privileged -- put your most critical instructions there.
- ✓User messages should separate instructions from data using clear delimiters (XML tags work best).
- ✓Few-shot examples in assistant messages teach format by demonstration, more reliably than instructions alone.
- ✓Prompt structure (role + constraints + format + examples) matters more than prompt length.
- ✓Progressive improvement: each structural element (system context, formatting, examples) adds measurable quality.
Video on this topic
The anatomy of a perfect AI prompt
tiktok