LLM Foundations/Prompt Engineering as a Discipline
★ OverviewBeginner10 min

Prompt Anatomy

The structural components of an LLM prompt: system messages, user messages, and assistant messages. How each part influences model behavior, why system prompts are privileged, and practical demonstrations of how prompt structure transforms output quality.

Quick Reference

  • System prompt: sets role, constraints, output format -- highest-priority instructions
  • User message: the actual task, question, or input data
  • Assistant message: model's previous responses, used for few-shot examples and conversation history
  • System prompts are processed with elevated priority by most models (especially Claude)
  • Few-shot examples in assistant messages are more effective than instructions alone
  • The order and structure of prompt components significantly affects output quality

The Three Message Roles

Modern LLM APIs use a chat completion format with three message roles: system, user, and assistant. Each role has a distinct function and different levels of influence on the model's behavior. Understanding these roles is the foundation of effective prompt engineering.

RolePurposePriorityTypical content
systemDefine behavior, constraints, and personaHighest (privileged)Role definition, output format, rules, guardrails
userProvide the task or questionStandardQuestions, input data, specific instructions
assistantModel's previous responsesContextFew-shot examples, conversation history
System prompt privilege

System prompts are not just 'the first message.' They are processed with elevated priority by the model. In Claude, the system prompt is architecturally separate from the conversation and gets stronger adherence. In OpenAI models, it is a separate parameter with similar privileged treatment. Always put your most important instructions in the system prompt.

Designing Effective System Prompts

The system prompt is your primary lever for controlling model behavior. A well-structured system prompt defines who the model is, how it should respond, and what constraints it must follow.

Anatomy of a production system prompt
  • Lead with role definition: who is the model and what is its expertise?
  • Specify constraints explicitly: what should the model NOT do?
  • Define output format: structure, length, style expectations
  • Include guardrails: safety boundaries and redirect behavior
  • Use markdown headers (##) to organize long system prompts -- models respect structure
System prompt length sweet spot

System prompts between 200-800 tokens tend to work best. Shorter than 200 and you lack specificity. Longer than 800 and the model starts to lose track of less prominent instructions. If your system prompt exceeds 1000 tokens, consider whether some instructions belong in the user message instead.

Structuring User Messages

The user message contains the actual task. How you structure it dramatically affects output quality. The key principles are: be specific, provide context, and separate instructions from data.

Bad vs good user message structure
Use XML-style delimiters

Wrapping distinct sections in XML-style tags (<document>, <instructions>, <examples>) helps the model clearly distinguish between instructions and data. This is especially important when your input data might contain text that looks like instructions. Claude has particularly strong adherence to XML-delimited sections.

Assistant Messages and Few-Shot Examples

Assistant messages represent previous model outputs. They serve two critical purposes: providing conversation context in multi-turn interactions, and serving as few-shot examples that demonstrate the desired output format and quality.

Few-shot examples dramatically improve output consistency
How many examples?

For most classification and extraction tasks, 3-5 examples are sufficient. Include at least one example of each category, one typical case, and one edge case. More examples give diminishing returns and consume context. If you need more than 10 examples to get good results, consider fine-tuning instead.

  • Few-shot examples teach format by demonstration, which is more reliable than format instructions alone
  • Include diverse examples that cover different categories and edge cases
  • The quality of examples matters enormously -- use your best, cleanest outputs as examples
  • For conversation context, summarize older messages rather than including the full history
  • Prefilling the assistant response (starting it for the model) can steer the output format reliably

How Structure Transforms Output

The same underlying task can produce dramatically different quality outputs depending on prompt structure. Here is a real comparison showing how progressively better prompt structure improves a code review task.

Progressive improvement through prompt structure
Prompt structure matters more than prompt length

A well-structured 300-token prompt consistently outperforms a poorly-structured 1000-token prompt. The model needs clarity about what you want, not volume. Invest time in structure (roles, sections, format specifications, examples) rather than writing more words.

Best Practices

Best Practices

Do

  • Put your most important instructions in the system prompt -- it has the highest priority
  • Use XML-style tags to separate instructions from data in user messages
  • Include 3-5 few-shot examples for classification and extraction tasks
  • Structure system prompts with clear sections: role, constraints, format, guardrails
  • Test your prompt with minimal, moderate, and complex inputs to verify consistent behavior

Don’t

  • Don't rely on user messages for critical behavioral constraints -- use the system prompt
  • Don't write unstructured wall-of-text prompts -- use headers, bullets, and sections
  • Don't include contradictory instructions (e.g., 'be concise' and 'be thorough')
  • Don't assume the model will infer your desired output format -- specify it explicitly
  • Don't put instructions inside the data section where they might be confused with input

Key Takeaways

  • System prompts are architecturally privileged -- put your most critical instructions there.
  • User messages should separate instructions from data using clear delimiters (XML tags work best).
  • Few-shot examples in assistant messages teach format by demonstration, more reliably than instructions alone.
  • Prompt structure (role + constraints + format + examples) matters more than prompt length.
  • Progressive improvement: each structural element (system context, formatting, examples) adds measurable quality.

Video on this topic

The anatomy of a perfect AI prompt

tiktok