
Evaluator-Optimizer: Self-Improving Loops

One LLM generates, another evaluates — they loop until a quality threshold is met. The Evaluator-Optimizer pattern creates self-improving agents that refine outputs iteratively.

Quick Reference

  • Generator produces output → Evaluator scores it → loop until quality threshold is met
  • The evaluator uses structured output to return scores and actionable feedback
  • Set a max iteration limit to prevent infinite loops and control costs
  • Convergence detection: stop when score plateaus or reaches target threshold
  • Can use the same model for both roles (different prompts) or different models
  • Best for writing, code generation, data extraction — tasks with clear quality criteria
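The second bullet above is the key mechanical detail: the evaluator must return machine-readable scores and feedback, not free-form prose. A minimal sketch of what that structured output might look like (the field names `score`, `feedback`, and `passed`, and the threshold of 8, are illustrative assumptions, not a standard):

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    """Structured output the evaluator returns each iteration."""
    score: int     # 1-10 quality rating
    feedback: str  # actionable revision notes fed back to the generator

    @property
    def passed(self) -> bool:
        # Quality threshold; 8 matches the diagram below.
        return self.score >= 8

review = Evaluation(score=6, feedback="Tighten the intro; add a code example.")
print(review.passed)  # False: below threshold, so the loop continues
```

Because the feedback field is explicit, the generator's next prompt can include it verbatim, which is what makes each revision targeted rather than a blind retry.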

What Is the Evaluator-Optimizer Pattern?

Task Input → Generator (produces output, uses feedback) → Evaluator (scores quality, gives actionable feedback) → Score ≥ 8? Yes → Final Output; No → feedback + revise

Generator produces → Evaluator scores → loop with feedback until quality threshold met

Definition

The Evaluator-Optimizer pattern uses two LLM roles in a loop: a Generator that produces output and an Evaluator that assesses quality and provides structured feedback. The generator revises based on feedback until the evaluator's score meets a threshold — or a max iteration limit is hit.

This is the agent equivalent of a writer-editor relationship. The writer drafts, the editor critiques with specific feedback, the writer revises. Each iteration tends to improve the output because the feedback is structured and actionable rather than a vague "try again."