Evaluator-Optimizer: Self-Improving Loops
One LLM generates, another evaluates — loop until quality threshold met. The Evaluator-Optimizer pattern creates self-improving agents that refine outputs iteratively.
Quick Reference
- Generator produces output → Evaluator scores it → loop until the quality threshold is met
- The evaluator uses structured output to return scores and actionable feedback
- Set a max iteration limit to prevent infinite loops and control costs
- Convergence detection: stop when the score plateaus or reaches the target threshold
- Can use the same model for both roles (different prompts) or different models
- Best for writing, code generation, and data extraction — tasks with clear quality criteria
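The structured evaluator output mentioned above might look like the following sketch. The field names (`score`, `passed`, `feedback`) are illustrative assumptions, not a fixed schema; the point is that the evaluator returns machine-parseable structure rather than free-form prose, so the loop can act on it programmatically.

```python
import json

# Hypothetical evaluator reply: in practice an LLM would be prompted to emit
# exactly this JSON shape so the orchestrating loop can parse it reliably.
raw = '{"score": 0.72, "passed": false, "feedback": ["tighten the intro", "cite a source"]}'
evaluation = json.loads(raw)

# Decide whether another generator pass is needed, using an assumed 0.8 threshold.
needs_revision = not evaluation["passed"] or evaluation["score"] < 0.8
```

A boolean pass/fail alongside the numeric score is a common belt-and-suspenders choice: the score drives convergence detection, while the flag gives the evaluator an unambiguous verdict.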
What Is the Evaluator-Optimizer Pattern?
Generator produces → Evaluator scores → loop with feedback until quality threshold met
The Evaluator-Optimizer pattern uses two LLM roles in a loop: a Generator that produces output and an Evaluator that assesses quality and provides structured feedback. The generator revises based on feedback until the evaluator's score meets a threshold — or a max iteration limit is hit.
This is the agent equivalent of a writer-editor relationship. The writer drafts, the editor critiques with specific feedback, and the writer revises. Because the feedback is structured and actionable, each iteration tends to produce measurably better output.
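The loop described above can be sketched in a few lines. This is a minimal, hedged sketch: `generate` and `evaluate` are stand-ins for LLM calls with role-specific prompts (the stubs below are deterministic so the control flow is runnable), and the threshold and iteration cap are illustrative values.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    score: float   # 0.0-1.0 quality estimate from the evaluator
    feedback: str  # actionable critique for the next draft


# Stub roles: a real generator and evaluator would each be an LLM call with a
# role-specific prompt. These stubs only exist to make the loop executable.
def generate(task, feedback_history):
    revisions = " ".join(f"[revised: {fb}]" for fb in feedback_history)
    return f"draft for {task!r} {revisions}".strip()


def evaluate(output):
    # Pretend quality rises with each incorporated revision.
    score = min(0.5 + 0.2 * output.count("[revised"), 1.0)
    return Evaluation(score=score, feedback="tighten the intro")


def refine(task, threshold=0.8, max_iters=5):
    """Generate -> evaluate -> revise until the threshold or iteration cap."""
    feedback_history = []
    best_output, best_score = "", 0.0
    for _ in range(max_iters):          # hard cap prevents infinite loops
        output = generate(task, feedback_history)
        result = evaluate(output)
        if result.score > best_score:   # keep the best draft seen so far
            best_output, best_score = output, result.score
        if result.score >= threshold:   # quality threshold met: stop early
            break
        feedback_history.append(result.feedback)  # feed critique back in
    return best_output, best_score
```

Returning the best draft seen, rather than the last one, guards against a final iteration that regresses; accumulating the feedback history mirrors how the generator prompt would carry prior critiques forward.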