Verification Criteria and Self-Testing Prompts

If Claude can verify its own output, quality goes up dramatically. Building verification criteria into the prompt — not as a follow-up — turns implementation requests into implicit contracts Claude holds itself to.

Quick Reference

→Build verification into the prompt: 'before finishing, run npm test and fix any failures'
→TDD loop: write failing tests first → implement → run until passing
→Acceptance criteria format: 'done when: (1) X succeeds, (2) Y returns Z, (3) all existing tests pass'
→Playwright MCP: Claude verifies UI changes in a real browser
→The implicit contract: Claude doesn't stop until criteria are met
→Self-testing prompts eliminate the review cycle for mechanical correctness

Why Verification Changes Output Quality

Claude produces better output when it knows it will verify the output. This is not unique to AI — it mirrors how human developers perform when they know their work will be immediately tested vs reviewed days later. The act of specifying how success will be verified forces precision in the implementation.

Without verification criteria, Claude finishes when the code looks correct. With verification criteria, Claude finishes when the code is correct — and if the first attempt fails verification, Claude iterates until it passes.

Building Verification Into the Prompt

Add verification instructions to the prompt, not as a follow-up after you see the output. Follow-up verification requests apply to already-written code; in-prompt verification applies to the writing process itself.

The TDD Verification Loop

The TDD verification loop is one of the highest-quality Claude Code patterns: write failing tests first, then ask Claude to implement until they pass. The tests encode exactly what 'done' means — Claude cannot finish early.

Sign in to read this article

This is a premium article. Sign in with your Google account to continue.