Advanced · 10 min
Prompt Versioning & A/B Testing
Managing prompts as code: version control, A/B testing frameworks, rollback strategies, and measuring prompt performance in production.
Quick Reference
- Store prompts in version-controlled files (not inline strings) with semantic versioning and changelogs
- A/B test prompts by routing a percentage of traffic to the new version and comparing metrics (task completion, latency, cost)
- Use LangSmith's prompt hub or a custom registry to manage prompt versions across environments (dev, staging, prod)
- Track prompt performance metrics: task completion rate, average tool calls per task, user satisfaction, and cost per conversation
- Implement instant rollback: if a new prompt version degrades metrics, revert to the previous version without a code deploy
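The traffic-splitting and metric-comparison points above can be sketched roughly as follows. This is a minimal illustration, not a full A/B framework: the experiment name, the version numbers, and the record fields (`variant`, `completed`, `cost_usd`) are all hypothetical. It assumes hash-based bucketing so that the same user always sees the same prompt version.

```python
import hashlib

# Hypothetical mapping from experiment arm to prompt version.
PROMPT_VERSIONS = {"control": "1.4.0", "treatment": "1.5.0"}


def assign_variant(user_id: str, experiment: str, treatment_share: float) -> str:
    """Deterministically assign a user to 'control' or 'treatment'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Scale the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_share else "control"


def pick_prompt_version(user_id: str, treatment_share: float = 0.1) -> str:
    """Return the prompt version this user should receive."""
    variant = assign_variant(user_id, "support-agent-prompt", treatment_share)
    return PROMPT_VERSIONS[variant]


def summarize(records: list) -> dict:
    """Aggregate completion rate and average cost per variant."""
    totals = {}
    for record in records:
        t = totals.setdefault(
            record["variant"], {"n": 0, "completed": 0, "cost": 0.0}
        )
        t["n"] += 1
        t["completed"] += int(record["completed"])
        t["cost"] += record["cost_usd"]
    return {
        variant: {
            "completion_rate": t["completed"] / t["n"],
            "avg_cost_usd": t["cost"] / t["n"],
        }
        for variant, t in totals.items()
    }
```

Hashing on the experiment name plus the user ID keeps assignments sticky across sessions and independent between experiments. Raising `treatment_share` gradually widens the rollout without reshuffling users who are already in the treatment arm.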
Prompts as Code
Prompts are often the most frequently changed component of an agent system, sometimes edited daily. Treating them as code, with proper version control, prevents the chaos of ad-hoc edits.
Prompt file with version metadata
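As a minimal sketch of what such a file might contain, the example below assumes a JSON layout; the field names (`name`, `version`, `changelog`, `template`), the prompt text, and the `load_prompt` helper are all hypothetical. The file content is embedded as a string so the example is self-contained. In a real repository it would live in its own file next to the code.

```python
import json

# Hypothetical prompt file, e.g. prompts/support_agent.json in the repo.
PROMPT_FILE = """
{
  "name": "support-agent-system",
  "version": "1.5.0",
  "changelog": [
    {"version": "1.5.0", "note": "Ask for the order ID before calling tools"},
    {"version": "1.4.0", "note": "Initial tracked version"}
  ],
  "template": "You are a support agent for {product}. Ask for the order ID first."
}
"""


def load_prompt(raw: str) -> dict:
    """Parse a prompt file and check the metadata it is expected to carry."""
    prompt = json.loads(raw)
    missing = {"name", "version", "template"} - prompt.keys()
    if missing:
        raise ValueError(f"prompt file is missing fields: {sorted(missing)}")
    # Require a MAJOR.MINOR.PATCH version so changes can be ordered.
    parts = prompt["version"].split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"not a semantic version: {prompt['version']!r}")
    return prompt


prompt = load_prompt(PROMPT_FILE)
system_message = prompt["template"].format(product="Acme Cloud")
```

Keeping the version and changelog in the same file as the template means a diff in code review shows both the wording change and the reason for it.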
| Approach | Pros | Cons |
|---|---|---|
| Inline strings in code | Simple, version-controlled with code | Changing a prompt requires a code deploy |
| YAML/JSON files in repo | Separate from code, version-controlled | Still requires a deploy to update |
| LangSmith Prompt Hub | Hot-swap without deploy, versioned, shareable | Vendor dependency, requires LangSmith account |
| Custom prompt registry (DB) | Full control, hot-swap, custom metadata | More infrastructure to build and maintain |
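To make the last row of the table more concrete, here is a rough in-memory sketch of a custom registry that supports hot-swapping and rollback per environment. The `PromptRegistry` class and its method names are hypothetical, and a production version would presumably back this with a database rather than dictionaries. It is not LangSmith's API.

```python
class PromptRegistry:
    """In-memory stand-in for a database-backed prompt registry."""

    def __init__(self):
        self._templates = {}  # (name, version) -> template text
        self._history = {}    # (name, env) -> list of promoted versions

    def publish(self, name: str, version: str, template: str) -> None:
        """Store a new version. Published versions are never overwritten."""
        key = (name, version)
        if key in self._templates:
            raise ValueError(f"{name} {version} is already published")
        self._templates[key] = template

    def promote(self, name: str, env: str, version: str) -> None:
        """Make a published version the active one in an environment."""
        if (name, version) not in self._templates:
            raise KeyError(f"{name} {version} has not been published")
        self._history.setdefault((name, env), []).append(version)

    def rollback(self, name: str, env: str) -> str:
        """Revert an environment to the previously promoted version."""
        history = self._history.get((name, env), [])
        if len(history) < 2:
            raise RuntimeError("no previous version to roll back to")
        history.pop()
        return history[-1]

    def get(self, name: str, env: str) -> tuple:
        """Return (version, template) currently active in an environment."""
        version = self._history[(name, env)][-1]
        return version, self._templates[(name, version)]


registry = PromptRegistry()
registry.publish("support-agent", "1.4.0", "You are a support agent.")
registry.publish("support-agent", "1.5.0", "You are a concise support agent.")
registry.promote("support-agent", "prod", "1.4.0")
registry.promote("support-agent", "prod", "1.5.0")

# Metrics degraded on 1.5.0: revert without a code deploy.
restored_version = registry.rollback("support-agent", "prod")
```

Because the application calls `get` at request time, promoting or rolling back a version takes effect immediately. Tracking the active version per environment also lets dev, staging, and prod run different versions of the same prompt.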