Evaluation & Quality/Automated Evaluation
Intermediate · 9 min

Eval in CI/CD: Pytest & Vitest Integration

LangSmith's pytest plugin and Vitest/Jest integration bring LLM evaluation into your CI/CD pipeline — with fuzzy matching, embedding distance, test caching, and rich terminal output.

Quick Reference

  • @pytest.mark.langsmith decorator syncs test cases with LangSmith datasets automatically
  • expect() utility provides fuzzy matching: edit distance, embedding similarity, semantic match
  • Test caching skips unchanged examples in CI — only re-evaluates modified tests
  • Rich terminal output shows pass/fail with scores and diffs inline
  • Vitest/Jest integration for JavaScript/TypeScript projects with identical capabilities
  • Results sync to LangSmith for tracking regressions across commits

Why Eval in CI/CD?

LLM outputs are non-deterministic — the same input can produce different outputs across runs. Traditional unit tests with exact string matching break constantly. LangSmith's testing framework solves this with fuzzy assertions (embedding distance, semantic similarity) and statistical evaluation (pass rates across datasets, not individual examples).
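To make the contrast concrete, here is a minimal, library-free sketch of a fuzzy assertion: instead of exact equality, it scores string similarity with the standard-library `difflib` and passes when the score clears a threshold. The function name `fuzzy_match` and the threshold values are illustrative, not part of any API.

```python
from difflib import SequenceMatcher


def fuzzy_match(output: str, reference: str, threshold: float = 0.8) -> bool:
    """Pass if the normalized similarity ratio clears the threshold."""
    score = SequenceMatcher(None, output.lower(), reference.lower()).ratio()
    return score >= threshold


# Exact matching breaks on harmless rephrasing...
assert "Paris is the capital of France." != "The capital of France is Paris."

# ...but a score-based check tolerates it.
assert fuzzy_match("Paris is the capital of France.",
                   "The capital of France is Paris.", threshold=0.6)
```

Embedding-distance checks follow the same shape: compute a distance, compare it to a threshold, and report the score rather than a bare boolean.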

Traditional Tests                    LLM Eval Tests
assert output == 'exact string'      expect(output).to_semantic_match('meaning')
Pass/fail binary                     Score-based with thresholds
Deterministic                        Statistical (pass rate across dataset)
Run every time                       Cache unchanged examples
Local only                           Sync results to LangSmith
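The caching row can be illustrated with a self-contained sketch: fingerprint each example's inputs together with the test's own source, and skip re-evaluation when the fingerprint was already seen in a previous run. This is a conceptual cache-by-hash illustration under assumed names (`fingerprint`, `run_suite`), not LangSmith's actual cache implementation.

```python
import hashlib
import json


def fingerprint(example: dict, test_source: str) -> str:
    """Stable hash of an example's inputs plus the test code evaluating it."""
    payload = json.dumps({"example": example, "test": test_source}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_suite(examples, test_source, cache):
    """Evaluate only examples whose fingerprint is not already cached."""
    evaluated, skipped = [], []
    for ex in examples:
        key = fingerprint(ex, test_source)
        if key in cache:
            skipped.append(ex)        # unchanged since the last CI run
        else:
            cache[key] = "evaluated"  # the expensive LLM call would go here
            evaluated.append(ex)
    return evaluated, skipped


cache: dict = {}
examples = [{"q": "capital of France?"}, {"q": "capital of Japan?"}]
run_suite(examples, "v1", cache)                  # first run: both evaluated
ran, skipped = run_suite(examples, "v1", cache)   # second run: both skipped
```

Changing either an example's inputs or the test source yields a new fingerprint, so only modified tests are re-evaluated, which is what keeps LLM eval suites affordable in CI.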