Intermediate · 9 min
Guardrails
Validate and filter agent inputs and outputs using middleware hooks. Use before_agent for session-level input checks, after_agent for final output safety, and layer deterministic (regex) + model-based (LLM) guardrails for defense in depth.
Quick Reference
- before_agent: runs once per invocation; use for input blocking and auth checks
- after_agent: runs once after the loop ends; use for final output safety scanning
- return {'jump_to': 'end'}: short-circuits the agent immediately from any hook
- Deterministic guardrails (regex, keywords): fast and cheap
- Model-based guardrails (LLM classifier): catch semantic violations
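As a rough illustration of the two hooks, here is a minimal framework-free sketch. The hook names and the {'jump_to': 'end'} return value come from the list above; everything else is an assumption made for the example, including the state shape (a dict with a "messages" list of role/content dicts), the two regex patterns, and the convention that returning None means "continue". How hooks are registered with a real agent depends on your framework and is not shown.

```python
import re

# Hypothetical patterns for illustration; real rules depend on your policy.
BLOCKED_INPUT = re.compile(
    r"\b(ignore (all )?previous instructions|api[_ ]?key)\b", re.IGNORECASE
)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def before_agent(state):
    """Input check, run once per invocation.

    Assumed contract: return None to continue, or a dict to short-circuit.
    """
    last_input = state["messages"][-1]["content"]
    if BLOCKED_INPUT.search(last_input):
        # Skip the agent loop entirely.
        return {"jump_to": "end"}
    return None


def after_agent(state):
    """Output check, run once after the loop ends.

    Assumed contract: return None for no change, or a dict with updated messages.
    """
    final = state["messages"][-1]
    redacted = EMAIL.sub("[REDACTED_EMAIL]", final["content"])
    if redacted == final["content"]:
        return None
    # Replace only the final message, leaving earlier history untouched.
    updated = dict(final, content=redacted)
    return {"messages": state["messages"][:-1] + [updated]}
```

Calling before_agent with a state whose last message contains "ignore previous instructions" returns {'jump_to': 'end'}, while a benign message returns None and lets the agent proceed.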
Deterministic vs Model-Based
Guardrails fall into two categories. Deterministic guardrails use rule-based logic: regex, keyword matching, and schema validation. They are fast and incur no model cost, but a user can bypass them by rephrasing. Model-based guardrails use an LLM to evaluate content semantically; they catch nuanced violations but add latency and cost to every check. Layer both for defense in depth.
| Approach | Strengths | Weaknesses |
|---|---|---|
| Deterministic (regex/keywords) | Fast, free, predictable, auditable | Bypassed by paraphrasing, no semantic understanding |
| Model-based (LLM classifier) | Catches semantic violations, nuanced | Adds latency and cost per call |
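One way to layer the two approaches is to run the cheap deterministic check first and call the model only when it passes. The sketch below is a hedged illustration, not a reference implementation: the keyword list, the function names, and stub_classifier (a stand-in for a real LLM call) are all hypothetical.

```python
import re
from typing import Callable, Optional

# Hypothetical keyword list; tune to your own policy.
KEYWORDS = re.compile(r"\b(password|ssn|credit card)\b", re.IGNORECASE)


def deterministic_check(text: str) -> Optional[str]:
    """Return a reason string if a keyword rule matches, else None."""
    match = KEYWORDS.search(text)
    if match:
        return f"matched keyword: {match.group(0).lower()}"
    return None


def layered_guardrail(text: str, classify: Callable[[str], bool]) -> Optional[str]:
    """Run the cheap rule first; call the classifier only if the rule passes."""
    reason = deterministic_check(text)
    if reason is not None:
        return reason  # blocked without paying for a model call
    if classify(text):
        return "flagged by model-based classifier"
    return None  # both layers passed


def stub_classifier(text: str) -> bool:
    """Stand-in for an LLM classifier; a real one would call a model here."""
    return "secret login phrase" in text.lower()
```

Ordering the layers this way keeps the per-call model cost off any input the rules already reject, while paraphrased requests that slip past the keywords still reach the classifier.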