The Auto Mode Safety Classifier
Auto mode's safety classifier is what separates it from simply skipping permissions. Before every tool call, a model evaluates whether the action is safe given your environment. Understanding how it works lets you tune it for your project.
Quick Reference
- Classifier runs before every tool call in auto mode
- Uses Sonnet 4.6 to evaluate: tool + args + environment context + session history
- Three outcomes: auto-approve, advisory block (soft_deny), hard block
- Available on Max (Opus 4.7 only), Team, Enterprise, API (Sonnet 4.6+ or Opus 4.6+)
- Tune with autoMode config: environment description, allow list, soft_deny list
- Three diagnostic commands: claude auto-mode defaults, config, critique
- Entered research preview March 2026; thresholds may shift between versions
- Use auto mode for trusted autonomous tasks; dontAsk for scripted pipelines
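As a rough illustration of the tuning surface listed above, the sketch below assembles an autoMode settings fragment in Python and prints it as JSON. The three parts (environment description, allow list, soft_deny list) come from the list above; the exact key names, the value shapes, and every rule string are assumptions made for illustration, so compare against the output of claude auto-mode config before relying on any of them.

```python
import json


def build_auto_mode_settings(environment, allow, soft_deny):
    """Assemble a hypothetical autoMode settings fragment.

    Key names and value shapes are assumptions for illustration only;
    the source states just that the config carries an environment
    description, an allow list, and a soft_deny list.
    """
    return {
        "autoMode": {
            "environment": environment,
            "allow": list(allow),
            "soft_deny": list(soft_deny),
        }
    }


# All rule text below is invented example content, not recommended defaults.
settings = build_auto_mode_settings(
    environment="Local development checkout of an internal web service",
    allow=["Running the project's unit tests"],
    soft_deny=["Pushing commits to a remote repository"],
)

print(json.dumps(settings, indent=2))
```

Keeping the fragment in a small builder like this makes it easy to review the three parts side by side before copying them into a real settings file.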
What the Classifier Does
In auto mode, a classifier reviews every tool call before Claude Code executes it. The classifier is itself a model call: it receives the tool name, the tool's arguments, your environment description, and the session history. It returns one of three outcomes: approve the action, advisory-block it (soft deny), or hard-block it.
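The shape of that check can be sketched in a few lines of Python. This is a conceptual model only, not Claude Code's implementation: the type names, the gate function, and the rule-based toy_classifier are all hypothetical stand-ins, and the real classifier is a model call rather than a set of string rules.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    APPROVE = "approve"
    SOFT_DENY = "soft_deny"    # advisory block
    HARD_BLOCK = "hard_block"


@dataclass
class ClassifierInput:
    """The four inputs the text says the classifier receives."""
    tool_name: str
    arguments: dict
    environment: str
    session_history: list = field(default_factory=list)


def toy_classifier(request: ClassifierInput) -> Outcome:
    """Hypothetical stand-in for the model call, using made-up rules."""
    command = str(request.arguments.get("command", ""))
    if command.startswith("sudo "):
        return Outcome.HARD_BLOCK
    if "git push" in command:
        return Outcome.SOFT_DENY
    return Outcome.APPROVE


def gate(request: ClassifierInput, classify, run):
    """Run the tool call only when the classifier approves it."""
    outcome = classify(request)
    if outcome is Outcome.APPROVE:
        return run(request)
    return f"blocked ({outcome.value})"


request = ClassifierInput(
    tool_name="Bash",
    arguments={"command": "git push origin main"},
    environment="Local development checkout",
)
print(gate(request, toy_classifier, lambda r: "executed"))
```

The point of the sketch is the ordering: the verdict is computed from the specific call and its context before anything executes, which is the step a blanket permission bypass omits.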
This is the key difference from --dangerously-skip-permissions. That flag turns off all gates. Auto mode replaces static permission prompts with a dynamic judgment call: is this specific action, in this specific context, safe to run right now?
The classifier always runs on Sonnet 4.6, regardless of which model handles the main task. Each tool call therefore carries a small latency overhead, but that per-call check is what makes auto mode viable for longer tasks.