Middleware (v1.0+)
LangChain v1.0 introduced middleware — hooks that run before/after model calls. Message trimming, summarization, human-in-the-loop, and custom middleware via AgentMiddleware.
Quick Reference
- →from langchain.agents.middleware import AgentMiddleware, before_model, after_model
- →Middleware runs before/after every model call inside create_agent
- →Built-in: SummarizationMiddleware, ContextEditingMiddleware, HumanInTheLoopMiddleware
- →Custom middleware: subclass AgentMiddleware or use @before_model / @after_model decorators
- →Middleware composes — stack multiple instances in order
What Middleware Is
Middleware are hooks that run before and/or after every model call in your agent. Think of them as interceptors — they can modify the input, transform the output, or add side effects like logging and rate limiting.
| Hook | When It Runs | Common Use |
|---|---|---|
| before_agent | Once on invocation | Load memory, validate input |
| before_model | Before each LLM call | Trim history, redact PII |
| modify_model_request | Just before LLM call | Swap model, adjust tools, set response format |
| wrap_model_call | Wraps the full LLM call | Caching, retries, dynamic routing |
| wrap_tool_call | Wraps each tool execution | Inject context, gate tool access |
| after_model | After LLM responds, before tools run | Human-in-the-loop, output validation |
| after_agent | Once on completion | Save results, send notifications |
user input arrives
load memory, validate input — runs once
trim history, redact PII — before each LLM call
swap model, adjust tools, set response_format
waiting for model response…
caching, retries, dynamic routing — wraps full LLM call
HITL approval, output validation — runs in reverse
e.g. get_weather(city='Tokyo')
inject context, gate tool access, intercept result
save results, send notifications — runs once
agent.invoke() completes
middleware wraps every model + tool call — after_model runs in reverse order
Middleware composes: stack multiple instances and they run in order on the way in, then reverse order on the way out — the same onion pattern as Express.js or Python ASGI middleware.