LangChain/Memory & Middleware
Intermediate8 min

Middleware (v1.0+)

LangChain v1.0 introduced middleware — hooks that run before/after model calls. Message trimming, summarization, human-in-the-loop, and custom middleware via AgentMiddleware.

Quick Reference

  • from langchain.agents.middleware import AgentMiddleware, before_model, after_model
  • Middleware runs before/after every model call inside create_agent
  • Built-in: SummarizationMiddleware, ContextEditingMiddleware, HumanInTheLoopMiddleware
  • Custom middleware: subclass AgentMiddleware or use @before_model / @after_model decorators
  • Middleware composes — stack multiple instances in order

What Middleware Is

Interceptors for model calls

Middleware are hooks that run before and/or after every model call in your agent. Think of them as interceptors — they can modify the input, transform the output, or add side effects like logging and rate limiting.

HookWhen It RunsCommon Use
before_agentOnce on invocationLoad memory, validate input
before_modelBefore each LLM callTrim history, redact PII
modify_model_requestJust before LLM callSwap model, adjust tools, set response format
wrap_model_callWraps the full LLM callCaching, retries, dynamic routing
wrap_tool_callWraps each tool executionInject context, gate tool access
after_modelAfter LLM responds, before tools runHuman-in-the-loop, output validation
after_agentOnce on completionSave results, send notifications
agent.invoke()

user input arrives

before_agent

load memory, validate input — runs once

hook
before_model

trim history, redact PII — before each LLM call

hook
modify_model_request

swap model, adjust tools, set response_format

hook
LLM call

waiting for model response…

runs
wrap_model_call

caching, retries, dynamic routing — wraps full LLM call

hook
after_model ↑ reverse order

HITL approval, output validation — runs in reverse

hook
tool call

e.g. get_weather(city='Tokyo')

runs
wrap_tool_call

inject context, gate tool access, intercept result

hook
after_agent

save results, send notifications — runs once

hook
response returned

agent.invoke() completes

middleware wraps every model + tool call — after_model runs in reverse order

Middleware composes: stack multiple instances and they run in order on the way in, then reverse order on the way out — the same onion pattern as Express.js or Python ASGI middleware.