LangGraph/Control Flow
Intermediate14 min

Pre/Post Model Hooks

The right place to intercept every LLM call is not inside your nodes. LangGraph's pre_model_hook and post_model_hook, and LangChain 1.0's AgentMiddleware, give you a composable layer for context trimming, guardrails, cost tracking, and output validation — without polluting business logic.

Quick Reference

  • pre_model_hook runs before every LLM call — trim context, validate input, inject dynamic prompts
  • post_model_hook runs after every LLM call — track tokens, validate output, audit log
  • SummarizationMiddleware replaces manual trimming — configure trigger=("tokens", N) and it handles message-pair integrity
  • Hooks are a latency multiplier: every hook adds time to every LLM call — keep logic under 5ms, DB queries belong in nodes
  • AgentMiddleware composes: before_model runs forward through the list, after_model runs in reverse
  • Exceptions in hooks abort the agent.invoke() call — always catch anticipated failures explicitly
  • create_react_agent hooks (v2) are LangGraph-native; AgentMiddleware on create_agent is the LangChain 1.0 future

When (Not) to Intercept Model Calls

Before writing a hook, answer one question: does this concern need to run before or after every LLM call in this agent? If yes — it's cross-cutting, and a hook is the right place. If no — if it's conditional, depends on specific state, or routes between nodes — it belongs in a dedicated node or edge. Hooks that grow beyond lightweight cross-cutting concerns become invisible complexity: they run on every call, they compose in ways that aren't obvious from reading the graph, and they fail in ways the graph can't retry.

Is this concern cross-cutting?fires before/after every LLM call?NoYesDedicated nodecomplex logic, routingDB queries, heavy computationNeed composability?multiple concerns stackingNoYespre/post_model_hookon create_react_agentsingle concern, LangGraph-nativeAgentMiddlewareon create_agentcomposable, LangChain 1.0

choose before you write a line of hook code

The three-line test for hooks

A hook earns its place if: (1) it should run on every LLM call in this agent, (2) it doesn't need its own retry or error-handling path, and (3) it completes in under 5ms. Fail any of those three, and the logic belongs in a node.

Good candidates: trimming messages to fit the context window, injecting a current timestamp into the system prompt, logging token counts, redacting PII from inputs before they reach the model. Bad candidates: checking a database to decide which tool to enable, calling an external API for rate-limiting, running a slow embedding call to retrieve context. Those last three need their own nodes — with explicit retry policies, error branches, and observable state.