Node Caching

Cache expensive node results with CachePolicy on add_node(). This section covers cache backends for development vs. production, custom key_func, and the pickle deserialization CVE you need to patch.

Quick Reference

  • CachePolicy(ttl=N): set per-node on add_node() — NOT on compile()
  • compile(cache=InMemoryCache()): supply the cache backend separately from the checkpointer
  • key_func: custom function to control which state fields form the cache key (default: pickle hash of full input)
  • @task(cache_policy=CachePolicy(ttl=N)): functional API equivalent of per-node caching
  • __metadata__['cached']: set to True in stream_mode='updates' output when a node result came from the cache
  • cache.clear(): manually invalidate all entries — required after deploying new node logic
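
To make the key_func idea concrete, here is a minimal, library-independent sketch of a key function that hashes only the fields that matter. The field names (question, model, request_id) and both helper functions are hypothetical, and JSON is used in place of pickle purely to keep the sketch simple. A function shaped like selective_key is the kind of callable key_func is meant to accept, but check the exact signature your LangGraph version expects before wiring it in.

```python
import hashlib
import json

# Hypothetical state fields that should decide cache identity.
CACHE_FIELDS = ("question", "model")


def selective_key(state: dict) -> str:
    """Build a key from CACHE_FIELDS only, ignoring every other field."""
    relevant = {name: state.get(name) for name in CACHE_FIELDS}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def whole_input_key(state: dict) -> str:
    """For contrast: hash the full input, so any changed field changes the key."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = {"question": "What is a TTL?", "model": "small", "request_id": "a1"}
retry = {"question": "What is a TTL?", "model": "small", "request_id": "b2"}

# Same question and model, different request_id.
print(selective_key(first) == selective_key(retry))      # True: would be a cache hit
print(whole_input_key(first) == whole_input_key(retry))  # False: would be a miss
```

A per-request field such as an ID or timestamp is enough to turn every call into a miss when the key covers the whole input, which is the main reason to narrow the key.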

When to Use Node Caching (and When Not To)

  • Pure node (no side effects, deterministic)? No: do not cache (side effects break). Yes: continue.
  • Expensive (>$0.01/call or >500ms)? No: skip caching (overhead > savings). Yes: continue.
  • Inputs frequently repeated (retry loops, duplicate queries)? No: skip caching (low hit rate). Yes: cache it and pick a backend:
  • InMemoryCache: dev / testing only; lost on restart; not shared across workers
  • SqliteCache: local persistence; single process; ok for single-worker
  • PostgresCache: production; multi-worker safe; survives restarts

Should I cache this node? Pure → Expensive → Repeated inputs → pick backend
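
The decision flow above can be written down as a small function, which is handy for reviewing a graph node by node. This is only a sketch: NodeProfile, should_cache, pick_backend, and their flags are hypothetical names introduced here, the thresholds are the ones quoted in the flow, and the backends are returned as plain strings rather than imported classes.

```python
from dataclasses import dataclass


@dataclass
class NodeProfile:
    """Hypothetical description of a node, filled in by hand."""
    pure: bool                # no side effects, deterministic
    cost_per_call_usd: float
    latency_ms: float
    inputs_repeat: bool       # retry loops, duplicate queries


def should_cache(node: NodeProfile) -> tuple[bool, str]:
    """Walk the flow: Pure -> Expensive -> Repeated inputs."""
    if not node.pure:
        return False, "do not cache: side effects break"
    expensive = node.cost_per_call_usd > 0.01 or node.latency_ms > 500
    if not expensive:
        return False, "skip caching: overhead > savings"
    if not node.inputs_repeat:
        return False, "skip caching: low hit rate"
    return True, "cache it"


def pick_backend(production: bool, needs_persistence: bool) -> str:
    """Map the deployment situation to a backend name from the flow."""
    if production:
        return "PostgresCache"   # multi-worker safe, survives restarts
    if needs_persistence:
        return "SqliteCache"     # local persistence, single process
    return "InMemoryCache"       # dev / testing only, lost on restart


llm_node = NodeProfile(pure=True, cost_per_call_usd=0.03,
                       latency_ms=1200, inputs_repeat=True)
db_write = NodeProfile(pure=False, cost_per_call_usd=0.0,
                       latency_ms=40, inputs_repeat=True)

print(should_cache(llm_node))
print(should_cache(db_write))
print(pick_backend(production=False, needs_persistence=False))
```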

Node Type | Cache? | Why
LLM calls | Yes | Expensive, often repeated with same input in retry loops
Embedding computation | Yes | Deterministic and expensive at scale: same text always produces same vector
Idempotent API lookups | Maybe | Only if response is stable: reference data yes, live prices no
DB writes | Never | Side effect: caching skips the write, data never gets persisted
Time-dependent logic | Never | Stale cache returns wrong results: current time, market prices, live feeds
Random/sampling | Never | Caching defeats randomness: same input should produce different outputs

Never cache nodes with side effects

The cached result replaces execution — the side effect will not happen. If a node sends an email, writes to a database, or charges a payment, caching it means the action silently disappears. The graph sees a successful result; the real-world effect never occurred.
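
The warning above is easy to reproduce without any framework. The sketch below wraps a fake "send email" node in a tiny TTL cache. with_ttl_cache is a hypothetical stand-in written for this example, not LangGraph's cache, and sent_emails plays the role of the real-world effect.

```python
import time

sent_emails = []  # stands in for a real side effect (SMTP call, DB write, charge)


def send_email_node(state: dict) -> dict:
    """A node with a side effect: it 'sends' an email, then reports success."""
    sent_emails.append(state["to"])
    return {"status": "sent"}


def with_ttl_cache(fn, ttl_seconds: float):
    """Hypothetical minimal TTL cache keyed on the node input."""
    store = {}

    def wrapper(state: dict) -> dict:
        key = tuple(sorted(state.items()))
        now = time.monotonic()
        hit = store.get(key)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]          # cached result: fn is NOT executed
        result = fn(state)
        store[key] = (now, result)
        return result

    return wrapper


cached_send = with_ttl_cache(send_email_node, ttl_seconds=60)

first = cached_send({"to": "ada@example.com"})
second = cached_send({"to": "ada@example.com"})

print(first, second)     # both calls report {'status': 'sent'}
print(len(sent_emails))  # 1: the second email was never sent
```

Both calls look successful to the caller, yet only one email went out, which is exactly the silent failure described above.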

Caching has the highest payoff in two scenarios: iterative agent loops (a node is called repeatedly with unchanged input, for example when a tool call fails and the loop retries) and multi-user workloads (thousands of users asking similar questions within the TTL window). Outside those two cases, run the cost math before adding caching complexity.
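
One way to run that cost math is a back-of-the-envelope model: each call pays a fixed caching overhead (key hashing, lookup, storage), and a hit saves the full cost of executing the node. The model and every number below are assumptions for illustration, not measurements; substitute your own per-call cost, overhead, and observed hit rate.

```python
def expected_savings_per_call(call_cost: float, hit_rate: float,
                              overhead: float) -> float:
    """Average saving per call: hits avoid call_cost, every call pays overhead."""
    return hit_rate * call_cost - overhead


def break_even_hit_rate(call_cost: float, overhead: float) -> float:
    """Hit rate at which caching neither saves nor loses anything."""
    return overhead / call_cost


# Hypothetical numbers: a $0.02 LLM call, $0.001 of caching overhead per call.
call_cost = 0.02
overhead = 0.001

savings = expected_savings_per_call(call_cost, hit_rate=0.30, overhead=overhead)
threshold = break_even_hit_rate(call_cost, overhead)

print(f"saving per call at 30% hits: ${savings:.4f}")
print(f"break-even hit rate: {threshold:.0%}")
```

If the hit rate you expect sits below the break-even value, caching costs more than it saves under this model.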