Intermediate · 14 min

Custom State

Production agents carry more than messages. Learn when to extend AgentState, how to design schemas that don't blow up your token budget, and how to avoid the serialization and state-explosion bugs that only show up after you deploy.

Quick Reference

→Default AgentState: messages (list) + remaining_steps (int) — extend, never replace
→state_schema on create_agent: quick path when only tools need the extra fields
→state_schema on a middleware class: preferred when middleware hooks also read/write the fields
→Custom fields are readable at every stage: invoke → middleware → LLM → tool → middleware → response
→Never store raw API responses, documents, or secrets in state — store IDs/references instead
→Unbounded list fields grow per turn and silently inflate token costs — cap or prune them
→Custom state must be JSON-serializable: no lambdas, open file handles, or DB connections

When Custom State Is the Wrong Tool

Before adding a custom field, ask: does this data need to travel through every middleware hook, every tool call, and every LLM pass? If not, it doesn't belong in state. State is not a general-purpose data store — it's a shared context object that gets serialized, logged, and re-injected on every agent turn.

Three things that should never live in state

1. Secrets and tokens: state passes through middleware and tools unencrypted and appears in logs. Use environment variables or a secrets manager and look them up at call time.
2. Raw API responses and full document text: a 50 KB search result dumped into state adds ~12,500 tokens to every subsequent LLM call. Store the document ID and fetch it when needed.
3. Database connections and open file handles: state must be JSON-serializable. A live connection object will fail serialization silently or raise at checkpoint time.
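The token arithmetic in item 2 can be sketched as follows. This is a rough estimate only: it assumes about 4 characters per token (a common rule of thumb for English text, not an exact tokenizer count), and the function names and the document ID are hypothetical.

```python
# Rough heuristic, not a real tokenizer: ~4 characters per token is an assumption.
CHARS_PER_TOKEN = 4


def estimate_tokens(num_chars: int) -> int:
    """Approximate token count for a payload of num_chars characters."""
    return num_chars // CHARS_PER_TOKEN


def extra_tokens_over_calls(num_chars: int, llm_calls: int) -> int:
    """Total extra input tokens if the payload rides along on every LLM call."""
    return estimate_tokens(num_chars) * llm_calls


raw_result_chars = 50_000         # a 50 KB search result kept in state
doc_id_chars = len("doc_8f3a2c")  # hypothetical ID stored instead of the text

print(estimate_tokens(raw_result_chars))              # 12500
print(extra_tokens_over_calls(raw_result_chars, 10))  # 125000
print(estimate_tokens(doc_id_chars))                  # 2
```

Under these assumptions, ten LLM calls with the raw result in state cost about 125,000 extra input tokens, while the ID costs almost nothing per call.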

Use case	In state?	Better approach
User ID for the session	Yes	—
User's tier (free/pro/enterprise)	Yes	—
Number of tool calls this turn	Yes (int)	—
Full text of a retrieved document	No	Store doc_id; fetch in tool
Auth token for downstream API	No	Read from env at call time
List of all past search results	No	Store result IDs + summary
Database connection object	No	Instantiate inside the tool
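One way to catch the "No" rows before they reach a checkpoint is a small serializability check. The sketch below uses only the standard library; `unserializable_fields` and the field names are hypothetical, and `object()` stands in for a live database connection.

```python
import json
from typing import Any


def unserializable_fields(state: dict[str, Any]) -> list[str]:
    """Return the names of top-level fields that json.dumps cannot encode."""
    bad = []
    for key, value in state.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):  # ValueError covers circular references
            bad.append(key)
    return bad


# IDs, tiers, and counters serialize cleanly.
good_state = {"user_id": "u_123", "tier": "pro", "tool_calls_this_turn": 2, "doc_id": "doc_42"}

# object() stands in for a DB connection; the lambda is a callback stored by mistake.
bad_state = {"user_id": "u_123", "db_conn": object(), "on_done": lambda: None}

print(unserializable_fields(good_state))  # []
print(unserializable_fields(bad_state))   # ['db_conn', 'on_done']
```

Running a check like this in a unit test, or at the start of a hook while developing, surfaces the problem before deployment rather than at checkpoint time.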

What AgentState Gives You by Default

Every LangChain agent carries an AgentState object through its lifecycle. Two fields come built in: messages holds the full message history with an add_messages reducer (appends rather than replaces), and remaining_steps is an integer that the framework decrements each loop iteration to prevent runaway agents. Your custom TypedDict must subclass AgentState — never replace it — or these built-ins disappear.
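The difference between extending and replacing can be illustrated with plain `TypedDict` classes. In this sketch, `AgentState` is a local stand-in that mirrors the two fields described above, not the framework's real class (which also attaches the add_messages reducer); `SupportState` and `ReplacedState` are hypothetical names.

```python
from typing import Any, TypedDict, get_type_hints


class AgentState(TypedDict):
    """Stand-in for the framework's AgentState, NOT the real class."""
    messages: list[Any]
    remaining_steps: int


class SupportState(AgentState):
    """Hypothetical custom schema: extends the base and adds two flat fields."""
    user_id: str
    tier: str


class ReplacedState(TypedDict):
    """Anti-pattern: a fresh TypedDict has no built-in fields at all."""
    user_id: str
    tier: str


print(sorted(get_type_hints(SupportState)))   # ['messages', 'remaining_steps', 'tier', 'user_id']
print(sorted(get_type_hints(ReplacedState)))  # ['tier', 'user_id']
```

In a real project you would import AgentState from the framework rather than define it; the exact import path depends on the installed version, so check the library's reference before copying this shape.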

Designing Your State Schema

How you structure state fields determines whether your agent is debuggable and cheap to run. Two rules cover 90% of production issues: keep fields flat when possible, and never use a list field that can grow without bound.
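Both rules can be sketched in a few lines. The helper below is a hypothetical merge function, not a framework API: it appends new items and keeps only the most recent ones, so the list has a fixed upper bound. The field names and the cap of 3 are illustrative assumptions.

```python
def append_capped(existing: list, new: list, max_items: int = 20) -> list:
    """Append new items, then keep only the most recent max_items."""
    if max_items <= 0:
        return []  # guard: a slice of [-0:] would return the whole list
    return (existing + new)[-max_items:]


# Flat top-level fields rather than one nested blob; the list holds IDs, not payloads.
state = {"user_id": "u_123", "tier": "pro", "recent_result_ids": []}

for turn in range(5):
    state["recent_result_ids"] = append_capped(
        state["recent_result_ids"], [f"res_{turn}"], max_items=3
    )

print(state["recent_result_ids"])  # ['res_2', 'res_3', 'res_4']
```

After five turns the list still holds three entries, so its contribution to each LLM call stays constant instead of growing with the conversation.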
