LangChain/Advanced
Advanced16 min

Context Engineering in Agents

Context engineering is the discipline of curating the smallest set of high-signal tokens that maximize the probability of a good outcome. In LangChain, this means deciding what goes into every model call, how tools read and write state, and what happens between steps — using State, Store, and Runtime Context as your three levers.

Quick Reference

  • Context engineering = smallest possible set of high-signal tokens that maximize the probability of a good outcome
  • Four strategies: Write (inject), Select (retrieve), Compress (summarize), Isolate (sub-agents)
  • Three context types: Model Context (transient, per-call), Tool Context (persistent), Life-cycle Context (between steps)
  • Three data sources: State (conversation-scoped), Store (cross-session), Runtime Context (deploy-time config, read-only)
  • Middleware hook order: before_model runs top→bottom, after_model runs bottom→top
  • SummarizationMiddleware uses message_threshold (not token triggers) — verify before deploying
  • Context rot is real: model accuracy degrades as context grows even within the window limit

What Is Context Engineering

Context engineering is the set of strategies for curating and maintaining the optimal set of tokens during LLM inference. It's broader than prompt engineering, which focuses on how to write a system prompt. Context engineering includes everything that reaches the model: the system prompt, conversation history, tool schemas, retrieved documents, injected data, and what gets compressed or discarded. The goal is not the biggest context — it's the most useful context.

Why agents fail

Most agent failures are not model failures — they're context failures. The model saw the wrong information, too much irrelevant information, or information formatted in a way it couldn't act on. Engineering the context fixes these failures; upgrading the model usually doesn't.

Anthropic's engineering team distills context engineering into four moves, each targeting a different root cause of context bloat or context rot. The existing `context-strategies` diagram covers these well:

Context Windowlimited tokens availableWriteInject prompts, tools, docsAdd information into contextsystem msg + few-shot + RAGSelectRetrieve only what's relevantFilter before sending to LLMvector search + rerankingCompressSummarize, trim, reduceFit more into fewer tokenstrim_messages + summarizeIsolateSeparate into subgraphsEach agent gets own contextmulti-agent + subgraphs

4 strategies for managing the context window

StrategyWhat It DoesWhen to Use It
WriteInject prompts, tools, few-shot examples into contextEvery call — this is the baseline
SelectRetrieve only what's relevant via vector search or rerankingWhen the knowledge base is larger than what fits in context
CompressSummarize or trim conversation history to reduce token countLong-running agents, conversations > 20 turns
IsolateDelegate subtasks to sub-agents with clean context windowsParallel workstreams, tasks requiring fresh perspective