Context Engineering in Agents
Context engineering is the discipline of curating the smallest set of high-signal tokens that maximize the probability of a good outcome. In LangChain, this means deciding what goes into every model call, how tools read and write state, and what happens between steps — using State, Store, and Runtime Context as your three levers.
Quick Reference
- →Context engineering = smallest possible set of high-signal tokens that maximize the probability of a good outcome
- →Four strategies: Write (inject), Select (retrieve), Compress (summarize), Isolate (sub-agents)
- →Three context types: Model Context (transient, per-call), Tool Context (persistent), Life-cycle Context (between steps)
- →Three data sources: State (conversation-scoped), Store (cross-session), Runtime Context (deploy-time config, read-only)
- →Middleware hook order: before_model runs top→bottom, after_model runs bottom→top
- →SummarizationMiddleware uses message_threshold (not token triggers) — verify before deploying
- →Context rot is real: model accuracy degrades as context grows even within the window limit
What Is Context Engineering
Context engineering is the set of strategies for curating and maintaining the optimal set of tokens during LLM inference. It's broader than prompt engineering, which focuses on how to write a system prompt. Context engineering includes everything that reaches the model: the system prompt, conversation history, tool schemas, retrieved documents, injected data, and what gets compressed or discarded. The goal is not the biggest context — it's the most useful context.
Most agent failures are not model failures — they're context failures. The model saw the wrong information, too much irrelevant information, or information formatted in a way it couldn't act on. Engineering the context fixes these failures; upgrading the model usually doesn't.
Anthropic's engineering team distills context engineering into four moves, each targeting a different root cause of context bloat or context rot. The existing `context-strategies` diagram covers these well:
4 strategies for managing the context window
| Strategy | What It Does | When to Use It |
|---|---|---|
| Write | Inject prompts, tools, few-shot examples into context | Every call — this is the baseline |
| Select | Retrieve only what's relevant via vector search or reranking | When the knowledge base is larger than what fits in context |
| Compress | Summarize or trim conversation history to reduce token count | Long-running agents, conversations > 20 turns |
| Isolate | Delegate subtasks to sub-agents with clean context windows | Parallel workstreams, tasks requiring fresh perspective |