LangChain v1.2
The developer interface for building with LLMs. One API for every model, composable chains, tools, memory, and structured output.
The ecosystem, the architecture, and why it exists. One API for every LLM, composable chains via LCEL, and the stability guarantees of v1.0+.
Everything that changed in LangChain v1: create_agent replaces create_react_agent, ToolRuntime replaces InjectedState, middleware replaces hooks, and TypedDict is the only state type.
Three tools, one ecosystem. LangChain is the framework, LangGraph is the runtime, Deep Agents is the batteries-included harness. Here is when to use each.
How ChatModel works under the hood. Provider packages, model initialization, streaming, and the invoke/ainvoke interface.
Messages are the fundamental unit of context in LangChain. HumanMessage, AIMessage, SystemMessage, and ToolMessage carry content, tool calls, and metadata through every model interaction.
Provider-agnostic access to reasoning traces, citations, images, and text via the new content_blocks property on messages — no more per-provider parsing.
The pipe operator (|) composes Runnables into chains. Lazy evaluation, type safety, and the full Runnable interface.
RunnableParallel, RunnableBranch, RunnableLambda, fallbacks, retry logic, and dynamic routing within LCEL chains.
The callback system, BaseCallbackHandler, on_llm_start/end, on_tool_start/end, tracing integration, custom instrumentation.
Configure temperature, max_tokens, retries, timeouts, and rate limiting when initializing a model. Track token usage across multiple models with UsageMetadataCallbackHandler.
Process multiple independent inputs in parallel with .batch(). Use batch_as_completed() to stream results as they finish and max_concurrency to control parallelism.
Create a single model instance that can be swapped at runtime via config. Use configurable_fields to expose temperature, model name, and provider as runtime parameters — no code changes needed.
Pass images, audio, and files to multimodal models using content blocks. Build mixed text-and-image messages with HumanMessage content arrays — no special model class required.
Reasoning models (o3, Claude with extended thinking) emit internal thought steps before the final answer. Access reasoning via content_blocks, control effort with budget_tokens, and stream thinking tokens in real time.
Some providers (Anthropic, OpenAI) offer built-in tools like web search that execute server-side — the provider runs them, not your code. Bind them the same way as local tools; results come back as server_tool_result content blocks.
Run models locally with Ollama — no API keys, no network calls, no data leaving your machine. The same init_chat_model() interface works; swap the provider prefix and everything else stays identical.
When a model returns tool calls, execute them and pass results back as ToolMessages. This three-step cycle — invoke → execute → invoke again — is the foundation of every tool-using agent.
ChatPromptTemplate, MessagesPlaceholder, few-shot prompting, and variable injection — everything you need to write prompts that produce consistent results.
with_structured_output() turns any model into a typed data extractor. Pydantic schemas, JSON mode, and provider-specific strategies.
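A minimal sketch of the pattern; the model call itself is shown only in comments because it needs a live provider, and the example input sentence is illustrative:

```python
from pydantic import BaseModel

class Person(BaseModel):
    """Schema the model must fill in."""
    name: str
    age: int

# With a real chat model this becomes a typed extractor:
#   structured = model.with_structured_output(Person)
#   person = structured.invoke("Alice is 30 years old.")
# The return value is a validated Person instance, not free text;
# Pydantic enforces the field types:
person = Person(name="Alice", age=30)
print(person.age)  # 30
```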
StrOutputParser is the everyday default. JsonOutputParser is the fallback for streaming JSON or models without tool calling. PydanticOutputParser is legacy — use with_structured_output() instead.
The @tool decorator, BaseTool, tool schemas from docstrings, bind_tools(), and the tool-call message cycle.
The extras attribute (v1.2), Anthropic programmatic tool calling, OpenAI strict schemas, and advanced tool patterns.
Not every tool should be available in every situation. Filter tools at runtime based on auth state, permissions, feature flags, or conversation stage using @wrap_model_call middleware.
By default, tool errors crash the agent. Use @wrap_tool_call to intercept failures, return actionable error messages, and implement retry logic — all without touching the tool itself.
How tool names, descriptions, and error surfaces affect model selection accuracy. Designing composable, token-efficient tools that help the model choose correctly and chain reliably.
How parallel tool calling works across providers, executing concurrent tool calls with asyncio, handling dependencies and failures, and optimizing the tradeoff between round trips and token cost.
ToolNode is the prebuilt LangGraph node that executes tools in a graph. ToolRuntime gives tools access to conversation state, immutable context, long-term store, and streaming — without those values appearing in the tool's schema.
How LangChain handles conversation memory. RunnableWithMessageHistory wraps any chain to automatically persist and inject session history — no magic classes, just explicit messages.
Long conversations exceed context windows. Use @before_model middleware to trim or rebuild history, RemoveMessage to delete specific messages, and SummarizationMiddleware to compress old turns into a summary.
BaseChatMessageHistory is the interface every storage backend implements. Swap from in-memory to Redis or SQL with a one-line change in your factory function.
LangChain v1.0 introduced middleware — hooks that run before/after model calls. Message trimming, summarization, human-in-the-loop, and custom middleware via AgentMiddleware.
LangChain and Deep Agents ship 15+ production-ready middleware for reliability, cost control, security, and agentic capabilities. Use them individually or stacked to cover cross-cutting concerns without touching your agent's core logic.
Build custom middleware with node-style hooks (before/after) for state updates and wrap-style hooks (wrap_model_call, wrap_tool_call) for retry, caching, and request mutation. Use request.override() to change the model or tools per call.
create_agent builds a graph-based agent runtime on top of LangGraph. Give it a model and tools — it handles the reasoning loop, tool dispatch, and stopping conditions.
Shape how your agent approaches tasks with a system prompt. Static strings for fixed personas, SystemMessage for provider features like prompt caching, and @dynamic_prompt for runtime-generated prompts.
Agents track more than messages. Extend AgentState with custom fields to carry user preferences, task progress, or any data your tools and middleware need across the conversation.
Route to cheaper models for simple turns and powerful models for complex ones. @wrap_model_call intercepts every LLM request and lets you swap the model based on state, context, or cost targets.
Surface real-time agent progress to users. Choose stream_mode='updates' for step-by-step progress, 'messages' for LLM tokens, or 'custom' for arbitrary signals from inside tools. Pass version='v2' for a unified chunk format.
Make agents return typed Pydantic objects, dataclasses, or dicts instead of free text. Use ProviderStrategy for native schema enforcement or ToolStrategy for any tool-calling model — with automatic validation retries built in.
Loading data from PDF, CSV, Notion, Slack, Google Drive, web pages. The DocumentLoader interface, lazy_load(), aload().
RecursiveCharacterTextSplitter, chunk_size, chunk_overlap, splitting strategies for different content types.
The Embeddings interface, embed_documents(), embed_query(), choosing a model (text-embedding-3-small vs large), dimensionality.
Storing and querying embeddings, similarity_search(), Pinecone, Chroma, pgvector, FAISS. When to use which.
Retriever vs VectorStore, as_retriever(), custom retrievers, contextual compression, multi-query retriever, ensemble retriever.
Context engineering is the #1 job of AI engineers. LangChain's agent abstractions are built around three context types (Model, Tool, Life-cycle) and three data sources (State, Store, Runtime Context).
Validate and filter agent inputs and outputs using middleware hooks. Use before_agent for session-level input checks, after_agent for final output safety, and layer deterministic (regex) + model-based (LLM) guardrails for defense in depth.
The Runtime object provides dependency injection for tools and middleware. Pass context_schema to create_agent, inject per-invocation data (user ID, connections) via context=, and access it anywhere via runtime.context.
MCP is an open protocol for exposing tools, resources, and prompts to LLMs. Use langchain-mcp-adapters to connect any MCP server to a LangChain agent.
ToolRuntime replaces InjectedState, InjectedStore, and InjectedConfig with a single typed parameter — giving tools access to state, context, store, stream_writer, and tool_call_id.