
LangMem SDK & Store API

LangMem is an LLM-powered extraction layer that automatically identifies and persists structured facts from conversations. This article covers when to use it (and when not to), all three APIs with correct signatures, cost analysis, memory quality evaluation, failure modes, and GDPR deletion.

Quick Reference

  • LangMem uses a full LLM call to extract structured memories — every invocation costs money and adds latency
  • Three API tiers: create_memory_manager (stateless), create_memory_store_manager (stateful, recommended), and memory tools (agent-driven)
  • Model strings require provider prefix: "anthropic:claude-sonnet-4-6", not "claude-sonnet-4-6"
  • The schemas parameter accepts Pydantic models to enforce typed extraction (UserPreference, UserFact, etc.)
  • create_memory_store_manager handles search-extract-persist in one call — the recommended default for production
  • Memory tools (create_manage_memory_tool + create_search_memory_tool) let the agent decide what to remember and when to search
  • Each extraction with Sonnet costs roughly $0.009 for a 10-message conversation and roughly $0.044 for a 50-message conversation; switching to Haiku makes extraction about 15× cheaper
  • Always run extraction as a background task — never block the user-facing response path on a memory LLM call
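
The per-extraction cost figures above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the token counts, the per-million-token prices, and the daily volume are assumptions rather than values from LangMem or from any price sheet, chosen as one hypothetical split that lands near the quoted 10-message figure.

```python
def extraction_cost_usd(input_tokens: int, output_tokens: int,
                        usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost of one extraction call from token counts and per-million-token prices."""
    return (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000


# All numbers below are assumptions for illustration; substitute your own
# measured token counts and your provider's current prices.
ASSUMED_INPUT_PRICE = 3.00     # USD per million input tokens (assumption)
ASSUMED_OUTPUT_PRICE = 15.00   # USD per million output tokens (assumption)
ASSUMED_DAILY_EXTRACTIONS = 10_000

per_call = extraction_cost_usd(2_000, 200, ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)
per_day = per_call * ASSUMED_DAILY_EXTRACTIONS

print(f"per call: ${per_call:.4f}, per day: ${per_day:.2f}")
```

Running the same function with token counts measured from your own traces gives a more trustworthy daily budget than any rule of thumb.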

Should You Use LangMem?

LangMem extracts structured memories from conversations, and every extraction is an LLM call. Before adopting it, answer two questions: (1) Is memory extraction complex enough to warrant an LLM, or can you write a 20-line deterministic parser? (2) Can you afford an extra LLM call per conversation turn or per conversation end?
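
To make the first question concrete, here is a minimal sketch of what a short deterministic parser might look like. The field names, the regular expressions, and the phrasing they match are hypothetical; a real parser would be tuned to how your users actually state preferences.

```python
import re

# Hypothetical patterns for two simple preference fields.
_PATTERNS = {
    "timezone": re.compile(r"\bmy time ?zone is ([A-Za-z_]+/[A-Za-z_]+|UTC)", re.IGNORECASE),
    "language": re.compile(r"\b(?:reply|respond|answer|speak) in ([A-Za-z]+)\b", re.IGNORECASE),
}


def parse_preferences(message: str) -> dict[str, str]:
    """Return any simple preference fields found in a single user message."""
    found = {}
    for field, pattern in _PATTERNS.items():
        match = pattern.search(message)
        if match:
            found[field] = match.group(1)
    return found


if __name__ == "__main__":
    print(parse_preferences("My timezone is Europe/Berlin, please reply in German."))
    # A nuanced preference yields nothing, which is the signal to consider an LLM.
    print(parse_preferences("I prefer responses that feel collegial but precise"))
```

If a parser like this covers most of your memory writes, the resulting dict can be written with a plain store.put() call and no LLM is involved.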

  • Is memory extraction complex enough to need an LLM call? If no: manual store.put() (deterministic, no cost). If yes: continue.
  • Should the agent decide what to remember autonomously? If yes: Memory Tools API (create_manage_memory_tool + create_search_memory_tool). If no: continue.
  • Do you need full control over persistence or custom storage logic? If yes: create_memory_manager (stateless, you own persistence). If no: create_memory_store_manager (stateful: search + extract + persist). ✓ Recommended for most production apps.

Choose based on how much lifecycle management you want LangMem to own.
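
The decision flow above can also be written as a small function, which is handy for documenting the choice in a design review. This is only a sketch: the function name and its boolean parameters are hypothetical, and the returned strings simply name the options from the flow.

```python
def choose_memory_api(needs_llm: bool,
                      agent_decides: bool = False,
                      custom_persistence: bool = False) -> str:
    """Walk the three questions in order and return the suggested option."""
    if not needs_llm:
        return "manual store.put()"
    if agent_decides:
        return "memory tools (create_manage_memory_tool + create_search_memory_tool)"
    if custom_persistence:
        return "create_memory_manager"
    # Default branch: let LangMem own search, extraction, and persistence.
    return "create_memory_store_manager"


if __name__ == "__main__":
    print(choose_memory_api(needs_llm=True))
    print(choose_memory_api(needs_llm=False))
```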

Use LangMem when...

  • You need to extract nuanced facts requiring language understanding ("I prefer responses that feel collegial but precise")
  • You want automatic conflict resolution across extraction calls (enable_updates=True)
  • You use LangGraph and want native BaseStore integration
  • You need structured memory schemas (Pydantic models) to enforce extraction types
  • You want agent-driven memory where the agent decides what to remember (tools API)

Skip LangMem when...

  • Your memory needs are 3-5 simple preference fields (language, timezone); a deterministic parser is cheaper
  • You need sub-second memory updates; LangMem extraction takes 3-60 seconds
  • You are framework-agnostic and don't want the LangChain dependency
  • Your daily conversation volume makes per-conversation LLM calls cost-prohibitive even at the cheapest model
  • You already have a working extraction prompt and just need store.put() calls

LangMem is an LLM call, not a database call

Each create_memory_manager invocation sends your conversation + existing memories to an LLM and waits for structured output. On longer conversations, reported p95 latency can reach 60 seconds. Budget accordingly and always run extraction off the user-facing path.
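
One way to keep extraction off the user-facing path is to schedule it as a background task and return the reply immediately. The sketch below uses only asyncio; extract_memories is a hypothetical stand-in for the slow LLM call (it is not a LangMem function), and the 90-second timeout is an assumed budget set above the p95 figure mentioned above.

```python
import asyncio

EXTRACTION_TIMEOUT_S = 90.0  # assumed budget; tune to your own latency traces


async def extract_memories(messages: list[str]) -> list[str]:
    """Hypothetical stand-in for a slow LLM-backed extraction call."""
    await asyncio.sleep(0.05)  # simulated latency
    return [m for m in messages if "prefer" in m.lower()]


# Keep strong references so pending tasks are not garbage-collected.
_background_tasks: set = set()


async def handle_turn(messages: list[str]) -> str:
    """Schedule extraction in the background and reply without waiting for it."""
    task = asyncio.create_task(
        asyncio.wait_for(extract_memories(messages), timeout=EXTRACTION_TIMEOUT_S)
    )
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return "reply sent"


async def main() -> tuple:
    reply = await handle_turn(["I prefer short answers", "What is the weather?"])
    still_running = len(_background_tasks)  # extraction has not finished yet
    extracted = await asyncio.gather(*_background_tasks)  # drain before shutdown
    return reply, still_running, extracted


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the timeout fires, the background task raises asyncio.TimeoutError, so production code would want to catch and log that instead of letting it surface at shutdown.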

Real project

A team building a customer support agent initially used LangMem to extract all user facts. After reviewing two weeks of LangSmith traces, they found that 80% of memory writes were simple preference updates (language, timezone) that a deterministic parser handled in under 10 ms. They kept LangMem only for the remaining 20%, the complex facts requiring LLM reasoning, and cut their memory extraction costs by 4×.
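
The hybrid setup described above can be sketched as a small router: try the deterministic parser first, and queue only the leftovers for LLM extraction. Everything here is hypothetical, including the parser, the message format it recognizes, and the plain dict and list standing in for a real store and a background extraction queue; it does not reproduce the team's actual code.

```python
from typing import Callable, Optional


def parse_simple_preference(message: str) -> Optional[dict]:
    """Hypothetical parser: recognizes only 'set language to <value>'."""
    prefix = "set language to "
    if message.lower().startswith(prefix):
        return {"language": message[len(prefix):].strip()}
    return None


def route_memory_write(message: str,
                       parser: Callable[[str], Optional[dict]],
                       store: dict,
                       llm_queue: list) -> str:
    """Write parsed fields directly; defer everything else to LLM extraction."""
    parsed = parser(message)
    if parsed is not None:
        store.update(parsed)   # cheap, deterministic path
        return "deterministic"
    llm_queue.append(message)  # expensive path, processed later in the background
    return "llm"


if __name__ == "__main__":
    store, llm_queue = {}, []
    for msg in ["Set language to French",
                "I like replies that feel collegial but precise"]:
        print(route_memory_write(msg, parse_simple_preference, store, llm_queue))
```

Logging the returned route per message gives the same kind of split the team read off their traces, which tells you whether the LLM path is earning its cost.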
