Retrievers
Retrievers wrap vector stores in LangChain's Runnable interface, but choosing the wrong one costs latency and money. This guide covers a decision framework, cost math, and evaluation code for the MultiQuery, ParentDocument, SelfQuery, ContextualCompression, Ensemble, and custom retrievers.
Quick Reference
- Retriever = Runnable that accepts a string query and returns List[Document]
- vectorstore.as_retriever() wraps any VectorStore in the Runnable interface
- LangChain v1: retriever classes moved to langchain_classic; install langchain-classic to keep using them
- MultiQueryRetriever: +1 LLM call to generate N query variations, then N vector searches (N+1 with include_original=True)
- ContextualCompressionRetriever: +k LLM calls with an LLM-based compressor, where k = documents retrieved; expensive at scale
- EnsembleRetriever: zero extra LLM calls; BM25 and vector results merged with Reciprocal Rank Fusion
- ParentDocumentRetriever: zero extra LLM calls; searches child chunks, returns their parent documents
- Evaluate retrieval separately from generation: bad retrieval guarantees bad answers
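The per-query overhead figures in the list above can be turned into a rough back-of-the-envelope calculation. The sketch below is illustrative only: `Overhead`, the four helper functions, and `extra_llm_calls_per_day` are hypothetical names, not LangChain APIs. It assumes the compression retriever uses an LLM-based compressor called once per retrieved document, and that the ensemble combines one BM25 leg with one vector store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overhead:
    llm_calls: int        # extra LLM calls per user query
    vector_searches: int  # vector store searches per user query


def multi_query(n_variations: int, include_original: bool) -> Overhead:
    """One LLM call writes the variations; each variation is then searched."""
    searches = n_variations + (1 if include_original else 0)
    return Overhead(llm_calls=1, vector_searches=searches)


def contextual_compression(k_retrieved: int) -> Overhead:
    """Assumption: an LLM-based compressor runs once per retrieved document."""
    return Overhead(llm_calls=k_retrieved, vector_searches=1)


def ensemble() -> Overhead:
    """Assumption: one BM25 leg (keyword search, no vectors) plus one vector leg."""
    return Overhead(llm_calls=0, vector_searches=1)


def parent_document() -> Overhead:
    """Child chunks are searched once; parents come from a docstore lookup."""
    return Overhead(llm_calls=0, vector_searches=1)


def extra_llm_calls_per_day(overhead: Overhead, queries_per_day: int) -> int:
    """Scale the per-query LLM overhead to a daily volume."""
    return overhead.llm_calls * queries_per_day


# Compare the strategies at a hypothetical 10,000 queries per day.
strategies = {
    "MultiQuery (N=3, original included)": multi_query(3, include_original=True),
    "ContextualCompression (k=4)": contextual_compression(4),
    "Ensemble": ensemble(),
    "ParentDocument": parent_document(),
}
for name, overhead in strategies.items():
    print(name, overhead, extra_llm_calls_per_day(overhead, 10_000))
```

Multiplying the daily call count by your model's per-call price and latency gives a first estimate. The traffic volume and the values of N and k here are placeholders to replace with your own.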
Retriever vs. VectorStore
A VectorStore holds data and knows how to search it. A Retriever is a Runnable that wraps any search source — including a VectorStore — and exposes a single interface: give it a string, get back a List[Document]. That interface is what lets retrievers plug into LCEL chains with the | operator.
The Runnable interface gives retriever.invoke(), retriever.ainvoke(), retriever.batch(), and retriever.abatch() for free. You can swap the underlying retriever implementation without touching the chain.
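To make the swap concrete, here is a dependency-free sketch of the contract: a string goes in, a list of documents comes out, and the downstream step depends only on that shape. Everything in it is a stand-in written for illustration. `Document`, `Retriever`, `KeywordRetriever`, `FirstKRetriever`, and `build_context` are hypothetical and are not LangChain's classes; in LangChain the same role is played by the Runnable methods listed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Document:
    """Stand-in for a retrieved document: text plus optional metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class Retriever(Protocol):
    """The contract: give it a string, get back a list of documents."""

    def invoke(self, query: str) -> list[Document]: ...


class KeywordRetriever:
    """Toy source: returns documents sharing at least one word with the query."""

    def __init__(self, docs: list[Document]) -> None:
        self.docs = docs

    def invoke(self, query: str) -> list[Document]:
        terms = set(query.lower().split())
        return [d for d in self.docs if terms & set(d.page_content.lower().split())]


class FirstKRetriever:
    """Toy source with a different strategy: always returns the first k documents."""

    def __init__(self, docs: list[Document], k: int) -> None:
        self.docs = docs
        self.k = k

    def invoke(self, query: str) -> list[Document]:
        return self.docs[: self.k]


def build_context(retriever: Retriever, question: str) -> str:
    """Downstream step that relies only on the interface, not the implementation."""
    return "\n".join(d.page_content for d in retriever.invoke(question))


corpus = [
    Document("BM25 ranks by term frequency"),
    Document("Embeddings capture meaning"),
    Document("Rerankers reorder results"),
]

# Same downstream code, two different retrievers.
print(build_context(KeywordRetriever(corpus), "how does bm25 work"))
print(build_context(FirstKRetriever(corpus, k=2), "how does bm25 work"))
```

Because `build_context` only calls `invoke`, replacing one retriever with the other requires no change to it, which is the property the Runnable interface provides for real chains.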