Intermediate · 15 min

Vector Database Selection

How to choose a vector database for production RAG in 2026. Six databases compared honestly — quantization changes the cost math, migration is not a one-line change, and most teams will outgrow their first choice at a predictable threshold.

Quick Reference

  • pgvector: runs in your existing Postgres, handles 5-10M vectors with HNSW — beyond that, query latency climbs past 50ms p99
  • Pinecone: zero-ops serverless with BYOC in public preview; enforces 100 req/s per namespace; cold starts 2-10s after idle
  • Qdrant v1.17+: $50M Series B (Mar 2026), Gridstore engine, quantization up to 64x memory reduction — biggest cost lever in 2026
  • Weaviate v1.37: strongest native hybrid search (BM25 + vector) of any open-source vector DB; secure MCP server built in
  • Milvus v2.6: RaBitQ 1-bit quantization at 1/32 original size with 95% recall — most aggressive compression available
  • Chroma Cloud is GA with serverless search — no longer only a dev tool, but scale ceiling above 5M vectors is unproven
  • Migration means re-embedding if dimensions differ, full reindex regardless, and 1-4 weeks of engineering time
  • For most teams in 2026: start with pgvector, enable quantization when memory is tight, migrate to Qdrant or Milvus above 10M vectors
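To see why quantization is described as a cost lever, it helps to work through the raw storage arithmetic. The sketch below is a back-of-the-envelope estimate only: the corpus size (10M vectors) and the embedding dimension (1536) are assumptions chosen for illustration, and the function ignores index overhead such as HNSW graph links and payload metadata.

```python
def vector_storage_gib(num_vectors: int, dims: int, bits_per_dim: float = 32.0) -> float:
    """Raw vector storage in GiB, excluding index structures and metadata."""
    total_bytes = num_vectors * dims * bits_per_dim / 8
    return total_bytes / 2**30


# Assumed workload, for illustration only.
NUM_VECTORS = 10_000_000
DIMS = 1536

float32_gib = vector_storage_gib(NUM_VECTORS, DIMS, bits_per_dim=32)  # unquantized
int8_gib = vector_storage_gib(NUM_VECTORS, DIMS, bits_per_dim=8)      # 4x smaller
one_bit_gib = vector_storage_gib(NUM_VECTORS, DIMS, bits_per_dim=1)   # 32x smaller

for label, gib in [("float32", float32_gib), ("int8", int8_gib), ("1-bit", one_bit_gib)]:
    print(f"{label:>8}: {gib:6.2f} GiB")
```

Under these assumptions the unquantized vectors need roughly 57 GiB, while 1 bit per dimension brings that to under 2 GiB, which matches the 1/32 figure quoted for 1-bit quantization. Real memory use will be higher once the index itself is counted, and recall should be re-measured after enabling any quantization mode.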

Do You Need a Dedicated Vector Database?

The first production decision is whether you need a vector database at all. Many teams jump to Pinecone or Weaviate before asking whether their current stack can handle the load. The answer depends on three variables: corpus size, query volume, and whether you need hybrid search (vector + keyword).

Decision flow:

  • Need semantic search?
      • No: Postgres full-text (tsvector + GIN index)
      • Yes: Corpus > 500K vectors?
          • No: FAISS / Chroma (no infra needed)
          • Yes: Already using Postgres?
              • Yes: pgvector (no new infra)
              • No: Qdrant / Pinecone (managed or self-hosted)

Start with need — not store names
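The decision flow above can be sketched as a small function. This is a minimal illustration, not a complete selection procedure: the function name and parameters are hypothetical, the 500K threshold is taken from the flow itself, and the sketch leaves out query volume and hybrid-search needs, which also matter.

```python
def recommend_store(needs_semantic_search: bool, num_vectors: int, uses_postgres: bool) -> str:
    """Walk the three questions in order and return the suggested starting point."""
    if not needs_semantic_search:
        return "Postgres full-text (tsvector + GIN index)"
    if num_vectors <= 500_000:
        # Small corpus: an in-process library is enough, no new infrastructure.
        return "FAISS / Chroma"
    if uses_postgres:
        # Larger corpus, Postgres already in the stack: no new infrastructure.
        return "pgvector"
    return "Qdrant / Pinecone"


print(recommend_store(needs_semantic_search=True, num_vectors=2_000_000, uses_postgres=True))
```

The order of the checks matters: the need for semantic search is tested first, so a team that only needs keyword matching never reaches the question of which vector store to run.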

The pgvector default

If your application already runs on Postgres, pgvector is the right default. It handles up to 5-10M vectors with HNSW indexing, supports SQL WHERE filters, and costs nothing additional. The only reasons to leave pgvector are: (1) you exceed 10M vectors and query latency climbs above your SLA, (2) you need hybrid search without building it yourself, or (3) you need multi-tenant isolation across thousands of tenants.
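As a rough sketch of what the pgvector default looks like in practice, the block below holds the SQL for an HNSW index and a filtered nearest-neighbour query, plus a helper that prepares query parameters. The table name `documents`, its columns, the `tenant_id` filter, and the 1536-dimension embedding are all assumptions for illustration. The named placeholders assume a psycopg-style driver, and nothing here connects to a database.

```python
DIMS = 1536  # assumed embedding dimension

# Schema and HNSW index using cosine distance (hypothetical table and column names).
SETUP_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    tenant_id integer NOT NULL,
    body text NOT NULL,
    embedding vector({DIMS})
);
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# Nearest neighbours by cosine distance (<=>), narrowed with an ordinary WHERE filter.
SEARCH_SQL = """
SELECT id, body, embedding <=> %(query)s::vector AS cosine_distance
FROM documents
WHERE tenant_id = %(tenant_id)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values) -> str:
    """Format a sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def search_params(query_embedding, tenant_id: int, k: int = 5) -> dict:
    """Build the parameter dict for SEARCH_SQL, checking the dimension first."""
    if len(query_embedding) != DIMS:
        raise ValueError(f"expected {DIMS} dimensions, got {len(query_embedding)}")
    return {"query": to_vector_literal(query_embedding), "tenant_id": tenant_id, "k": k}
```

With a driver connection in hand, the call would look something like `cursor.execute(SEARCH_SQL, search_params(embedding, tenant_id=42))`. The point of the sketch is that the vector search and the SQL filter live in one statement, which is what "supports SQL WHERE filters" buys you.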

When keyword search is enough

If your corpus is English-language structured documents and your users search with domain-specific terms (product names, error codes, legal citations), Postgres full-text search with a GIN index often outperforms vector search. Vector search shines when queries are semantic ('how do I cancel my subscription') rather than lexical ('cancel subscription API endpoint'). Run both and measure recall on your golden set before committing to either.
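A minimal sketch of that comparison follows, assuming a golden set of (query, relevant document IDs) pairs. The two retrievers here are stand-in dictionaries with made-up document IDs; in a real evaluation they would call Postgres full-text search and your vector index.

```python
def recall_at_k(retrieved_ids, relevant_ids, k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("each golden-set query needs at least one relevant document")
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def mean_recall(search_fn, golden_set, k: int = 10) -> float:
    """Average recall@k of one retriever over every query in the golden set."""
    scores = [recall_at_k(search_fn(query), relevant, k) for query, relevant in golden_set]
    return sum(scores) / len(scores)


# Hypothetical golden set: one lexical query and one semantic query.
golden_set = [
    ("cancel subscription API endpoint", {"d1"}),
    ("how do I cancel my subscription", {"d2", "d3"}),
]

# Stand-in ranked results for each retriever (illustrative data only).
keyword_results = {
    "cancel subscription API endpoint": ["d1", "d9"],
    "how do I cancel my subscription": ["d9", "d2"],
}
vector_results = {
    "cancel subscription API endpoint": ["d4", "d1"],
    "how do I cancel my subscription": ["d2", "d3"],
}

keyword_recall = mean_recall(lambda q: keyword_results[q], golden_set, k=2)
vector_recall = mean_recall(lambda q: vector_results[q], golden_set, k=2)
print(f"keyword recall@2: {keyword_recall:.2f}, vector recall@2: {vector_recall:.2f}")
```

Two queries prove nothing on their own. The harness is only useful once the golden set reflects the real mix of lexical and semantic queries your users send.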