
Embedding Models Compared

Comparing OpenAI, Cohere, and open-source embedding models for RAG. Dimensions, pricing, MTEB benchmarks, and Matryoshka embeddings for cost optimization.

Quick Reference

  • text-embedding-3-small: best cost/quality ratio for most RAG use cases ($0.02/1M tokens)
  • text-embedding-3-large: highest quality from OpenAI, supports dimension reduction ($0.13/1M tokens)
  • Cohere embed-v3: strongest multilingual support with 100+ languages
  • Open-source BGE/E5/GTE: self-hosted, no API costs, competitive quality on MTEB
  • Matryoshka embeddings let you truncate dimensions (3072 → 256) to save 90%+ storage with ~5% quality loss
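To make the storage claim in that last bullet concrete, here is a small back-of-the-envelope sketch (pure Python; assumes float32 vectors at 4 bytes per dimension and ignores index overhead):

```python
# Storage math for Matryoshka truncation (illustrative; float32 = 4 bytes/dim).
BYTES_PER_FLOAT32 = 4

def storage_bytes(num_vectors: int, dims: int) -> int:
    """Raw vector storage, ignoring index/metadata overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32

full = storage_bytes(1_000_000, 3072)   # text-embedding-3-large, full size
small = storage_bytes(1_000_000, 256)   # truncated to 256 dims

print(f"full:      {full / 1e9:.2f} GB")     # 12.29 GB
print(f"truncated: {small / 1e9:.2f} GB")    # 1.02 GB
print(f"savings:   {1 - small / full:.1%}")  # 91.7%
```

For a million chunks, truncating 3072 → 256 drops raw vector storage from ~12.3 GB to ~1 GB, in line with the "90%+ savings" figure above.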

OpenAI Embedding Models

OpenAI's text-embedding-3 family is the most widely used in production RAG systems. The 'small' variant offers an excellent cost-to-quality ratio and is sufficient for the majority of use cases. The 'large' variant scores higher on benchmarks but costs 6.5x more. Both support Matryoshka dimension reduction, letting you trade a small amount of quality for significant storage savings.

Model                              Dimensions   MTEB Score   Price/1M tokens   Max Tokens
text-embedding-3-small             1536         62.3         $0.02             8191
text-embedding-3-large             3072         64.6         $0.13             8191
text-embedding-ada-002 (legacy)    1536         61.0         $0.10             8191
Using OpenAI embeddings with dimension reduction
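A minimal sketch, assuming the official `openai` Python client (v1.x) and an `OPENAI_API_KEY` in the environment. The `dimensions` parameter on the text-embedding-3 models returns truncated, re-normalized vectors; the `truncate_embedding` helper below shows the equivalent operation done locally:

```python
import math

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Local equivalent of server-side reduction: keep the first `dims`
    values, then re-normalize to unit length so cosine similarity
    remains meaningful."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def embed(texts: list[str], dims: int = 256) -> list[list[float]]:
    """Request reduced-dimension embeddings directly from the API."""
    from openai import OpenAI  # deferred import; helper above works offline
    client = OpenAI()
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
        dimensions=dims,  # server-side Matryoshka truncation
    )
    return [d.embedding for d in resp.data]
```

Requesting reduced dimensions server-side is usually preferable: you pay the same per token, but store and transfer far smaller vectors from the start.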
Start with text-embedding-3-small

For 95% of RAG use cases, text-embedding-3-small at $0.02/1M tokens is the right choice. Embedding a 10,000-page knowledge base costs roughly $2-5. Only upgrade to 'large' if you've measured a meaningful retrieval quality difference on your specific dataset.