Embedding Models Compared
Comparing OpenAI, Cohere, and open-source embedding models for RAG. Dimensions, pricing, MTEB benchmarks, and Matryoshka embeddings for cost optimization.
Quick Reference
- text-embedding-3-small: best cost/quality ratio for most RAG use cases ($0.02/1M tokens)
- text-embedding-3-large: highest quality from OpenAI, supports dimension reduction ($0.13/1M tokens)
- Cohere embed-v3: strongest multilingual support with 100+ languages
- Open-source BGE/E5/GTE: self-hosted, no API costs, competitive quality on MTEB
- Matryoshka embeddings let you truncate dimensions (3072 → 256) to save 90%+ storage with ~5% quality loss
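The Matryoshka truncation mentioned above is simple in practice: keep the leading dimensions and re-normalize so cosine similarity still works. A minimal sketch (the helper name `truncate_embedding` and the random stand-in vector are illustrative, not from any SDK):

```python
import numpy as np

def truncate_embedding(vec, target_dim=256):
    """Truncate a Matryoshka embedding and re-normalize to unit length.

    Matryoshka-trained models concentrate the most important information
    in the leading dimensions, so the prefix remains a usable embedding.
    """
    truncated = np.asarray(vec, dtype=np.float32)[:target_dim]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# Toy 3072-dim vector standing in for a text-embedding-3-large output.
full = np.random.default_rng(0).standard_normal(3072)
small = truncate_embedding(full, 256)
print(small.shape)  # (256,)
```

Re-normalizing after truncation matters: downstream similarity search typically assumes unit-length vectors.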
OpenAI Embedding Models
OpenAI's text-embedding-3 family is the most widely used in production RAG systems. The 'small' variant offers an excellent cost-to-quality ratio and is sufficient for the majority of use cases. The 'large' variant scores higher on benchmarks but costs 6.5x more. Both support Matryoshka dimension reduction, letting you trade a small amount of quality for significant storage savings.
| Model | Dimensions | MTEB Score | Price/1M tokens | Max Tokens |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | 62.3 | $0.02 | 8191 |
| text-embedding-3-large | 3072 | 64.6 | $0.13 | 8191 |
| text-embedding-ada-002 (legacy) | 1536 | 61.0 | $0.10 | 8191 |
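The dimension counts in the table translate directly into index size. A back-of-envelope calculation (assuming float32 storage, ignoring index overhead) shows why truncating 3072 → 256 dimensions saves over 90% of storage:

```python
# Rough storage footprint for 1M float32 vectors at each dimension count.
BYTES_PER_FLOAT32 = 4

def storage_gb(num_vectors, dims):
    """Raw vector storage in gigabytes, excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9

for dims in (3072, 1536, 256):
    print(f"{dims} dims: {storage_gb(1_000_000, dims):.2f} GB")
# 3072 dims: 12.29 GB
# 1536 dims: 6.14 GB
# 256 dims:  1.02 GB
```

Real vector databases add metadata and index structures on top, so treat these as lower bounds.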
For 95% of RAG use cases, text-embedding-3-small at $0.02/1M tokens is the right choice. Embedding a 10,000-page knowledge base (roughly 5-10M tokens, depending on chunk overlap) costs on the order of $0.10-0.20. Only upgrade to 'large' if you've measured a meaningful retrieval quality difference on your specific dataset.
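The cost arithmetic is worth making explicit. A small estimator (the ~500 tokens-per-page figure is an assumption, roughly a full page of prose; adjust it for your corpus and chunking overlap):

```python
def embedding_cost_usd(num_pages, tokens_per_page=500, price_per_million=0.02):
    """Estimate one-time embedding cost for a document corpus.

    tokens_per_page is an assumed average (~500 for dense prose);
    price_per_million defaults to text-embedding-3-small pricing.
    """
    total_tokens = num_pages * tokens_per_page
    return total_tokens / 1_000_000 * price_per_million

# 10,000 pages at ~500 tokens/page with text-embedding-3-small:
print(f"${embedding_cost_usd(10_000):.2f}")  # $0.10
```

Even with 2x token inflation from overlapping chunks, the total stays well under a dollar, which is why embedding cost is rarely the deciding factor between 'small' and 'large'.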