Graph RAG

Knowledge graphs enable relationship queries that vector search cannot answer — but full GraphRAG indexing costs $80–130 per 10K pages. This article covers when to use each variant (full GraphRAG, LazyGraphRAG, DIY KG), cost arithmetic, entity extraction and resolution, failure modes, and how to evaluate before shipping.

Quick Reference

→Knowledge graphs store explicit relationships (entity → verb → entity) that embeddings cannot traverse — use when queries involve 'who manages X' or 'what depends on Y'
→Full GraphRAG: 5-step pipeline (extract → build graph → detect communities → summarize → query). Indexing costs $80–130 per 10K pages with gpt-5.4-mini
→LazyGraphRAG (June 2025): same indexing cost as vector RAG (~$0.13/10K pages); defers graph construction to query time, making it roughly 600–1000× cheaper to index than full GraphRAG ($80–130 vs. ~$0.13 per 10K pages)
→Entity extraction quality drives everything — a noisy extraction produces a noisy graph that returns wrong answers
→Entity resolution (merging 'Alice', 'ALICE Corp', 'Alice from Eng') must happen before graph loading — duplicates create disconnected nodes
→Use the router pattern: classify query type first, then route to graph traversal, vector search, or both
→Always benchmark against vector-only RAG before shipping — Graph RAG doesn't always win, even on relationship queries
→graphrag v3 API: `import graphrag.api as api` → `api.global_search()`, `api.local_search()`, `api.drift_search()`
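
As a minimal sketch of the entity-resolution point above, the following rewrites extracted names to canonical IDs before any triples are loaded into a graph. The alias table, the ID scheme (`person:…`, `team:…`), and the helper names are hypothetical and exist only for illustration; a production resolver would likely add fuzzy or embedding-based matching and a review step for ambiguous names.

```python
import re

# Hypothetical alias table: normalized surface form -> canonical entity ID.
ALIASES = {
    "alice": "person:alice",
    "alice from eng": "person:alice",
    "platform team": "team:platform",
    "the platform team": "team:platform",
}


def normalize(name: str) -> str:
    """Lowercase, trim, and collapse whitespace so trivial variants match."""
    return re.sub(r"\s+", " ", name.strip().lower())


def resolve(name: str) -> str:
    """Return a canonical ID, or a provisional 'unresolved:' ID for unknown names."""
    key = normalize(name)
    return ALIASES.get(key, f"unresolved:{key}")


def resolve_triples(triples):
    """Rewrite subject and object to canonical IDs and drop exact duplicates."""
    seen = set()
    resolved = []
    for subj, rel, obj in triples:
        triple = (resolve(subj), rel, resolve(obj))
        if triple not in seen:
            seen.add(triple)
            resolved.append(triple)
    return resolved


# Two extractions of the same fact with different surface forms.
raw = [
    ("Alice", "manages", "Platform team"),
    ("Alice from Eng", "manages", "the Platform  team"),
]
resolved = resolve_triples(raw)
```

Here both raw triples collapse into a single edge between `person:alice` and `team:platform`. Names missing from the alias table, such as 'ALICE Corp', come back with an `unresolved:` prefix instead of being merged by guesswork, so they can be reviewed before loading.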

Should You Use Graph RAG?

Vector stores find documents semantically similar to a query. Knowledge graphs store explicit relationships — 'Alice manages the Platform team', 'Platform team owns the Auth service', 'Auth service depends on PostgreSQL'. These structured relationships enable queries that vector search fundamentally cannot answer: 'What services depend on PostgreSQL?' or 'Who is responsible for the Auth service?'. The tradeoff is steep: building a knowledge graph costs 100–1000× more to index than embedding documents.
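
To make the difference concrete, here is a small sketch that stores the relationships above as (subject, relation, object) triples and answers both example questions by traversal. It uses plain Python rather than a graph database. The `Billing service` edge is a hypothetical addition to show a multi-hop result, and the relation names (`manages`, `owns`, `depends_on`) are assumed labels.

```python
from collections import defaultdict, deque

# Triples from the examples in the text, plus one hypothetical edge
# (Billing service) added so the traversal has more than one hop.
TRIPLES = [
    ("Alice", "manages", "Platform team"),
    ("Platform team", "owns", "Auth service"),
    ("Auth service", "depends_on", "PostgreSQL"),
    ("Billing service", "depends_on", "Auth service"),  # hypothetical
]


def build_reverse_index(triples, relation):
    """Map each object to the set of subjects that point at it via `relation`."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        if rel == relation:
            index[obj].add(subj)
    return index


def dependents(triples, target):
    """All entities that depend on `target`, directly or transitively (BFS)."""
    reverse = build_reverse_index(triples, "depends_on")
    found, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in reverse.get(node, ()):
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return found


def responsible_for(triples, service):
    """Follow owns -> manages edges to find who is responsible for a service."""
    teams = {s for s, r, o in triples if r == "owns" and o == service}
    return [s for s, r, o in triples if r == "manages" and o in teams]


affected = dependents(TRIPLES, "PostgreSQL")
owners = responsible_for(TRIPLES, "Auth service")
```

With these triples, `affected` contains both `Auth service` and `Billing service`, and `owners` is `["Alice"]`. An embedding index has no equivalent operation: it can retrieve chunks that mention PostgreSQL, but it cannot follow the dependency edges.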

Query type	Example	Best tool
Relationship traversal	'Who reports to X?' / 'What depends on Y?'	Graph RAG
Multi-hop path	'What teams are affected if Service Y goes down?'	Graph RAG
Global summarization	'What are the main themes across all docs?'	Full GraphRAG
Semantic lookup	'How does X work?' / 'Find docs about X'	Vector-only
Factoid retrieval	'What is the CEO's name?'	Vector-only
Content comparison	'What changed in policy V3?'	Vector-only
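
The table above maps naturally onto a router. The sketch below is a keyword-based stand-in, assuming regex cues are good enough for a first pass; the cue patterns and the `Route` names are hypothetical, and a real router might instead use an LLM or a trained classifier tuned on a labeled sample of queries.

```python
import re
from enum import Enum


class Route(Enum):
    GRAPH = "graph"
    VECTOR = "vector"
    BOTH = "both"


# Hypothetical cue patterns; tune them against real, labeled queries.
RELATIONSHIP_CUES = re.compile(
    r"\b(who (manages|owns|reports to)|reports to|depends? on|affected if|responsible for)\b",
    re.IGNORECASE,
)
GLOBAL_CUES = re.compile(r"\b(main themes|across all|overall summary)\b", re.IGNORECASE)
SEMANTIC_CUES = re.compile(
    r"\b(how does|find docs|what is|what changed|explain)\b", re.IGNORECASE
)


def route(query: str) -> Route:
    """Pick a retrieval path from surface cues in the query text."""
    needs_graph = bool(RELATIONSHIP_CUES.search(query) or GLOBAL_CUES.search(query))
    needs_vector = bool(SEMANTIC_CUES.search(query))
    if needs_graph and needs_vector:
        return Route.BOTH
    if needs_graph:
        return Route.GRAPH
    # Default to vector search when no graph cue is present.
    return Route.VECTOR


examples = [
    "Who reports to X?",
    "What teams are affected if Service Y goes down?",
    "How does X work?",
    "How does the Auth service work and what depends on it?",
]
routes = [route(q) for q in examples]
```

Falling through to `Route.VECTOR` when nothing matches is a deliberate design choice here: a query sent to vector search by mistake still returns something useful, and vector search is the cheaper path.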

The right default is vector-only

If fewer than about 20–25% of your queries involve named relationships between entities, vector search with good chunking will outperform Graph RAG on cost, latency, and maintenance. Measure your query distribution before committing to the Graph RAG pipeline. Most teams that try Graph RAG discover their queries are primarily semantic lookups and revert.

Most teams stop at "No" — vector search handles 90%+ of RAG use cases

Capability	Vector Store	Knowledge Graph
Semantic similarity	Excellent	Not applicable
Relationship traversal	Cannot do	Excellent
Multi-hop reasoning	Difficult	Natural (graph traversal)
Global summarization	Poor (limited to retrieved chunks)	Good (community-level summaries)
Indexing cost (10K pages)	~$0.13	$80–130
Query latency	200–500ms	500ms–3s
Maintenance	Re-embed changed docs	Re-index graph on changes

How Graph RAG Works: Three Variants

Three approaches exist for graph-augmented retrieval (full GraphRAG, LazyGraphRAG, and a DIY knowledge graph), each at a different cost and complexity point. The right choice depends on your query types, corpus size, and indexing budget.

What Graph RAG Actually Costs

The indexing cost of full GraphRAG comes from the number of LLM calls made during entity extraction and community summarization. The arithmetic for 10K pages assumes gpt-5.4-mini pricing as of April 2026: $1.10/M input tokens and $4.40/M output tokens.
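
A minimal sketch of how such an estimate can be structured from the quoted prices. The per-page token counts are hypothetical placeholders, not the inputs behind the $80–130 figure, and the stage breakdown and function names are assumptions; substitute token counts measured in a pilot run on your own corpus.

```python
from dataclasses import dataclass

# Prices quoted in the text (USD per million tokens).
INPUT_PRICE_PER_M = 1.10
OUTPUT_PRICE_PER_M = 4.40


@dataclass
class Stage:
    """One LLM-driven indexing stage with average token usage per page."""
    name: str
    input_tokens_per_page: float
    output_tokens_per_page: float


def stage_cost(stage: Stage, pages: int) -> float:
    """USD cost of running one stage over `pages` pages."""
    input_tokens = stage.input_tokens_per_page * pages
    output_tokens = stage.output_tokens_per_page * pages
    return (
        input_tokens / 1e6 * INPUT_PRICE_PER_M
        + output_tokens / 1e6 * OUTPUT_PRICE_PER_M
    )


def indexing_cost(stages, pages: int) -> float:
    """Total USD cost across all stages."""
    return sum(stage_cost(s, pages) for s in stages)


# Placeholder token counts, for illustration only (not measured values).
stages = [
    Stage("entity_extraction", input_tokens_per_page=1_000, output_tokens_per_page=500),
    Stage("community_summarization", input_tokens_per_page=500, output_tokens_per_page=250),
]
total = indexing_cost(stages, pages=10_000)
```

Output tokens cost four times as much as input tokens at these prices, so stages that generate long extractions or summaries tend to dominate the total. Any repeated extraction passes over the same text would multiply the per-page counts accordingly.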
