
Graph RAG

Knowledge graphs enable relationship queries that vector search cannot answer — but full GraphRAG indexing costs $80–130 per 10K pages. This article covers when to use each variant (full GraphRAG, LazyGraphRAG, DIY KG), cost arithmetic, entity extraction and resolution, failure modes, and how to evaluate before shipping.

Quick Reference

  • Knowledge graphs store explicit relationships (entity → verb → entity) that embeddings cannot traverse — use when queries involve 'who manages X' or 'what depends on Y'
  • Full GraphRAG: 5-step pipeline (extract → build graph → detect communities → summarize → query). Indexing costs $80–130 per 10K pages with gpt-5.4-mini
  • LazyGraphRAG (June 2025): indexes at the same cost as vector RAG (~$0.13/10K pages) by deferring graph construction to query time, which makes indexing roughly 600–1000× cheaper than full GraphRAG ($80–130/10K pages)
  • Entity extraction quality drives everything — a noisy extraction produces a noisy graph that returns wrong answers
  • Entity resolution (merging 'Alice', 'ALICE Corp', 'Alice from Eng') must happen before graph loading — duplicates create disconnected nodes
  • Use the router pattern: classify query type first, then route to graph traversal, vector search, or both
  • Always benchmark against vector-only RAG before shipping — Graph RAG doesn't always win, even on relationship queries
  • graphrag v3 API: `import graphrag.api as api` → `api.global_search()`, `api.local_search()`, `api.drift_search()`
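
The router pattern from the list above can be sketched in a few lines. This is a minimal illustration, not a production classifier: the cue patterns, the `Route` enum, and `route_query` are hypothetical names invented for this example, and a real router would more likely use a trained classifier or an LLM call than keyword matching.

```python
import re
from enum import Enum


class Route(Enum):
    GRAPH = "graph"
    VECTOR = "vector"
    BOTH = "both"


# Hypothetical cue phrases (assumption): tune these against your own query logs.
RELATIONSHIP_CUES = re.compile(
    r"\b(who (manages|owns|reports to)|depends? on|reports to|"
    r"affected if|responsible for)\b",
    re.IGNORECASE,
)
SEMANTIC_CUES = re.compile(
    r"\b(how does|how do|explain|find docs|what is|what changed)\b",
    re.IGNORECASE,
)


def route_query(query: str) -> Route:
    """Classify a query, then pick graph traversal, vector search, or both."""
    has_relationship = bool(RELATIONSHIP_CUES.search(query))
    has_semantic = bool(SEMANTIC_CUES.search(query))
    if has_relationship and has_semantic:
        return Route.BOTH
    if has_relationship:
        return Route.GRAPH
    # Vector search is the default, including when no cue matches at all.
    return Route.VECTOR


if __name__ == "__main__":
    for q in (
        "What services depend on PostgreSQL?",
        "How does the Auth service work?",
        "Explain what depends on PostgreSQL",
    ):
        print(f"{q!r} -> {route_query(q).value}")
```

Falling back to vector search when nothing matches mirrors the article's advice that vector-only is the right default.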

Should You Use Graph RAG?

Vector stores find documents semantically similar to a query. Knowledge graphs store explicit relationships — 'Alice manages the Platform team', 'Platform team owns the Auth service', 'Auth service depends on PostgreSQL'. These structured relationships enable queries that vector search fundamentally cannot answer: 'What services depend on PostgreSQL?' or 'Who is responsible for the Auth service?'. The tradeoff is steep: building a knowledge graph costs 100–1000× more to index than embedding documents.
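
To make the difference concrete, here is a small sketch that stores the three example facts as (subject, relation, object) triples and answers a one-hop and a multi-hop question by walking edges. It uses plain dictionaries rather than a graph database; the relation names (`manages`, `owns`, `depends_on`) and helper functions are illustrative assumptions, not part of any GraphRAG library.

```python
from collections import defaultdict, deque

# The three example facts from the text, as (subject, relation, object) triples.
TRIPLES = [
    ("Alice", "manages", "Platform team"),
    ("Platform team", "owns", "Auth service"),
    ("Auth service", "depends_on", "PostgreSQL"),
]


def build_incoming(triples):
    """Index edges by their object so we can walk 'backwards' from a node."""
    incoming = defaultdict(list)
    for subject, relation, obj in triples:
        incoming[obj].append((relation, subject))
    return incoming


def dependents_of(incoming, target):
    """One hop: what depends directly on `target`?"""
    return [s for rel, s in incoming.get(target, []) if rel == "depends_on"]


def upstream(incoming, start):
    """Multi-hop: everything that reaches `start`, in breadth-first order."""
    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _relation, source in incoming.get(node, []):
            if source not in seen:
                seen.add(source)
                order.append(source)
                queue.append(source)
    return order


if __name__ == "__main__":
    incoming = build_incoming(TRIPLES)
    print(dependents_of(incoming, "PostgreSQL"))  # services that depend on it
    print(upstream(incoming, "PostgreSQL"))       # everyone affected upstream
```

An embedding index has no equivalent of `upstream`: it can retrieve chunks that mention PostgreSQL, but it cannot follow the chain from the database to the service, the team, and the manager.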

| Query type | Example | Best tool |
|---|---|---|
| Relationship traversal | 'Who reports to X?' / 'What depends on Y?' | Graph RAG |
| Multi-hop path | 'What teams are affected if Service Y goes down?' | Graph RAG |
| Global summarization | 'What are the main themes across all docs?' | Full GraphRAG |
| Semantic lookup | 'How does X work?' / 'Find docs about X' | Vector-only |
| Factoid retrieval | 'What is the CEO's name?' | Vector-only |
| Content comparison | 'What changed in policy V3?' | Vector-only |

The right default is vector-only

If fewer than 20–25% of your queries involve named relationships between entities, vector search with good chunking will outperform Graph RAG on cost, latency, and maintenance. Measure your query distribution before committing to the Graph RAG pipeline. Most teams that try Graph RAG discover their queries are primarily semantic lookups and revert to vector search.
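
One way to measure the distribution is to hand-label a sample of real queries and compute the share that are relationship queries. The sketch below assumes such a labeled sample already exists; the label names, the function names, and the choice of 0.20 (the lower end of the 20–25% range above) as the cutoff are illustrative assumptions.

```python
from collections import Counter


def relationship_share(labels):
    """Fraction of sampled queries labeled as relationship queries."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one labeled query")
    return counts["relationship"] / total


def graph_rag_worth_evaluating(labels, threshold=0.20):
    """True when the relationship share reaches the (assumed) cutoff."""
    return relationship_share(labels) >= threshold


if __name__ == "__main__":
    # Hypothetical sample: 17 semantic lookups, 3 relationship queries.
    sample = ["semantic"] * 17 + ["relationship"] * 3
    print(f"relationship share: {relationship_share(sample):.0%}")
    print("evaluate Graph RAG:", graph_rag_worth_evaluating(sample))
```

A sample this small only gives a rough signal; the point is to base the decision on logged queries rather than on intuition.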

Decision flow:

  • Relationship queries needed? No → Vector-only RAG (~$0.13 / 10K pages)
  • Relationship queries needed? Yes → Need global summaries?
      • No → LazyGraphRAG (same cost as vector)
      • Yes → Full GraphRAG ($80–130 / 10K pages)

Most teams stop at "No" — vector search handles 90%+ of RAG use cases
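
The decision flow and its cost arithmetic can be written as a short function. This is a sketch under two assumptions: the per-10K-page figures are the ones quoted in this article, and indexing cost scales linearly with page count, which real pricing may not do exactly. `Recommendation` and `recommend` are hypothetical names.

```python
from dataclasses import dataclass

# Indexing cost per 10K pages in USD, as quoted in this article.
VECTOR_COST_PER_10K = (0.13, 0.13)
FULL_GRAPHRAG_COST_PER_10K = (80.0, 130.0)


@dataclass(frozen=True)
class Recommendation:
    variant: str
    low_usd: float
    high_usd: float


def recommend(pages, needs_relationships, needs_global_summaries):
    """Walk the two-question decision flow and estimate indexing cost."""
    scale = pages / 10_000  # assumption: cost grows linearly with pages
    if not needs_relationships:
        variant, (low, high) = "vector-only RAG", VECTOR_COST_PER_10K
    elif not needs_global_summaries:
        # LazyGraphRAG indexes at vector-RAG cost per the article.
        variant, (low, high) = "LazyGraphRAG", VECTOR_COST_PER_10K
    else:
        variant, (low, high) = "full GraphRAG", FULL_GRAPHRAG_COST_PER_10K
    return Recommendation(variant, round(low * scale, 2), round(high * scale, 2))


if __name__ == "__main__":
    for flags in ((False, False), (True, False), (True, True)):
        print(recommend(50_000, *flags))
```

For a 50K-page corpus the same arithmetic gives well under a dollar for the first two branches and several hundred dollars for full GraphRAG, which is why the order of the two questions matters.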

| Capability | Vector Store | Knowledge Graph |
|---|---|---|
| Semantic similarity | Excellent | Not applicable |
| Relationship traversal | Cannot do | Excellent |
| Multi-hop reasoning | Difficult | Natural (graph traversal) |
| Global summarization | Poor (limited to retrieved chunks) | Good (community-level summaries) |
| Indexing cost (10K pages) | ~$0.13 | $80–130 |
| Query latency | 200–500 ms | 500 ms–3 s |
| Maintenance | Re-embed changed docs | Re-index graph on changes |