Graph RAG
Knowledge graphs for RAG: structured relationships vs semantic similarity, Microsoft's Graph RAG approach, building knowledge graphs from documents, and combining graph traversal with vector search.
Quick Reference
- Knowledge graphs store explicit relationships (entity → relationship → entity) that embeddings miss
- Graph RAG excels at: multi-hop reasoning, relationship queries, and global summarization
- Microsoft's Graph RAG: community detection on the KG, then hierarchical summarization for global questions
- Entity extraction + relationship extraction = automated KG construction from unstructured text
- Hybrid graph+vector retrieval combines structured traversal with semantic similarity
Knowledge Graphs vs Vector Stores
Vector stores find documents that are semantically similar to a query. Knowledge graphs store explicit relationships between entities, such as 'Alice manages the Platform team', 'Platform team owns the Auth service', and 'Auth service depends on PostgreSQL'. These structured relationships enable queries that similarity search alone cannot answer reliably, like 'What services depend on PostgreSQL?' or 'Who is responsible for the Auth service?', because the answer comes from following explicit edges rather than from finding similar text. The tradeoff: building a knowledge graph is significantly more complex than embedding documents, since entities and relationships must be extracted and then kept up to date.
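To make the contrast concrete, here is a minimal sketch that stores the three example relationships as (subject, predicate, object) triples and answers the two example questions by exact lookup. The predicate names (`manages`, `owns`, `depends_on`) and the helper functions are illustrative assumptions, not the API of any particular graph database.

```python
from collections import defaultdict

# Triples taken from the examples in the text; predicate names are assumed.
TRIPLES = [
    ("Alice", "manages", "Platform team"),
    ("Platform team", "owns", "Auth service"),
    ("Auth service", "depends_on", "PostgreSQL"),
]

def build_incoming_index(triples):
    """Index subjects by (object, predicate) so edges can be followed backwards."""
    incoming = defaultdict(list)
    for subject, predicate, obj in triples:
        incoming[(obj, predicate)].append(subject)
    return incoming

INCOMING = build_incoming_index(TRIPLES)

def dependents_of(target):
    """'What services depend on <target>?' is a one-hop lookup."""
    return list(INCOMING.get((target, "depends_on"), []))

def responsible_for(service):
    """'Who is responsible for <service>?' follows owns, then manages (two hops)."""
    people = []
    for team in INCOMING.get((service, "owns"), []):
        people.extend(INCOMING.get((team, "manages"), []))
    return people

print(dependents_of("PostgreSQL"))      # ['Auth service']
print(responsible_for("Auth service"))  # ['Alice']
```

Neither answer depends on how similar any text is to the question; each comes from following stored edges. A production system would typically use a graph database and a query language instead of in-memory dictionaries.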
| Capability | Vector Store | Knowledge Graph |
|---|---|---|
| Semantic similarity | Excellent | Not applicable |
| Relationship traversal | Cannot do | Excellent |
| Multi-hop reasoning | Difficult | Natural (graph traversal) |
| Global summarization | Poor (limited to retrieved chunks) | Good (community-level summaries) |
| Construction cost | Low (embed documents) | High (extract entities + relationships) |
| Maintenance | Re-embed changed docs | Update entities and relationships |
| Query types | "Find docs about X" | "What is connected to X?" |
Use a knowledge graph when your questions involve relationships ('Who reports to X?'), multi-hop paths ('What teams are affected if Service Y goes down?'), or global summaries ('What are the main themes across all documents?'). If your questions are primarily 'Find information about X', vector search is sufficient and far simpler.
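The multi-hop case ('What teams are affected if Service Y goes down?') can be sketched as a breadth-first traversal over reversed dependency edges, followed by an ownership lookup. The services, teams, and edges below are hypothetical sample data invented for illustration, and the function names are assumptions rather than part of any Graph RAG library.

```python
from collections import deque

# Hypothetical sample data: (service, service_it_depends_on)
DEPENDS_ON = [
    ("Auth service", "PostgreSQL"),
    ("Billing service", "Auth service"),
    ("Dashboard", "Billing service"),
]

# Hypothetical ownership mapping: service -> owning team
OWNED_BY = {
    "Auth service": "Platform team",
    "Billing service": "Payments team",
    "Dashboard": "Frontend team",
}

def affected_services(failed):
    """Return every service that depends on `failed`, directly or transitively."""
    # Reverse the edges so we can walk from a dependency to its dependents.
    dependents = {}
    for service, dependency in DEPENDS_ON:
        dependents.setdefault(dependency, []).append(service)

    seen = {failed}          # guards against cycles in the dependency graph
    queue = deque([failed])
    result = []
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in seen:
                seen.add(service)
                result.append(service)
                queue.append(service)
    return result

def affected_teams(failed):
    """Map the downstream services to their owning teams, without duplicates."""
    teams = []
    for service in affected_services(failed):
        team = OWNED_BY.get(service)
        if team is not None and team not in teams:
            teams.append(team)
    return teams

print(affected_services("PostgreSQL"))
# ['Auth service', 'Billing service', 'Dashboard']
print(affected_teams("PostgreSQL"))
# ['Platform team', 'Payments team', 'Frontend team']
```

The answer spans three hops, and no single document needs to mention PostgreSQL and the Frontend team together. That is the kind of question where a graph is worth its construction cost.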