
Graph RAG — Indexing Pipeline

How documents are processed with knowledge graph extraction.

Overview

Graph RAG indexing adds knowledge graph extraction alongside vector embeddings. The result is up to three indexes: a vector DB, an optional BM25 keyword index, and a Knowledge Graph DB.
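As a rough illustration of how one pass over the chunks could feed all three targets, here is a minimal in-memory sketch. The names `IndexTargets`, `index_chunks`, `embed`, and `extract` are hypothetical and not part of the product; the lists stand in for real databases.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class IndexTargets:
    """In-memory stand-ins for the storage targets (assumption, not the real stores)."""
    vector_db: list = field(default_factory=list)
    graph_db: list = field(default_factory=list)
    bm25: Optional[list] = None  # None means the optional keyword index is disabled


def index_chunks(
    chunks: list[str],
    embed: Callable[[str], list[float]],
    extract: Callable[[str], list[tuple[str, str, str]]],
    targets: IndexTargets,
) -> IndexTargets:
    """Send every chunk to the vector index, the optional BM25 index, and the graph."""
    for chunk in chunks:
        targets.vector_db.append((chunk, embed(chunk)))
        if targets.bm25 is not None:
            # Naive whitespace tokenization, only for illustration.
            targets.bm25.append(chunk.lower().split())
        # extract() is expected to return (source, relationship_type, target) triples.
        targets.graph_db.extend(extract(chunk))
    return targets
```

Passing `IndexTargets()` leaves the BM25 index off, while `IndexTargets(bm25=[])` turns it on.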

Pipeline Steps

1. Document Parser & Preprocessing

Same as Simple/Hybrid RAG: extract text and apply preprocessing settings.

2. Chunking Strategy

Chunks are created for both embedding and graph extraction.
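The sketch below shows one way the same chunks could be shared by the embedding and graph extraction steps: each chunk gets a stable ID that both indexes can reference. Fixed-size character windows with overlap are an assumption made for illustration; the actual chunking strategy and its settings are not specified here, and `Chunk` and `chunk_document` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # shared key for the vector index and the graph
    text: str


def chunk_document(doc_id: str, text: str, chunk_size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into overlapping fixed-size character windows."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(f"{doc_id}:{i}", text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```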

3. Embedding Model (Vector Index)

Converts chunks to dense vectors for semantic search.
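To make the shape of this step concrete, here is a toy stand-in for an embedding model. It is a hashed bag-of-words vector, not a real semantic model, and `toy_embed` and `cosine` are hypothetical names; a real pipeline would call an actual embedding model instead.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a unit-length vector by hashing each token into a bucket."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))
```

At query time, the query is embedded the same way and compared against the stored chunk vectors, usually by cosine similarity.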

4. Graph Extraction (LLM-Powered)

The LLM processes chunks to extract:

Entities: Persons, Organizations, Products, Concepts, Locations
Relationships: Hierarchical, Functional, Causal, Associative
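The entity and relationship categories above can be modeled as a small schema. The sketch below assumes the LLM is asked to reply with JSON of the form `{"entities": [...], "relationships": [...]}`; that response format and every name in the code are assumptions for illustration, not the documented output of the extraction step.

```python
import json
from dataclasses import dataclass
from enum import Enum


class EntityType(str, Enum):
    PERSON = "Person"
    ORGANIZATION = "Organization"
    PRODUCT = "Product"
    CONCEPT = "Concept"
    LOCATION = "Location"


class RelationType(str, Enum):
    HIERARCHICAL = "Hierarchical"
    FUNCTIONAL = "Functional"
    CAUSAL = "Causal"
    ASSOCIATIVE = "Associative"


@dataclass(frozen=True)
class Entity:
    name: str
    type: EntityType


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    type: RelationType


def parse_extraction(raw_json: str) -> tuple[list[Entity], list[Relationship]]:
    """Parse the assumed JSON reply; unknown types raise ValueError."""
    data = json.loads(raw_json)
    entities = [Entity(e["name"], EntityType(e["type"])) for e in data.get("entities", [])]
    relationships = [
        Relationship(r["source"], r["target"], RelationType(r["type"]))
        for r in data.get("relationships", [])
    ]
    return entities, relationships
```

Rejecting unknown types at parse time keeps malformed LLM output from reaching the graph database.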

5. Entity Recognition

Identifies and classifies named entities from the text.

6. Relationship Identification

Determines relationships between extracted entities.
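Because relationships connect extracted entities, one reasonable safeguard is to drop any relationship whose endpoints were not extracted or whose type is not one of the four categories. The sketch below assumes relationships arrive as `(source, type, target)` tuples; `filter_relationships` is a hypothetical helper.

```python
ALLOWED_TYPES = {"Hierarchical", "Functional", "Causal", "Associative"}


def filter_relationships(
    entity_names: set[str],
    relationships: list[tuple[str, str, str]],
) -> list[tuple[str, str, str]]:
    """Keep only relationships between known entities with a known type."""
    kept = []
    for source, rel_type, target in relationships:
        if source in entity_names and target in entity_names and rel_type in ALLOWED_TYPES:
            kept.append((source, rel_type, target))
    return kept
```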

7. Knowledge Graph DB

Stores entities and relationships in a graph database (Neo4j).
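As a sketch of what writing to Neo4j might look like, the helpers below build parameterized Cypher `MERGE` statements. Using the entity type as the node label, uppercasing the relationship type, and matching nodes on a `name` property are all assumptions; the actual graph schema is not documented here. Labels and relationship types are checked against an allowlist before being placed in the query text, while entity names travel as query parameters.

```python
ENTITY_LABELS = {"Person", "Organization", "Product", "Concept", "Location"}
REL_TYPES = {
    "Hierarchical": "HIERARCHICAL",
    "Functional": "FUNCTIONAL",
    "Causal": "CAUSAL",
    "Associative": "ASSOCIATIVE",
}


def entity_merge(name: str, entity_type: str) -> tuple[str, dict]:
    """Build an idempotent upsert for one entity node."""
    if entity_type not in ENTITY_LABELS:
        raise ValueError(f"unknown entity type: {entity_type}")
    return f"MERGE (e:{entity_type} {{name: $name}})", {"name": name}


def relationship_merge(source: str, rel_type: str, target: str) -> tuple[str, dict]:
    """Build an idempotent upsert for one relationship between existing nodes."""
    if rel_type not in REL_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type}")
    query = (
        "MATCH (a {name: $source}), (b {name: $target}) "
        f"MERGE (a)-[:{REL_TYPES[rel_type]}]->(b)"
    )
    return query, {"source": source, "target": target}
```

Each `(query, params)` pair could then be run through a Neo4j driver session. `MERGE` avoids creating duplicate nodes or edges when the same entity appears in several chunks.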

Key Differences from Simple/Hybrid RAG

Graph Extraction is LLM-powered (consumes graph credits)
Three storage targets: Vector DB + BM25 (optional) + Knowledge Graph DB
Entity types: Persons, Organizations, Products, Concepts, Locations
Relationship types: Hierarchical, Functional, Causal, Associative
