
Graph RAG — Indexing Pipeline

How documents are processed with knowledge graph extraction.

Overview

Graph RAG indexing adds knowledge graph extraction alongside vector embeddings. The result is up to three indexes: a vector DB, an optional BM25 index, and a Knowledge Graph DB.

Pipeline Steps

1. Document Parser & Preprocessing

Same as Simple/Hybrid RAG: extract text and apply preprocessing settings.

2. Chunking Strategy

Chunks are created for both embedding and graph extraction.
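The page does not say which chunking strategy is used, so the following is only a minimal sketch assuming fixed-size character windows with overlap. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical, not product settings. The point it illustrates is that one list of chunks can feed both the embedding step and the graph extraction step.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into overlapping character windows (assumed strategy)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# The same chunk list would be passed to both downstream steps.
chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

Overlap keeps an entity and its relationship from being split across a chunk boundary, at the cost of some repeated text.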

3. Embedding Model (Vector Index)

Converts chunks to dense vectors for semantic search.
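To show the shape of this step, here is a toy sketch: a chunk goes in, a fixed-length normalized vector comes out, and search ranks chunks by cosine similarity. The embedding below is a hashed bag-of-words stand-in, not a learned embedding model and not the model the product uses; `toy_embed`, `cosine`, and `search` are hypothetical names.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list:
    """Stand-in embedding: hash each token into one of `dim` buckets."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Unit-normalize so a dot product equals cosine similarity.
    return [v / norm for v in vec] if norm else vec


def cosine(a: list, b: list) -> float:
    """Dot product of two unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def search(query: str, chunks: list, top_k: int = 3) -> list:
    """Return the top_k (score, chunk) pairs, best match first."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

A real deployment would replace `toy_embed` with a call to the configured embedding model and store the vectors in the vector DB instead of recomputing them per query.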

4. Graph Extraction (LLM-Powered)

The LLM processes chunks to extract:

  • Entities: Persons, Organizations, Products, Concepts, Locations
  • Relationships: Hierarchical, Functional, Causal, Associative
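The extraction step can be sketched as a prompt that lists the allowed types plus a parser that discards anything outside them. This is a sketch under assumptions: the prompt wording, the JSON reply shape, the singular type labels, and the `call_llm` callable are all hypothetical, since the page does not document the actual prompt or output format.

```python
import json

# Type lists taken from the page; singular label spelling is an assumption.
ENTITY_TYPES = ["Person", "Organization", "Product", "Concept", "Location"]
RELATIONSHIP_TYPES = ["Hierarchical", "Functional", "Causal", "Associative"]


def build_extraction_prompt(chunk: str) -> str:
    """Build a prompt asking for entities and relationships as JSON."""
    return (
        "Extract entities and relationships from the text below.\n"
        f"Allowed entity types: {', '.join(ENTITY_TYPES)}.\n"
        f"Allowed relationship types: {', '.join(RELATIONSHIP_TYPES)}.\n"
        'Reply with JSON only, shaped as {"entities": [{"name", "type"}], '
        '"relationships": [{"source", "target", "type"}]}.\n\n'
        f"Text:\n{chunk}"
    )


def _as_list(value) -> list:
    return value if isinstance(value, list) else []


def parse_extraction(raw: str) -> dict:
    """Parse the model reply, keeping only well-formed, allowed items."""
    empty = {"entities": [], "relationships": []}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty
    entities = [
        e for e in _as_list(data.get("entities"))
        if isinstance(e, dict) and e.get("name") and e.get("type") in ENTITY_TYPES
    ]
    relationships = [
        r for r in _as_list(data.get("relationships"))
        if isinstance(r, dict) and r.get("source") and r.get("target")
        and r.get("type") in RELATIONSHIP_TYPES
    ]
    return {"entities": entities, "relationships": relationships}


def extract_graph(chunk: str, call_llm) -> dict:
    """call_llm is any function that takes a prompt string and returns text."""
    return parse_extraction(call_llm(build_extraction_prompt(chunk)))
```

Passing the model call in as a function keeps the sketch independent of any particular LLM client and makes it easy to test with a canned reply.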

5. Entity Recognition

Identifies and classifies named entities from the text.
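One way to picture the classification part of this step is a small normalizer that maps raw labels onto the five entity types and merges repeated mentions. This is an illustrative sketch only: the page does not say how duplicates or unknown labels are handled, so dropping unknown labels and merging on case-insensitive name are assumptions, and `Entity`, `normalize_type`, and `merge_entities` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    name: str
    type: str


# The five entity types listed on the page, keyed by lowercase singular.
_CANONICAL = {
    "person": "Person",
    "organization": "Organization",
    "product": "Product",
    "concept": "Concept",
    "location": "Location",
}


def normalize_type(label: str) -> Optional[str]:
    """Map labels such as 'Persons' or ' product ' to a canonical type."""
    key = label.strip().lower()
    if key.endswith("s"):
        key = key[:-1]  # "persons" -> "person"
    return _CANONICAL.get(key)


def merge_entities(raw_entities) -> list:
    """raw_entities: iterable of (name, type_label) pairs from any chunks."""
    seen = {}
    for name, label in raw_entities:
        etype = normalize_type(label)
        clean = " ".join(name.split())  # collapse stray whitespace
        if etype is None or not clean:
            continue  # assumption: unknown types are dropped, not guessed
        seen.setdefault((clean.casefold(), etype), Entity(clean, etype))
    return list(seen.values())
```

Merging before storage means the same entity mentioned in several chunks ends up as one node candidate instead of several.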

6. Relationship Identification

Determines relationships between extracted entities.
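A plausible check at this stage is that every relationship uses one of the four listed types and connects two entities that were actually extracted. The sketch below assumes exactly that; the page does not describe the real validation rules, and `Relationship` and `keep_valid_relationships` are hypothetical names. Rejecting self-loops is a further assumption.

```python
from dataclasses import dataclass

# Relationship types listed on the page.
RELATIONSHIP_TYPES = {"Hierarchical", "Functional", "Causal", "Associative"}


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    type: str


def keep_valid_relationships(candidates, entity_names) -> list:
    """Keep relationships with an allowed type and two known endpoints."""
    known = {name.casefold() for name in entity_names}
    valid = []
    for rel in candidates:
        src, dst = rel.source.casefold(), rel.target.casefold()
        if rel.type not in RELATIONSHIP_TYPES:
            continue  # type outside the four listed categories
        if src not in known or dst not in known:
            continue  # endpoint was never extracted as an entity
        if src == dst:
            continue  # assumption: self-loops carry no information
        valid.append(rel)
    return valid
```

For example, given the entities `Ada` and `Acme`, a `Functional` edge from `Ada` to `Acme` is kept, while an edge pointing at an unextracted name is dropped.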

7. Knowledge Graph DB

Stores entities and relationships in a graph database (Neo4j).
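As a rough illustration of what writing to Neo4j could look like, the sketch below builds Cypher `MERGE` statements with parameters. The graph schema (one node label per entity type, a `name` property, upper-case relationship types) is an assumption, as are the function names; the page does not document the actual schema. Labels and relationship types are interpolated only after a whitelist check, because plain `$` parameters cover property values rather than labels or relationship types.

```python
# Assumed schema: node label = entity type, relationship type = upper-case name.
ENTITY_LABELS = {"Person", "Organization", "Product", "Concept", "Location"}
RELATIONSHIP_TYPES = {
    "Hierarchical": "HIERARCHICAL",
    "Functional": "FUNCTIONAL",
    "Causal": "CAUSAL",
    "Associative": "ASSOCIATIVE",
}


def entity_merge(name: str, label: str) -> tuple:
    """Return (query, params) that creates the node if it does not exist."""
    if label not in ENTITY_LABELS:
        raise ValueError(f"unknown entity label: {label!r}")
    return f"MERGE (e:{label} {{name: $name}})", {"name": name}


def relationship_merge(source: str, target: str, rel_type: str) -> tuple:
    """Return (query, params) that links two existing nodes by name."""
    if rel_type not in RELATIONSHIP_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type!r}")
    query = (
        "MATCH (a {name: $source}), (b {name: $target}) "
        f"MERGE (a)-[:{RELATIONSHIP_TYPES[rel_type]}]->(b)"
    )
    return query, {"source": source, "target": target}


# With the official `neo4j` Python driver, each pair could be executed
# inside a session as session.run(query, params).
query, params = entity_merge("Neo4j", "Product")
print(query, params)
```

`MERGE` rather than `CREATE` makes re-indexing the same document idempotent: running the statements twice leaves one node and one edge, not two.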

Key Differences from Simple/Hybrid RAG

  • Graph Extraction is LLM-powered (consumes graph credits)
  • Three storage targets: Vector DB + (optional) BM25 + Knowledge Graph DB
  • Entity types: Persons, Organizations, Products, Concepts, Locations
  • Relationship types: Hierarchical, Functional, Causal, Associative