
Simple RAG — Retrieval Pipeline

How queries are processed and results returned.

Overview

When a user submits a query, Simple RAG embeds it and searches the vector database for semantically similar chunks. The results are ranked, assembled into a context, and optionally passed to an LLM for answer generation.

Pipeline Steps

1. Query Embedding

The user's query is converted to a vector using the same embedding model used during indexing.
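A minimal sketch of why the query must share the indexing model: vectors from different models do not live in the same space, so a guard at query time can catch a mismatch early. Everything here is hypothetical and for illustration only: `toy_embed` is a letter-frequency stand-in rather than a real embedding model, and `embed_query` and the model-name strings are not part of Simple RAG.

```python
from collections import Counter
import string


def toy_embed(text):
    """Stand-in embedding: a 26-dimensional letter-frequency vector.

    NOT a real embedding model; it only makes the sketch runnable.
    """
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return [float(counts[ch]) for ch in string.ascii_lowercase]


def embed_query(query, index_model, query_model, embed_fn):
    """Embed the query only if it uses the model the index was built with."""
    if query_model != index_model:
        raise ValueError(
            f"index was built with {index_model!r}, query uses {query_model!r}"
        )
    return embed_fn(query)


query_vector = embed_query("abba", "toy-v1", "toy-v1", toy_embed)
```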

2. Vector Search

A cosine similarity search over the indexed chunk vectors returns the top-K chunks most similar to the query vector.
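The search step can be sketched as a brute-force scan in plain Python. This assumes a small in-memory index (a hypothetical `dict` of chunk ID to vector); a real vector database would typically use an approximate nearest-neighbor structure instead of scoring every chunk.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=3):
    """Return the k (chunk_id, score) pairs with the highest similarity.

    `index` is a hypothetical in-memory mapping of chunk_id -> vector.
    """
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.0], index, k=2)
```

Here chunk "a" points in the same direction as the query and ranks first, "c" is 45 degrees away and ranks second, and the orthogonal chunk "b" is dropped.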

3. Result Ranking (Always Active)

Scoring Method	Purpose
Relevance	Order chunks by cosine similarity score
Diversity	Reduce duplicate information among the selected chunks
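One common way to balance these two scores is maximal marginal relevance (MMR). The source does not say which method Simple RAG uses, so the sketch below is an assumption: the function name, the `trade_off` weight, and the `similarity` callback are all hypothetical.

```python
def rerank_with_diversity(candidates, similarity, k=3, trade_off=0.7):
    """Greedy MMR-style selection (an assumed method, not confirmed by the docs).

    candidates: list of (chunk_id, relevance) pairs from vector search.
    similarity: function (chunk_id, chunk_id) -> float, chunk-to-chunk similarity.
    trade_off:  1.0 ranks purely by relevance; lower values penalize redundancy more.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:

        def mmr_score(cid):
            # Redundancy is the similarity to the closest already-selected chunk.
            redundancy = max((similarity(cid, s) for s in selected), default=0.0)
            return trade_off * remaining[cid] - (1.0 - trade_off) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected


def near_duplicate(x, y):
    # Toy similarity: "a" and "a_dup" are duplicates, everything else is unrelated.
    return 1.0 if {x, y} == {"a", "a_dup"} else 0.0


candidates = [("a", 0.90), ("a_dup", 0.89), ("b", 0.60)]
ranked = rerank_with_diversity(candidates, near_duplicate, k=2)
```

With these toy inputs, the near-duplicate "a_dup" is skipped in favor of the less relevant but non-redundant "b".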

4. Context Assembly (Always Active)

Combines the top-ranked chunks into a single, coherent context window.
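A minimal sketch of context assembly, assuming the chunks arrive already ranked and the window is limited by a character budget. The budget, the numbered `[n] (source)` format, and the function name are illustrative assumptions; a production system would more likely count tokens than characters.

```python
def assemble_context(chunks, max_chars=2000, separator="\n\n"):
    """Join ranked chunks into one context string without exceeding max_chars.

    chunks: list of (source_id, text) pairs, best first.
    Each chunk is numbered so an answer can cite it as [1], [2], ...
    """
    parts = []
    used = 0
    for number, (source_id, text) in enumerate(chunks, start=1):
        block = f"[{number}] ({source_id}) {text.strip()}"
        cost = len(block) + (len(separator) if parts else 0)
        if used + cost > max_chars:
            break  # stop at the first chunk that does not fit
        parts.append(block)
        used += cost
    return separator.join(parts)


chunks = [("doc1", "Alpha."), ("doc2", "Beta."), ("doc3", "Gamma.")]
context = assemble_context(chunks, max_chars=40)
```

Stopping at the first chunk that does not fit keeps the highest-ranked chunks and preserves their order.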

5. Response Generation (Optional — Requires LLM Integration)

Activated when: llmEnabled = true in the wizard
What it does: Passes the assembled context and the query to the LLM
Output: Natural language answer with citations
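The optional step can be sketched as a gate around a prompt builder. Only the `llmEnabled` setting comes from the documentation; the function name, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns text) are assumptions for illustration.

```python
def generate_response(query, context, llm_enabled, llm=None):
    """Return retrieval results, plus an LLM answer when generation is enabled.

    llm_enabled mirrors the llmEnabled setting; llm is a hypothetical callable
    that maps a prompt string to a completion string.
    """
    if not llm_enabled or llm is None:
        # Retrieval-only mode: return the assembled context without an answer.
        return {"answer": None, "context": context}

    prompt = (
        "Answer the question using only the numbered context below. "
        "Cite the passages you use as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
    )
    return {"answer": llm(prompt), "context": context}


def fake_llm(prompt):
    # Stand-in for a real model call so the sketch runs offline.
    return "Alpha is described in the first passage [1]."


retrieval_only = generate_response("What is Alpha?", "[1] (doc1) Alpha.", False)
with_answer = generate_response("What is Alpha?", "[1] (doc1) Alpha.", True, fake_llm)
```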
