
Hybrid RAG — Retrieval Pipeline

How queries are processed with dual vector + BM25 search.

Overview

Hybrid RAG retrieval executes two parallel searches (vector + BM25), then combines the results using a reranking strategy before returning the final ranked list.

Pipeline Steps

1. Query Embedding

For vector search, the query is embedded with the same model that was used to embed chunks at indexing time, so that query and chunk vectors are comparable.
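The point of this step is that one embedding function serves both indexing and querying. The sketch below illustrates that with a toy stand-in embedder (hashed bag-of-words); `toy_embed`, `DIM`, and `cosine` are hypothetical names for illustration and are not the product's actual embedding model or API.

```python
import hashlib
import math
import re

DIM = 64  # assumed toy dimensionality; real models use hundreds or more


def toy_embed(text: str) -> list[float]:
    """Stand-in embedder: hashed bag-of-words, L2-normalized. Not a real model."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; a plain dot product because vectors are normalized."""
    return sum(x * y for x, y in zip(a, b))


# Indexing and querying go through the same function.
chunks = ["reciprocal rank fusion", "context assembly limit"]
index = [toy_embed(c) for c in chunks]
query_vec = toy_embed("rank fusion")
similarities = [cosine(query_vec, v) for v in index]
```

If the query were embedded with a different model than the chunks, the similarity scores would not be meaningful, which is why the step insists on the same model.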

2. BM25 Query Processing

The query is tokenized for keyword matching against the BM25 inverted index.
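As a rough illustration of keyword matching, the sketch below tokenizes a query and scores chunks with the standard Okapi BM25 formula. The tokenization scheme (lowercase, split on non-alphanumerics) and the parameters `k1` and `b` are assumptions; the product's actual tokenizer and index structure are not described here, and this version scans chunks directly instead of using an inverted index.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, chunks: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one Okapi BM25 score per chunk for the given query."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A chunk that shares no terms with the query scores exactly 0.0, which is the behavior that distinguishes keyword search from semantic search.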

3. Parallel Search

  • Vector Search: Finds semantically similar chunks
  • BM25 Search: Finds chunks with exact keyword matches
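One way to run the two searches side by side is a small thread pool. This is a minimal sketch: `vector_search` and `bm25_search` are hypothetical callables standing in for the real search backends, and the `(chunk_id, score)` result shape is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Ranked = list[tuple[str, float]]  # (chunk_id, score), best first (assumed shape)


def parallel_search(query: str,
                    vector_search: Callable[[str], Ranked],
                    bm25_search: Callable[[str], Ranked]) -> tuple[Ranked, Ranked]:
    """Submit both searches at once and wait for both result lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        vec_future = pool.submit(vector_search, query)
        bm25_future = pool.submit(bm25_search, query)
        return vec_future.result(), bm25_future.result()
```

Each search returns its own ranked list; the two lists are only merged in the reranking step that follows.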

4. Hybrid Reranking

Method                        Description
RRF (Reciprocal Rank Fusion)  Combines ranks from both searches (recommended)
Weighted Sum                  Weighted combination of vector and BM25 scores
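Reciprocal Rank Fusion scores each chunk by summing 1 / (k + rank) over every result list in which it appears, so it needs only ranks, not comparable scores. The sketch below is a generic implementation; the function name `rrf_fuse` is hypothetical, and `k = 60` is the commonly used constant, assumed here because the value this product uses is not stated.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ranking lists chunk IDs best first. Returns (chunk_id, rrf_score)
    pairs sorted from highest to lowest fused score.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_ranking = ["a", "b", "c"]
bm25_ranking = ["a", "c", "d"]
fused = rrf_fuse([vector_ranking, bm25_ranking])
```

Chunks that appear in both lists accumulate two terms, so agreement between the searches is rewarded without having to normalize cosine similarities against BM25 scores.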

5. Result Ranking

The final ranking orders chunks by the output of the selected reranking method.

6. Context Assembly

Combines top chunks into context, respecting max_context_length.
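A minimal sketch of context assembly follows. It assumes `max_context_length` is measured in characters and that chunks are joined with a blank line; the source does not say whether the limit is in characters or tokens, and `assemble_context` is a hypothetical name.

```python
def assemble_context(chunks: list[str], max_context_length: int,
                     separator: str = "\n\n") -> str:
    """Join chunks in ranked order, stopping before the length limit is exceeded."""
    parts: list[str] = []
    used = 0
    for chunk in chunks:
        # The separator only costs length once there is a previous chunk.
        extra = len(chunk) + (len(separator) if parts else 0)
        if used + extra > max_context_length:
            break
        parts.append(chunk)
        used += extra
    return separator.join(parts)
```

This version stops at the first chunk that does not fit instead of truncating it, so every chunk in the context is complete.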

7. Response Generation (Optional)

  • Activated when: llmEnabled = true
  • What it does: Passes assembled context + query to LLM
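The optional generation step could look like the sketch below. The `generate` callable is a hypothetical stand-in for whatever LLM client is configured, and the prompt layout and result shape are assumptions; only the `llmEnabled` switch comes from the settings described above.

```python
from typing import Callable, Optional


def build_prompt(context: str, query: str) -> str:
    """Assumed prompt layout: context first, then the user's question."""
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def respond(query: str, context: str, llm_enabled: bool,
            generate: Optional[Callable[[str], str]] = None) -> dict:
    """Return the assembled context, plus an answer only when the LLM is enabled."""
    result = {"query": query, "context": context, "answer": None}
    if llm_enabled and generate is not None:
        result["answer"] = generate(build_prompt(context, query))
    return result
```

With `llm_enabled` set to false, the pipeline ends at context assembly and the caller receives the retrieved context only.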

Key Differences from Simple RAG

  • Two parallel searches: Vector + BM25
  • Hybrid Reranking step: Combines results using RRF or Weighted Sum
  • bm25_weight setting: Controls BM25 influence (0.0-1.0)
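To show how bm25_weight might shift the balance in a weighted-sum combination, here is a minimal sketch. It assumes both score sets are min-max normalized first and that the vector weight is 1 - bm25_weight; neither detail is confirmed by the settings described above, and the function names are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so the two score types are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {cid: 1.0 for cid in scores}
    return {cid: (s - lo) / (hi - lo) for cid, s in scores.items()}


def weighted_sum(vector_scores: dict[str, float],
                 bm25_scores: dict[str, float],
                 bm25_weight: float = 0.5) -> list[tuple[str, float]]:
    """Combine normalized scores; a missing score counts as 0.0."""
    if not 0.0 <= bm25_weight <= 1.0:
        raise ValueError("bm25_weight must be between 0.0 and 1.0")
    vec, kw = min_max(vector_scores), min_max(bm25_scores)
    combined = {
        cid: (1.0 - bm25_weight) * vec.get(cid, 0.0) + bm25_weight * kw.get(cid, 0.0)
        for cid in set(vec) | set(kw)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, a bm25_weight of 0.0 ranks purely by vector similarity and 1.0 ranks purely by keyword score, with values in between blending the two.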