
How queries are processed with dual vector + BM25 search.
Hybrid RAG retrieval executes two parallel searches (vector + BM25), then combines the results using a reranking strategy before returning the final ranked list.
For vector search, the query is embedded with the same model that was used at indexing time, so that query and chunk vectors share one embedding space.
The query is tokenized for keyword matching against the BM25 inverted index.
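
The two query paths above could be dispatched concurrently along the following lines. This is a minimal sketch, not the actual implementation: `tokenize`, `run_hybrid_search`, and the `vector_search` / `bm25_search` callables are hypothetical names, and the lowercase alphanumeric tokenizer is an assumption (the real BM25 index may use a different analyzer).

```python
import re
from concurrent.futures import ThreadPoolExecutor


def tokenize(query):
    """Assumed tokenizer: lowercase, split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", query.lower())


def run_hybrid_search(query, vector_search, bm25_search, top_k=10):
    """Run both searches in parallel and return (vector_hits, bm25_hits).

    vector_search(query, top_k) and bm25_search(tokens, top_k) are
    hypothetical callables supplied by the caller; the vector backend is
    expected to embed the query with the same model used at indexing time.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        vec_future = pool.submit(vector_search, query, top_k)
        bm25_future = pool.submit(bm25_search, tokenize(query), top_k)
        return vec_future.result(), bm25_future.result()
```

Each callable returns its own ranked list; the two lists are only merged in the reranking step.
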
| Method | Description |
|---|---|
| RRF (Reciprocal Rank Fusion) | Combines ranks from both searches (recommended) |
| Weighted Sum | Weighted combination of vector and BM25 scores |
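
To make the two methods in the table concrete, here is a small sketch of each. The function names, the `alpha` weight, and the min-max normalization are illustrative assumptions rather than the system's actual code. The constant `k = 60` is a commonly used RRF default, not a value the source specifies.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over ranked lists of ids (best first).

    Each id scores sum(1 / (k + rank)) across the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def min_max(scores):
    """Scale raw scores to [0, 1] so the two methods are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def weighted_sum_fuse(vector_scores, bm25_scores, alpha=0.5):
    """Weighted sum: alpha * vector + (1 - alpha) * BM25, after scaling.

    alpha is a hypothetical weight; ids missing from one method get 0 there.
    """
    v, b = min_max(vector_scores), min_max(bm25_scores)
    fused = {
        d: alpha * v.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0)
        for d in set(v) | set(b)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))
```

RRF uses only rank positions, so it needs no score normalization, whereas the weighted sum depends on how the raw vector and BM25 scores are scaled.
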
The final ranking is determined by the output of the selected reranking method.
The top-ranked chunks are then combined into a single context, respecting max_context_length.
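
Context assembly under a length limit might look like the sketch below. It assumes max_context_length is measured in characters and that assembly stops at the first chunk that does not fit. Both are assumptions, since the source does not say whether the limit counts characters or tokens; `build_context` and `separator` are hypothetical names.

```python
def build_context(chunks, max_context_length, separator="\n\n"):
    """Join ranked chunks (best first) without exceeding the length limit.

    Length is counted in characters here (an assumption); a token-based
    limit would swap len() for a tokenizer count.
    """
    parts = []
    used = 0
    for chunk in chunks:
        # The separator is only paid for from the second chunk onward.
        extra = len(chunk) + (len(separator) if parts else 0)
        if used + extra > max_context_length:
            break  # stop at the first chunk that would overflow
        parts.append(chunk)
        used += extra
    return separator.join(parts)
```

Stopping at the first overflow keeps the context a strict prefix of the ranking; skipping oversized chunks and continuing is an alternative that trades ranking order for fuller use of the budget.
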
llmEnabled = true