
Combine dense (embedding) and sparse (BM25) search for better coverage and precision.
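Hybrid RAG combines dense retrieval (embedding similarity) with sparse retrieval (BM25 keyword matching). Dense retrieval captures semantic meaning, so paraphrased questions still match. Sparse retrieval matches exact terms, such as product codes and error IDs, that embedding similarity can miss.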
Hybrid RAG handles mixed-domain documents better than Naive RAG since BM25 catches exact matches.
| Setting | Recommended Value |
|---|---|
| Chunking Method | Recursive |
| Chunk Size | 512-768 tokens |
| Overlap | 100 tokens |
| Respect Sentence Boundaries | Enabled |
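The chunking settings above can be illustrated with a minimal sketch. It is not the platform's chunker, whose internals this page does not describe. It assumes that whitespace-separated words approximate tokens, and the names `split_recursive` and `chunk_text` are hypothetical. The text is split at paragraph breaks first, then line breaks, then sentence ends, and the resulting pieces are packed into chunks that repeat whole trailing sentences as overlap.

```python
import re

# Coarse-to-fine separators: paragraphs, then lines, then sentence ends.
SEPARATORS = [r"\n\s*\n", r"\n", r"(?<=[.!?])\s+"]


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    return len(text.split())


def split_recursive(text: str, max_tokens: int, level: int = 0) -> list[str]:
    """Split text into pieces of at most max_tokens, coarsest separator first."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= max_tokens:
        return [text]
    if level == len(SEPARATORS):
        # No separator left (one very long sentence): fall back to a hard cut.
        words = text.split()
        return [" ".join(words[i:i + max_tokens])
                for i in range(0, len(words), max_tokens)]
    pieces = []
    for part in re.split(SEPARATORS[level], text):
        pieces.extend(split_recursive(part, max_tokens, level + 1))
    return pieces


def chunk_text(text: str, chunk_size: int = 512, overlap: int = 100) -> list[str]:
    """Pack pieces into chunks of at most chunk_size tokens, repeating whole
    trailing pieces (up to `overlap` tokens) at the start of the next chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    # Pieces are capped so that carried overlap plus one piece always fits.
    pieces = split_recursive(text, chunk_size - overlap)
    chunks, current, current_tokens = [], [], 0
    for piece in pieces:
        n = count_tokens(piece)
        if current and current_tokens + n > chunk_size:
            chunks.append(" ".join(current))
            # Carry whole trailing pieces forward so overlap starts on a boundary.
            carried, carried_tokens = [], 0
            for prev in reversed(current):
                t = count_tokens(prev)
                if carried_tokens + t > overlap:
                    break
                carried.insert(0, prev)
                carried_tokens += t
            current, current_tokens = carried, carried_tokens
        current.append(piece)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"Sentence {i} is short." for i in range(300))
    for c in chunk_text(sample, chunk_size=512, overlap=100)[:3]:
        print(count_tokens(c), c[:40])
```

Because overlap is carried as whole sentences, the actual overlap can be smaller than the configured value. This sketch makes that trade so that chunks always start and end on a sentence boundary where one exists.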

| Setting | Value |
|---|---|
| Search Method | Hybrid |
| Embedding Model | text-embedding-3-small |
| BM25 Enabled | Yes |
| Dense Weight | 0.6 |
| Sparse Weight | 0.4 |

Adjust the Dense and Sparse weights to match your query patterns: raise the Sparse weight for keyword-heavy queries (error codes, product IDs), and raise the Dense weight for natural-language questions.
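To make the effect of the two weights concrete, here is a minimal sketch of one common way to fuse dense and sparse results: min-max normalize each score set, then take a weighted sum. This page does not state how the platform combines scores, so the fusion method and the function names `min_max` and `hybrid_rank` are assumptions for illustration only.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so cosine similarities and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as an equally good match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(dense: dict[str, float], sparse: dict[str, float],
                dense_weight: float = 0.6, sparse_weight: float = 0.4):
    """Return (doc_id, combined_score) pairs, best first.

    Assumption: a document missing from one result set scores 0 on that side.
    """
    d, s = min_max(dense), min_max(sparse)
    combined = {
        doc: dense_weight * d.get(doc, 0.0) + sparse_weight * s.get(doc, 0.0)
        for doc in d.keys() | s.keys()
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    dense_scores = {"a": 0.9, "b": 0.5, "c": 0.1}    # e.g. cosine similarity
    sparse_scores = {"b": 12.0, "c": 3.0, "d": 7.0}  # e.g. raw BM25 scores
    print(hybrid_rank(dense_scores, sparse_scores))            # default 0.6 / 0.4
    print(hybrid_rank(dense_scores, sparse_scores, 0.9, 0.1))  # dense-heavy
```

With the default weights, a document that scores moderately on both sides can outrank one that is strong on the dense side only. Shifting weight toward Dense reverses that.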
BM25 parameters can be tuned in the Pipeline tab:
| Parameter | Default | Description |
|---|---|---|
| k1 | 1.5 | Term frequency saturation |
| b | 0.75 | Document length normalization |
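The roles of `k1` and `b` are easiest to see in the scoring formula itself. The sketch below computes one query term's contribution to a document's score using the widely used Okapi BM25 form with a smoothed IDF. Which exact variant the platform uses is not stated here, so treat the IDF formula and the function name `bm25_term_score` as assumptions.

```python
import math


def bm25_term_score(tf: int, doc_len: int, avg_doc_len: float,
                    n_docs: int, doc_freq: int,
                    k1: float = 1.5, b: float = 0.75) -> float:
    """Score contribution of a single query term for a single document.

    tf          occurrences of the term in the document
    doc_len     document length in tokens
    avg_doc_len average document length across the corpus
    n_docs      number of documents in the corpus
    doc_freq    number of documents containing the term
    """
    # Smoothed inverse document frequency: rarer terms weigh more.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b controls how strongly long documents are penalized (b=0 disables it).
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    # k1 controls saturation: the score approaches idf * (k1 + 1) as tf grows.
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


if __name__ == "__main__":
    for tf in (1, 2, 5, 10):
        print(tf, round(bm25_term_score(tf, 100, 100, 1000, 10), 3))
    # Same term frequency, but a document twice the average length.
    print(round(bm25_term_score(2, 200, 100, 1000, 10), 3))
```

A higher `k1` lets repeated occurrences of a term keep adding to the score for longer. A higher `b` penalizes long documents more, which matters when chunk sizes vary widely.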
Generate your API key, then integrate using the REST API, the Python SDK, or MCP:
```bash
curl -X POST "https://api.guidedmind.ai/rag/search" \
  -H "X-API-Key: rk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "ERROR-402 payment failed",
    "options": {
      "limit": 5,
      "threshold": 0.5,
      "search_method": "hybrid"
    }
  }'
```

| Metric | Naive RAG | Hybrid RAG |
|---|---|---|
| Semantic queries | 85% recall | 88% recall |
| Keyword queries | 45% recall | 92% recall |
| Mixed queries | 65% recall | 90% recall |
| Latency | ~100ms | ~150ms |
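To check how figures like these hold up on your own corpus, recall can be measured per query class with a small labeled set. The sketch below uses the standard definition of recall@k. How the numbers in the table were measured (dataset, cutoff) is not stated on this page, so the cutoff of 5 (chosen to mirror `"limit": 5` in the request above) and the helper names `recall_at_k` and `mean_recall` are assumptions.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall(runs: list[tuple[list[str], set[str]]], k: int = 5) -> float:
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    if not runs:
        raise ValueError("no queries to evaluate")
    return sum(recall_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical labeled results for two keyword-style queries.
    keyword_runs = [
        (["doc7", "doc2", "doc9"], {"doc2", "doc4"}),
        (["doc1", "doc3"], {"doc1"}),
    ]
    print(f"keyword recall@5: {mean_recall(keyword_runs):.2f}")
```

Running the same labeled queries once with the Dense-only configuration and once with Hybrid gives a like-for-like comparison on your own data.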

| Scenario | Fit |
|---|---|
| Mixed content types | Excellent |
| Product codes / error messages | Excellent |
| Natural language + keywords | Excellent |
| Legal / compliance docs | Good |
| Simple single-domain Q&A | Overkill |

Consider Graph RAG if: