
Verify your embedding model produces good similarity scores before deployment
Before deploying your RAG system, verify that your chosen embedding model produces good similarity scores for your typical queries.
Entry Point: Document Processing tab → "Test Embedding Search" button
Prerequisites: Documents must be processed (Step 1 complete)
Expected Outcome: Confirmed similarity scores are acceptable (0.7+)
Navigate to your RAG project and find the Document Processing tab. Click the "Test Embedding Search" button to open the testing interface.
The Embedding Search interface includes:
User enters query → System converts to vector → Searches chunks → Returns similarity scores
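The flow above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes cosine similarity as the scoring function (the metric the product actually uses is not stated here) and uses toy 3-dimensional vectors in place of real embeddings. `cosine_similarity` and `search_chunks` are hypothetical helper names, not part of any product API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_chunks(query_vec, chunk_vecs, top_k=3):
    """Score every chunk against the query and return the best top_k.

    Returns a list of (chunk_index, score) pairs, highest score first.
    """
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (assumption for illustration).
chunk_vectors = [
    [1.0, 0.0, 0.0],  # same direction as the query
    [0.6, 0.8, 0.0],  # partially aligned with the query
    [0.0, 0.0, 1.0],  # orthogonal to the query
]
query_vector = [1.0, 0.0, 0.0]

top_results = search_chunks(query_vector, chunk_vectors, top_k=2)
for index, score in top_results:
    print(f"chunk {index}: {score:.2f}")
```

In a real deployment the vectors come from the embedding model and the search usually runs inside a vector index, but the ranking idea is the same.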
Similarity scores (0.0 to 1.0) indicate how well each chunk matches your query:
| Score Range | Quality | Action Required |
|---|---|---|
| 0.8 - 1.0 | Excellent | Ready to proceed |
| 0.7 - 0.8 | Good | Acceptable for most use cases |
| 0.5 - 0.7 | Fair | Consider a different embedding model |
| Below 0.5 | Poor | Embedding model change required |
Score: 0.8 - 1.0 (Excellent)
The chunk is highly relevant to your query. API responses using this configuration will return accurate, on-topic results.
Score: 0.7 - 0.8 (Good)
The chunk is moderately relevant. Acceptable for most production use cases.
Score: 0.5 - 0.7 (Fair)
The chunk has some relevance but may not be what users expect. Consider adjusting your embedding model or chunking settings.
Score: Below 0.5 (Poor)
The chunk is not relevant. Your RAG system will return poor-quality responses with this configuration. Change the embedding model immediately.
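If you record scores outside the interface, the bands above can be applied with a small helper. This is a sketch: `classify_score` is a hypothetical name, and it assumes each band's lower bound is inclusive (so exactly 0.8 counts as Excellent), which the table leaves ambiguous.

```python
def classify_score(score):
    """Map a similarity score in [0.0, 1.0] to the quality bands in the table.

    Assumption: lower bounds are inclusive (0.8 -> Excellent, 0.7 -> Good).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score >= 0.8:
        return "Excellent"
    if score >= 0.7:
        return "Good"
    if score >= 0.5:
        return "Fair"
    return "Poor"


for value in (0.89, 0.75, 0.6, 0.3):
    print(value, classify_score(value))
```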
Create 5-10 representative queries your users will ask:
| Query Type | Description | Example |
|---|---|---|
| Simple Factual | Direct question with single answer | "What is the return policy?" |
| Multi-Part | Question with multiple components | "What are the return policy and refund timeline?" |
| Domain-Specific | Uses industry terminology | "What is the SLA for enterprise tier?" |
| Edge Case | Unusual or boundary query | "Can I return opened software?" |
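Keeping the test queries in a small structure makes it easy to rerun the same set after every configuration change. The sketch below uses the example queries from the table. `SampleQuery` and `check_query_set` are hypothetical names, and the rule that every query type must appear at least once is an assumption rather than a stated requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleQuery:
    query_type: str
    text: str


REQUIRED_TYPES = {"Simple Factual", "Multi-Part", "Domain-Specific", "Edge Case"}

test_queries = [
    SampleQuery("Simple Factual", "What is the return policy?"),
    SampleQuery("Multi-Part", "What are the return policy and refund timeline?"),
    SampleQuery("Domain-Specific", "What is the SLA for enterprise tier?"),
    SampleQuery("Edge Case", "Can I return opened software?"),
]


def check_query_set(queries, min_count=5, max_count=10):
    """Return a list of problems with the query set; an empty list means it looks fine."""
    problems = []
    if not min_count <= len(queries) <= max_count:
        problems.append(f"expected {min_count}-{max_count} queries, got {len(queries)}")
    # Assumption: each query type from the table should be covered at least once.
    missing = REQUIRED_TYPES - {q.query_type for q in queries}
    for query_type in sorted(missing):
        problems.append(f"no query of type: {query_type}")
    return problems


print(check_query_set(test_queries))
```

With only the four table examples the check reports that the set is one query short of the 5-10 target, which is a reminder to add queries drawn from your own users.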
Tips for Test Queries:
For each query:
Ask yourself these questions:
Relevance Check:
Score Distribution:
Coverage Check:
Query: "What is the return policy for electronics?"
Results:
─────────────────────────────────────────────────
1. [Score: 0.89] "Electronics returns accepted within 30 days..."
Source: policy.pdf, Chunk 3
2. [Score: 0.85] "Return policy overview: All products..."
Source: policy.pdf, Chunk 1
3. [Score: 0.82] "Electronics category specific rules..."
Source: electronics-faq.md, Chunk 2
4. [Score: 0.78] "Refund processing timeline..."
Source: policy.pdf, Chunk 5
5. [Score: 0.75] "Exception items: Software, DVDs..."
Source: returns.md, Chunk 4
Assessment: ✓ PASS
- All top 5 results are relevant
- Average score: 0.82 (Excellent)
- Key documents appearing in results
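The numeric parts of this assessment can be reproduced with a short script. This is a sketch under assumptions: `assess_results` is a hypothetical helper, the 0.7 average threshold is taken from the "acceptable" band, and which sources count as key documents is something you supply. Whether each result is actually relevant still needs a human read.

```python
from statistics import mean

# (score, source) pairs copied from the sample results above.
sample_results = [
    (0.89, "policy.pdf"),
    (0.85, "policy.pdf"),
    (0.82, "electronics-faq.md"),
    (0.78, "policy.pdf"),
    (0.75, "returns.md"),
]


def assess_results(results, expected_sources, min_average=0.7):
    """Summarize a result list: average score and whether expected sources appear."""
    scores = [score for score, _ in results]
    seen_sources = {source for _, source in results}
    average = mean(scores)
    missing = sorted(set(expected_sources) - seen_sources)
    return {
        "average": round(average, 2),
        "missing_sources": missing,
        "passed": average >= min_average and not missing,
    }


# Assumption: these two files are the key documents for this query.
report = assess_results(sample_results, ["policy.pdf", "electronics-faq.md"])
print(report)
```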
Option 1: Try a Larger Embedding Model
| Current Model | Upgrade To | Expected Improvement |
|---|---|---|
| all-MiniLM-L6-v2 (384D) | text-embedding-3-small (1536D) | +0.10 to +0.15 in similarity score |
| text-embedding-3-small (1536D) | text-embedding-3-large (3072D) | +0.05 to +0.10 in similarity score |
Expected API Improvement:
Option 2: Adjust Chunk Size
| Chunk Size | Effect | Recommendation |
|---|---|---|
| Too small (< 256 tokens) | May lose context | Increase to 512 |
| Too large (> 1024 tokens) | Diluted embeddings | Decrease to 768 |
| Sweet spot (512-768 tokens) | Balanced | Use for most cases |
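To see what a given chunk size does to a document, a rough splitter is enough. The sketch below counts whitespace-separated words as tokens, which is an assumption: real tokenizers usually produce more tokens than words, so treat the sizes as approximate. `chunk_by_tokens` is a hypothetical helper, not the product's chunker.

```python
def chunk_by_tokens(text, chunk_size=512):
    """Split text into consecutive chunks of at most chunk_size tokens.

    Tokens are approximated by whitespace-separated words (assumption).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]


# A synthetic 1300-word document for illustration.
document = " ".join(f"word{i}" for i in range(1300))
chunks = chunk_by_tokens(document, chunk_size=512)
chunk_lengths = [len(chunk.split()) for chunk in chunks]
print(chunk_lengths)  # [512, 512, 276]
```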
Option 3: Enable BM25 Hybrid Search
Enable BM25 when:
Benefits:
Problem: Some queries get good scores, others get poor scores
Solutions:
The similarity scores you see in Embedding Search directly translate to API responses:
{
  "query": "What is the return policy?",
  "results": [
    {
      "content": "Returns accepted within 30 days...",
      "similarity_score": 0.87,
      "source": "policy.pdf"
    },
    {
      "content": "Return policy overview...",
      "similarity_score": 0.82,
      "source": "policy.pdf"
    }
  ]
}

Key Points:
similarity_score in API matches Embedding Search scores

Proceed to Pipeline Configuration when:
Stay in Step 2 and iterate when:
| Issue | Possible Cause | Solution |
|---|---|---|
| All scores below 0.5 | Wrong embedding model | Upgrade to a larger model |
| Irrelevant top results | Chunk size too large | Reduce to 512 tokens |
| Missing key documents | Document not processed | Check processing status |
| Inconsistent scores | Mixed content types | Enable BM25 hybrid |
| Good scores but wrong answers | Chunks too small | Increase chunk size |
Once similarity scores are acceptable, proceed to Step 3: Configure RAG Pipeline to test how embedding settings work together with retrieval configuration.
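One way to make "acceptable" concrete before moving on is to check an API response of the shape shown earlier against the 0.7 threshold. This is a sketch only: `scores_acceptable` is a hypothetical helper, and requiring every returned result to meet the threshold is an assumption you may want to relax (for example, to the top result only).

```python
import json

# Response body in the shape shown earlier in this guide.
response_text = """
{
  "query": "What is the return policy?",
  "results": [
    {"content": "Returns accepted within 30 days...", "similarity_score": 0.87, "source": "policy.pdf"},
    {"content": "Return policy overview...", "similarity_score": 0.82, "source": "policy.pdf"}
  ]
}
"""


def scores_acceptable(body, threshold=0.7):
    """Return True if the response has results and every score meets the threshold."""
    payload = json.loads(body)
    scores = [result["similarity_score"] for result in payload["results"]]
    # An empty result list is treated as not acceptable.
    return bool(scores) and all(score >= threshold for score in scores)


print(scores_acceptable(response_text))
```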