
Complete guide to creating a RAG system using the RAG Wizard
The RAG Wizard guides you through a structured 5-step process to create a complete Retrieval-Augmented Generation (RAG) system. This wizard transforms the complex setup of document processing pipelines, embedding configurations, and retrieval mechanisms into an intuitive, guided workflow.
Entry Point: Dashboard → RAG → Create New
Expected Outcome: Fully configured RAG system ready for testing
Estimated Time: 15-30 minutes depending on document count
| Step | Title | What You Do | What Gets Created |
|---|---|---|---|
| 1 | Project Setup | Define use case, scale | Project configuration |
| 2 | Data Sources | Upload documents | Document library |
| 3 | Document Processing | Configure chunking | Processed chunks |
| 4 | Pipeline Configuration | Select embedding, retrieval | Search pipeline |
| 5 | API Setup | Generate API key | Ready-to-use endpoint |
Choose a descriptive name that clearly identifies your RAG system's purpose.
Example: "Customer Support Knowledge Base"
Tips:
Select the primary domain that best describes your use case:
| Domain | Best For |
|---|---|
| Customer Support | FAQ systems, help desk automation |
| Research & Academia | Literature reviews, citation assistance |
| Content Creation | Writing assistance, content generation |
| Technical Documentation | API documentation, code explanations |
| Business Intelligence | Data analysis, report generation |
| Education | Tutoring systems, course assistance |
| Legal & Compliance | Document analysis, regulation compliance |
| Healthcare | Medical knowledge bases, patient information |
Your domain selection helps optimize default settings for your use case.
Define the specific use case within your chosen domain:
Examples:
Configure your system for the anticipated load:
| Scale | Queries/Day | Configuration |
|---|---|---|
| Small | < 1,000 | Development/testing optimized |
| Medium | 1,000 - 10,000 | Production-ready |
| Large | > 10,000 | Enterprise-grade |
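
As a rough illustration, the thresholds in the table can be written as a small lookup. The function name `scale_tier` is hypothetical and not part of the wizard. The table does not say which tier exactly 10,000 queries/day belongs to, so this sketch assumes Medium.

```python
def scale_tier(queries_per_day: int) -> str:
    """Map an expected daily query volume to the scale labels in the table."""
    if queries_per_day < 0:
        raise ValueError("queries_per_day must be non-negative")
    if queries_per_day < 1_000:
        return "Small"
    # Assumption: the 1,000 - 10,000 range is inclusive at both ends.
    if queries_per_day <= 10_000:
        return "Medium"
    return "Large"
```

If your expected load sits near a boundary, it is probably safer to pick the larger tier.
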
Define the typical complexity of user queries:
| Complexity | Description | Example |
|---|---|---|
| Simple | Direct fact retrieval | "What is the return policy?" |
| Moderate | Multi-step reasoning | "How does pricing compare for enterprise?" |
| Complex | Advanced analysis | "Analyze Q3 policy impact on satisfaction" |
Select the primary type of responses your system will generate:
| Response Type | Description |
|---|---|
| Factual Answers | Direct, concise responses |
| Explanatory Responses | Detailed explanations with context |
| Analytical Insights | Data interpretation and analysis |
| Creative Content | Content generation and writing |

| Format | Extensions | Description |
|---|---|---|
| Text Documents | .txt, .md, .rtf | Plain text and markdown |
| PDF Documents | .pdf | Text-based PDFs and scanned PDFs (via OCR) |
| Office Documents | .docx, .xlsx, .pptx | Microsoft Office formats |
| Structured Data | .csv, .json, .xml | Tabular and structured data |
| Web Content | .html, .htm | HTML documents |
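
A pre-upload check against the table above might look like the following minimal sketch. `SUPPORTED_FORMATS` and `classify_upload` are illustrative names, not part of any documented client library, and the check looks only at the file extension.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-formats table.
SUPPORTED_FORMATS = {
    "Text Documents": {".txt", ".md", ".rtf"},
    "PDF Documents": {".pdf"},
    "Office Documents": {".docx", ".xlsx", ".pptx"},
    "Structured Data": {".csv", ".json", ".xml"},
    "Web Content": {".html", ".htm"},
}


def classify_upload(filename: str) -> Optional[str]:
    """Return the format category for a file name, or None if unsupported."""
    extension = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if extension in extensions:
            return category
    return None
```

Running this over a folder before uploading is one way to spot unsupported files early.
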
Drag & Drop:
File Browser:
Upon upload, documents undergo automatic processing:
Chunking divides large documents into smaller, semantically meaningful pieces:
| Strategy | Best For | Description |
|---|---|---|
| Fixed-Size | Consistent processing | Divides text into predetermined sizes |
| Semantic | Topic coherence | Divides based on semantic boundaries |
| Recursive | Complex structures | Hierarchical chunking at multiple levels |
| Document-Based | Short documents | Treats entire documents as chunks |
Chunk Size:
Overlap Percentage:
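
To make the interaction between chunk size and overlap concrete, here is a minimal sketch of the Fixed-Size strategy. It assumes sizes are measured in characters and that overlap is a percentage of the chunk size. The wizard's actual units and implementation are not specified here, and `chunk_fixed_size` is a hypothetical helper.

```python
from typing import List


def chunk_fixed_size(text: str, chunk_size: int, overlap_pct: float) -> List[str]:
    """Split text into fixed-size chunks that share overlap_pct percent of content."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in the range [0, 100)")

    overlap = int(chunk_size * overlap_pct / 100)
    step = chunk_size - overlap  # how far the window advances each time

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks
```

A larger overlap keeps more context across chunk boundaries but produces more chunks to embed and store.
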
| Model | Dimensions | Context Length | Best For |
|---|---|---|---|
| Stella-EN-1.5B-v5 | 1024D | 512 tokens | Excellent performance (MTEB score 71.19), 1.5B parameters, state-of-the-art |
| BGE-Large-EN-v1.5 | 1024D | 512 tokens | Solid performance (~64), enterprise-grade, English optimized |
| E5-Large-v2 | 1024D | 512 tokens | Good performance (~64), multilingual support, versatile |
| All-MPNet-Base-v2 | 768D | 384 tokens | General-purpose sentence embeddings, semantic search |
| BGE-M3 | 1024D | 8192 tokens | Multilingual support, 8K context, dense + sparse retrieval |
| Jina-Embeddings-v2-Base-EN | 768D | 8192 tokens | 8K context support, balanced performance, long documents |
| All-MiniLM-L6-v2 | 384D | 256 tokens | Optimized for speed, lightweight, quick processing, prototyping |
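
One practical way to read the table is to filter models by whether a typical chunk fits within the context length. The sketch below copies the Dimensions and Context Length columns into a list. `EmbeddingModel` and `models_fitting` are illustrative names, and token counts depend on each model's tokenizer, so treat the result as a first pass.

```python
from typing import List, NamedTuple


class EmbeddingModel(NamedTuple):
    name: str
    dimensions: int
    context_tokens: int


# Values copied from the model table.
MODELS = [
    EmbeddingModel("Stella-EN-1.5B-v5", 1024, 512),
    EmbeddingModel("BGE-Large-EN-v1.5", 1024, 512),
    EmbeddingModel("E5-Large-v2", 1024, 512),
    EmbeddingModel("All-MPNet-Base-v2", 768, 384),
    EmbeddingModel("BGE-M3", 1024, 8192),
    EmbeddingModel("Jina-Embeddings-v2-Base-EN", 768, 8192),
    EmbeddingModel("All-MiniLM-L6-v2", 384, 256),
]


def models_fitting(chunk_tokens: int) -> List[str]:
    """Names of models whose context length can hold a chunk of this many tokens."""
    return [m.name for m in MODELS if m.context_tokens >= chunk_tokens]
```

For example, chunks of around 1,000 tokens only fit the two 8K-context models, while 256-token chunks fit every model listed.
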
Best Practices for Model Selection:
- Stella-EN-1.5B-v5 or BGE-Large-EN-v1.5 for best overall performance
- BGE-M3 or Jina-Embeddings-v2-Base-EN with 8K context support
- All-MiniLM-L6-v2 for fast iteration, then upgrade
- E5-Large-v2 or BGE-M3 provide excellent multilingual support

| Method | Use Case | Description |
|---|---|---|
| Cosine Similarity | Most common | Normalizes for vector magnitude |
| Euclidean Distance | Absolute magnitude | Intuitive distance measurement |
| Dot Product | Performance-critical | Fastest computation |
| Manhattan Distance | Noisy data | Robust to outliers |
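
For reference, the four measures can be written in a few lines of plain Python. This is a sketch of the standard textbook definitions, not the wizard's internal implementation, and the function names are illustrative.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product divided by both vector lengths, so magnitude cancels out.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))
```

Cosine similarity and dot product are similarity scores (higher means closer), while Euclidean and Manhattan are distances (lower means closer), so the ranking direction differs between them.
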

| Method | Description | Best For |
|---|---|---|
| Custom Template | Simple template-based | Straightforward Q&A |
| Contextual Retrieval | LLM-enhanced context | Complex narratives |
| ML-Optimized | Multi-level summaries | Hierarchical content |
Enable when:
Benefits:
Primary API Key:
Key Security:
Bearer Token (Recommended):
```bash
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Header-based:
```bash
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

| Tier | Requests/Day | Requests/Minute |
|---|---|---|
| Development | 1,000 | 10 |
| Production | 100,000 | 500 |
| Enterprise | Custom | Custom |
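
The Bearer-token curl call can also be sketched in Python using only the standard library, together with a simple pacing value derived from the Requests/Minute column. The helper names `build_query_request` and `min_seconds_between_requests` are hypothetical. The response format is not described in this guide, so the sketch stops at building the request.

```python
import json
import urllib.request

BASE_URL = "https://api.guidedmind.ai/v1/rag"

# Per-minute limits copied from the table; Enterprise is custom and omitted.
RATE_LIMITS_PER_MINUTE = {"Development": 10, "Production": 500}


def build_query_request(project_id: str, api_key: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/query",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def min_seconds_between_requests(tier: str) -> float:
    """Spacing between calls that stays within the tier's per-minute limit."""
    return 60.0 / RATE_LIMITS_PER_MINUTE[tier]
```

To send the request, pass the result to `urllib.request.urlopen` and sleep for at least `min_seconds_between_requests(tier)` between calls. On the Development tier that works out to one request every 6 seconds.
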
If you're setting up GraphRAG, additional configuration is available:
Graph Schema:
NER Analysis:
| Setting | Description | Recommended |
|---|---|---|
| Max Nodes | Control graph size | 100-200 for testing |
| Extraction Scenario | Domain-specific prompts | Business or Technical |
At each step, the wizard validates:
The wizard is complete when:
After completing the wizard, proceed to Step 2: Test Embedding Search to verify your embedding model produces good similarity scores before deployment.
Before proceeding to Step 2: