Guided Mind v2.4
RAG Wizard

Step 5 — Pipeline Configuration

Configure embeddings, retrieval methods, search methods, and RAG pipeline parameters.

The pipeline configuration step determines how your documents are embedded, searched, and retrieved. This is the most critical step for retrieval quality.

Embedding Settings

Embedding settings control how your document text is converted into numerical vectors. The embedding model transforms text into points in multi-dimensional space, where semantically similar content ends up closer together. Your choice of model directly impacts search accuracy, speed, and cost.

Setting | What It Does
Embedding Model | Selects the AI model that converts text into vector representations for similarity search
Dimensions | Vector size (auto-selected based on the chosen model)
Similarity Function | Determines how similarity between vectors is calculated

Available Embedding Models

Model | Dimensions | Context | Best For
all-MiniLM-L6-v2 | 384 | 256 tokens | Fast prototyping and lightweight use
all-mpnet-base-v2 | 768 | 514 tokens | Balanced quality and speed
Jina-Embeddings-v2-Base-EN | 768 | 8192 tokens | Long documents
BGE-Large-EN-v1.5 | 1024 | 512 tokens | Enterprise-grade performance
E5-Large-v2 | 1024 | 512 tokens | High-quality English retrieval
BGE-M3 | 1024 | 8192 tokens | Multilingual + long context
Stella-EN-1.5B-v5 | 1024 | 512 tokens | State-of-the-art performance (NovaSearch)
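
The model table can also be read as data. The sketch below copies the dimensions and context sizes from the table into a dictionary and filters models by whether a chunk of a given token count fits their context window. The names EMBEDDING_MODELS and models_fitting are hypothetical and are not part of the product.

```python
# Values copied from the model table; the dict and helper names are hypothetical.
EMBEDDING_MODELS = {
    "all-MiniLM-L6-v2": {"dimensions": 384, "context_tokens": 256},
    "all-mpnet-base-v2": {"dimensions": 768, "context_tokens": 514},
    "Jina-Embeddings-v2-Base-EN": {"dimensions": 768, "context_tokens": 8192},
    "BGE-Large-EN-v1.5": {"dimensions": 1024, "context_tokens": 512},
    "E5-Large-v2": {"dimensions": 1024, "context_tokens": 512},
    "BGE-M3": {"dimensions": 1024, "context_tokens": 8192},
    "Stella-EN-1.5B-v5": {"dimensions": 1024, "context_tokens": 512},
}


def models_fitting(chunk_tokens: int) -> list[str]:
    """Return the models whose context window can hold a chunk of this size."""
    return [
        name
        for name, spec in EMBEDDING_MODELS.items()
        if spec["context_tokens"] >= chunk_tokens
    ]


if __name__ == "__main__":
    # Chunks of about 1000 tokens only fit the two long-context models.
    print(models_fitting(1000))
```

A check like this is mainly useful when the chunk size chosen in an earlier step is larger than the context window of the model you intend to use.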

Some embedding models are only available on higher-tier plans. Check your plan details or contact sales for model availability.

Similarity Functions

Function | Description
Cosine | Measures angle between vectors (recommended for most cases)
Euclidean | Measures straight-line distance between vectors
Dot Product | Measures alignment between vectors
Manhattan | Measures distance using grid-like paths
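
For reference, the four functions can be written out in a few lines of plain Python. This is a minimal sketch using the standard textbook definitions, not the product's implementation. Note that cosine and dot product grow with similarity, while Euclidean and Manhattan are distances that shrink with it.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b (undefined for zero vectors)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a: list[float], b: list[float]) -> float:
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: list[float], b: list[float]) -> float:
    """Distance along axis-aligned (grid-like) paths."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    u, v = [1.0, 0.0], [0.0, 1.0]
    print(cosine(u, v), dot(u, v), euclidean(u, v), manhattan(u, v))
```

For vectors normalized to unit length, cosine and dot product give the same ranking, which is one reason cosine is a safe default.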

Retrieval Method

The retrieval method determines how the chunks found by your search are assembled into the context window sent to the LLM. The methods differ in how much context they add, ranging from raw chunk content to LLM-enhanced summaries that include surrounding document structure and extracted entities.

Setting | What It Does
Retrieval Method | Determines how retrieved chunks are assembled into context for the LLM

Custom Document Template

Setting | What It Does
Document Template | Template string that controls how chunk content and metadata are formatted

Available Tags: {content}, {author}, {title}, {created_date}, {source}, {chunk_index}
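
As an illustration of how such a template might be filled, the sketch below substitutes the available tags with Python's built-in string formatting. The example template and the render_chunk helper are hypothetical. Rendering a missing metadata field as an empty string is an assumption about the desired behavior, not documented product behavior.

```python
class _BlankMissing(dict):
    """Dict that yields an empty string for tags the chunk has no value for."""

    def __missing__(self, key: str) -> str:
        return ""


def render_chunk(template: str, chunk: dict) -> str:
    """Fill {tag} placeholders in the template from the chunk's fields."""
    return template.format_map(_BlankMissing(chunk))


if __name__ == "__main__":
    # Hypothetical template built from the available tags.
    template = "Title: {title}\nSource: {source}\n\n{content}"
    chunk = {
        "title": "Onboarding Guide",
        "source": "guide.pdf",
        "content": "Welcome to the team.",
        "chunk_index": 0,
    }
    print(render_chunk(template, chunk))
```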

Contextual Retrieval

Setting | What It Does
Contextual Retrieval Template | Template for LLM-enhanced context assembly
LLM Model | Selects the LLM used to enhance chunk context with surrounding document information

Default Template: Context: {full_document}\n\nChunk: {chunk_context}

Available Tags: {full_document}, {chunk_context}
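
One way to picture contextual retrieval: the template is filled with the full document and the chunk, the result is sent to the selected LLM, and the LLM's output is stored with the chunk. The sketch below follows that reading. The helper names are hypothetical, the llm argument stands in for any callable that maps a prompt to text, and prepending the output to the chunk is an assumption rather than documented behavior.

```python
from typing import Callable

# Default template from this page; "\n" is interpreted as a newline here.
DEFAULT_TEMPLATE = "Context: {full_document}\n\nChunk: {chunk_context}"


def build_context_prompt(
    full_document: str, chunk_context: str, template: str = DEFAULT_TEMPLATE
) -> str:
    """Fill the contextual retrieval template for one chunk."""
    return template.format(full_document=full_document, chunk_context=chunk_context)


def enhance_chunk(
    full_document: str, chunk_context: str, llm: Callable[[str], str]
) -> str:
    """Prepend LLM-written context to the chunk (assumed behavior)."""
    situating_text = llm(build_context_prompt(full_document, chunk_context))
    return f"{situating_text}\n\n{chunk_context}"


if __name__ == "__main__":
    # Stand-in for a real model call, used only to make the sketch runnable.
    fake_llm = lambda prompt: "This chunk comes from the refund policy section."
    print(enhance_chunk("...full policy text...", "Refunds take 5 days.", fake_llm))
```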

ML-Optimized Contextual Retrieval

Setting | What It Does
ML Contextual Retrieval Template | Template for ML-enhanced context with summaries and entity extraction
LLM Model | Optional LLM for additional context enhancement

Default Template: Document Summary: {full_document_summary}\nSection: {parent_section_summary}\n\nChunk: {chunk_context}

Available Tags: {chunk_context}, {full_document_summary}, {parent_section_summary}, {entities}, {topics}, {sentiment}, {key_phrases}
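
Because this template accepts more tags, a custom version is easy to get wrong by using a tag from another template. The sketch below checks a template against the tag list using Python's standard string.Formatter parser. The unknown_tags helper is hypothetical, and whether the product itself validates templates is not stated here.

```python
from string import Formatter

# Tag names copied from the "Available Tags" list for this template.
ML_TAGS = {
    "chunk_context",
    "full_document_summary",
    "parent_section_summary",
    "entities",
    "topics",
    "sentiment",
    "key_phrases",
}

DEFAULT_ML_TEMPLATE = (
    "Document Summary: {full_document_summary}\n"
    "Section: {parent_section_summary}\n\n"
    "Chunk: {chunk_context}"
)


def unknown_tags(template: str, allowed: set[str]) -> list[str]:
    """Return the placeholders in the template that are not in the allowed set."""
    used = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(used - allowed)


if __name__ == "__main__":
    print(unknown_tags(DEFAULT_ML_TEMPLATE, ML_TAGS))
    # {author} belongs to the document template, not this one.
    print(unknown_tags("Topics: {topics}\nBy: {author}", ML_TAGS))
```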

Query Template

Query templates let you reformat user queries before they are embedded and searched, so that different types of questions can be processed differently. For example, wrapping queries in a "Question: ..." format can improve results for Q&A use cases.

Setting | What It Does
Query Template | Template for processing user queries before searching

Template Presets

Preset | Template | Use Case
Question Answering | Question: {query}\nAnswer: | Q&A chatbots
Code Search | Find code related to: {query} | Code repositories
Keyword Search | {query} | Direct keyword matching
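
The presets amount to simple string substitution on the {query} tag. A minimal sketch, with a hypothetical QUERY_PRESETS dictionary and apply_query_template helper, and with "\n" interpreted as a newline:

```python
# Preset names and templates copied from the table; the names of the
# dictionary and the helper are hypothetical.
QUERY_PRESETS = {
    "Question Answering": "Question: {query}\nAnswer:",
    "Code Search": "Find code related to: {query}",
    "Keyword Search": "{query}",
}


def apply_query_template(preset: str, query: str) -> str:
    """Return the text that would be embedded and searched for this query."""
    return QUERY_PRESETS[preset].format(query=query)


if __name__ == "__main__":
    for name in QUERY_PRESETS:
        print(repr(apply_query_template(name, "parse a date string")))
```

The Keyword Search preset passes the query through unchanged, which suits exact-match lookups where extra wording would only add noise.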

Search Method

The search method defines how your system finds relevant chunks when a query is made. Dense (vector) search finds semantically similar content, while sparse (BM25) search matches exact keywords. Hybrid search combines both, catching conceptual matches as well as exact keyword hits.

Setting | What It Does
Search Method | Selects the search strategy for finding relevant chunks
Enable BM25 | Adds keyword-based BM25 search alongside vector search
BM25 Weight | Controls how much BM25 scores influence final rankings (0-1)
Hybrid Rerank Method | Determines how vector and BM25 results are combined

Search Methods

Method | Description
Dense | Pure vector (embedding) search
Sparse | BM25 keyword search only
Hybrid | Vector + BM25 with reranking
Graph | Vector + knowledge graph traversal

Reranking Methods

Method | Description
RRF | Reciprocal Rank Fusion (recommended)
Reciprocal Rank Fusion | Full RRF with configurable window
Weighted Sum | Weighted combination of scores
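
To make the two fusion styles concrete, here is a minimal sketch of both. Reciprocal Rank Fusion scores each chunk by summing 1 / (k + rank) over the result lists it appears in, so it needs only ranks. The constant k = 60 is the value commonly used in the literature, and whether this product uses it is an assumption. The weighted sum blends scores directly. The min-max normalization and the way bm25_weight is applied are also assumptions, chosen so that vector and BM25 scores are on a comparable scale.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    return sorted(scores, key=lambda c: (-scores[c], c))


def _min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to the range 0..1 (all 1.0 if every score is equal)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {c: (s - lo) / span if span else 1.0 for c, s in scores.items()}


def weighted_sum(
    vector_scores: dict[str, float],
    bm25_scores: dict[str, float],
    bm25_weight: float,
) -> list[str]:
    """Blend normalized scores; bm25_weight in 0..1 is the BM25 share."""
    v, b = _min_max(vector_scores), _min_max(bm25_scores)
    combined = {
        c: (1.0 - bm25_weight) * v.get(c, 0.0) + bm25_weight * b.get(c, 0.0)
        for c in set(v) | set(b)
    }
    return sorted(combined, key=lambda c: (-combined[c], c))


if __name__ == "__main__":
    print(rrf([["a", "b", "c"], ["b", "c", "d"]]))
    print(weighted_sum({"a": 0.9, "b": 0.5}, {"b": 10.0, "c": 2.0}, 0.3))
```

Rank-based fusion sidesteps the fact that BM25 scores are unbounded while cosine similarities are not, which may be why RRF is the recommended option.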

RAG Pipeline Settings

Pipeline settings control the final stages of query processing: how many results are returned, what quality threshold they must meet, and how the retrieved context is assembled before being sent to the LLM. These settings balance result quantity against quality and determine whether your system returns raw search results or generates natural language answers.

Setting | What It Does
Top K | Number of results to return per query
Score Threshold | Minimum similarity score (results below this are filtered out)
Context Assembly | Strategy for ordering and combining retrieved chunks
Max Context Length | Maximum number of tokens in the assembled context
LLM Integration | Enables LLM-powered answer generation (requires LLM model integration)
LLM Model | Selects the LLM model for generating answers
Temperature | Controls LLM creativity (0 = deterministic, 1 = more creative)
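
Top K and Score Threshold interact: the threshold removes weak matches, and Top K caps what remains. The sketch below shows one plausible order of operations. The select_results helper is hypothetical. Applying the threshold before the cap and treating a score equal to the threshold as passing are both assumptions.

```python
def select_results(
    scored: list[tuple[str, float]], top_k: int, score_threshold: float
) -> list[tuple[str, float]]:
    """Drop results below the threshold, then keep the top_k best."""
    kept = [(chunk_id, s) for chunk_id, s in scored if s >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    results = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.65)]
    # "b" fails the threshold; "d" passes it but falls outside the top 2.
    print(select_results(results, top_k=2, score_threshold=0.6))
```

With a high threshold the function can return fewer than top_k results, or none at all, so downstream code should handle an empty context.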

Context Assembly Methods

Method | Description
Ranked | Assembles chunks by relevance score (recommended)
Sequential | Assembles chunks in original document order
Weighted | Combines relevance score and document position
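
The three assembly methods differ only in how chunks are ordered before they are joined and cut to the maximum context length. The sketch below is one possible reading. The Chunk type, the helper names, the blending formula for the weighted method (with a hypothetical position_weight parameter), and counting tokens by whitespace splitting are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # relevance score from search
    index: int    # position of the chunk in its source document


def order_chunks(
    chunks: list[Chunk], method: str, position_weight: float = 0.3
) -> list[Chunk]:
    """Order chunks according to the chosen assembly method."""
    if method == "ranked":
        return sorted(chunks, key=lambda c: c.score, reverse=True)
    if method == "sequential":
        return sorted(chunks, key=lambda c: c.index)
    if method == "weighted":
        last = max((c.index for c in chunks), default=0) or 1

        def blended(c: Chunk) -> float:
            # Assumed formula: earlier chunks get a larger position bonus.
            position = 1.0 - c.index / last
            return (1.0 - position_weight) * c.score + position_weight * position

        return sorted(chunks, key=blended, reverse=True)
    raise ValueError(f"unknown assembly method: {method}")


def assemble(chunks: list[Chunk], method: str, max_tokens: int) -> str:
    """Join ordered chunks, stopping before max_tokens would be exceeded."""
    parts, used = [], 0
    for chunk in order_chunks(chunks, method):
        n_tokens = len(chunk.text.split())  # crude stand-in for a tokenizer
        if used + n_tokens > max_tokens:
            break
        parts.append(chunk.text)
        used += n_tokens
    return "\n\n".join(parts)


if __name__ == "__main__":
    chunks = [
        Chunk("intro text", 0.5, 0),
        Chunk("middle part here", 0.9, 1),
        Chunk("end", 0.7, 2),
    ]
    print(assemble(chunks, "ranked", max_tokens=4))
```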

Start with defaults and adjust top_k and score_threshold based on your evaluation results. Higher thresholds reduce noise but may miss relevant chunks.