Data Sources

Upload and organize documents for your RAG system in Step 2 of the wizard.

The Data Sources step is where you upload the documents that power your RAG system. You can upload files directly in the wizard or use the API for programmatic uploads.

Supported File Formats

Format      Extension     Max Size  Notes
PDF         .pdf          25MB      Text-extractable; OCR available for scanned
Plain Text  .txt          10MB      UTF-8 encoding recommended
Markdown    .md           10MB      Preserves headers and formatting
Word        .docx         25MB      Extracts text and basic structure
Excel       .xlsx, .xls   25MB      Each sheet processed separately
CSV         .csv          10MB      Rows become individual chunks
HTML        .html, .htm   10MB      Tags stripped, content extracted
JSON        .json         10MB      Must be valid JSON structure

Scanned PDFs require OCR processing, which increases processing time. Enable OCR in the Document Processing step if your PDFs contain images of text.
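
The limits in the table can be checked locally before anything is uploaded. The following is a minimal sketch, not part of the Guided Mind SDK: the function name check_upload is hypothetical, and it assumes that "MB" in the table means 1024 * 1024 bytes.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the table's "MB" means 2**20 bytes

# Extension -> maximum size in bytes, transcribed from the table above.
MAX_SIZE_BYTES = {
    ".pdf": 25 * MB,
    ".txt": 10 * MB,
    ".md": 10 * MB,
    ".docx": 25 * MB,
    ".xlsx": 25 * MB,
    ".xls": 25 * MB,
    ".csv": 10 * MB,
    ".html": 10 * MB,
    ".htm": 10 * MB,
    ".json": 10 * MB,
}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    limit = MAX_SIZE_BYTES.get(ext)
    if limit is None:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    elif size_bytes > limit:
        problems.append(f"{size_bytes} bytes exceeds the {limit // MB}MB limit for {ext}")
    return problems
```

Calling check_upload("manual.pdf", 2456789) returns an empty list, while an unsupported extension or an oversized file returns a one-item list describing the problem.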

Upload Methods

Dashboard Upload (Wizard)

  1. Navigate to Step 2: Data Sources in the wizard
  2. Drag and drop files or click to browse
  3. Select multiple files or upload one at a time
  4. Click Upload to start processing

API Upload

Upload documents programmatically using the POST /rag/upload endpoint:

curl -X POST "https://api.guidedmind.ai/rag/upload" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf"
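
For environments without curl, the same request can be assembled with the Python standard library. This is a sketch under assumptions: the helper name build_upload_request is hypothetical, and the part's Content-Type of application/octet-stream is a guess (curl may send a more specific type). The request is only built here, not sent.

```python
import urllib.request
import uuid

UPLOAD_URL = "https://api.guidedmind.ai/rag/upload"


def build_upload_request(filename: str, content: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with one "file" part."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        # Assumption: a generic content type is acceptable to the endpoint.
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    request = urllib.request.Request(UPLOAD_URL, data=body, method="POST")
    request.add_header("X-API-Key", api_key)
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request


request = build_upload_request("document.pdf", b"%PDF-1.7 example bytes", "rk_your_key_here")
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen(request).
```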

Upload & Process

Upload and immediately process a document (chunking + embedding) in one request:

curl -X POST "https://api.guidedmind.ai/rag/upload-and-process" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf" \
  -F 'config={"chunking":{"chunk_size":512,"chunk_overlap":50}}'

Use upload-and-process when you want the document searchable immediately. Use upload when you want to configure chunking settings first.
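
The config form field in the request above is a JSON string, so it is easiest to build it from a dictionary rather than by hand. A minimal sketch follows; build_chunking_config is a hypothetical helper, and the sanity check that the overlap must be smaller than the chunk size is an assumption, not a documented server rule.

```python
import json


def build_chunking_config(chunk_size: int = 512, chunk_overlap: int = 50) -> str:
    """Serialize chunking settings into the JSON string sent as the config form field."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        # Assumption: an overlap as large as the chunk itself is never intended.
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    config = {"chunking": {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap}}
    return json.dumps(config, separators=(",", ":"))


print(build_chunking_config())
# prints {"chunking":{"chunk_size":512,"chunk_overlap":50}}
```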

Python SDK

from guidedmind import Client
 
client = Client()
 
# Upload only
response = client.documents.upload(file_path="document.pdf")
print(f"Uploaded: {response.document_id}")
 
# Upload and process
response = client.documents.upload_and_process(file_path="document.pdf")
print(f"Processed: {response.chunks_created} chunks created")

Upload Response

{
  "document_id": "doc_abc123",
  "filename": "product-manual.pdf",
  "size_bytes": 2456789,
  "status": "uploaded",
  "uploaded_at": "2026-05-20T00:00:00Z"
}
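
When calling the API without the SDK, the response body can be turned into a typed object. This sketch assumes the response always contains the five fields shown above; UploadResult and parse_upload_response are hypothetical names, not SDK classes.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UploadResult:
    document_id: str
    filename: str
    size_bytes: int
    status: str
    uploaded_at: datetime


def parse_upload_response(raw: str) -> UploadResult:
    """Parse the JSON body of an upload response into an UploadResult."""
    data = json.loads(raw)
    # Older Python versions reject a trailing "Z" in fromisoformat, so normalize it.
    uploaded_at = datetime.fromisoformat(data["uploaded_at"].replace("Z", "+00:00"))
    return UploadResult(
        document_id=data["document_id"],
        filename=data["filename"],
        size_bytes=int(data["size_bytes"]),
        status=data["status"],
        uploaded_at=uploaded_at,
    )


sample = """{
  "document_id": "doc_abc123",
  "filename": "product-manual.pdf",
  "size_bytes": 2456789,
  "status": "uploaded",
  "uploaded_at": "2026-05-20T00:00:00Z"
}"""
result = parse_upload_response(sample)
print(result.document_id, result.status, result.uploaded_at.isoformat())
# prints doc_abc123 uploaded 2026-05-20T00:00:00+00:00
```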

Document Metadata

Attach custom metadata to documents for filtering and organization:

curl -X POST "https://api.guidedmind.ai/rag/upload" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf" \
  -F 'metadata={"department":"engineering","version":"2.1","category":"api-docs"}'

Metadata is included in each search result when the search request sets include_metadata: true:

{
  "content": "The API supports REST and GraphQL endpoints...",
  "score": 0.89,
  "metadata": {
    "department": "engineering",
    "version": "2.1",
    "category": "api-docs",
    "source": "api-reference.pdf"
  }
}
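
Once results carry metadata, they can be narrowed on the client side. The sketch below assumes results shaped like the example above; filter_by_metadata is a hypothetical helper, the second result is invented purely for illustration, and any server-side filtering the API may offer is not shown.

```python
def filter_by_metadata(results: list[dict], **required: str) -> list[dict]:
    """Keep results whose metadata contains every required key/value pair."""
    return [
        r for r in results
        if all(r.get("metadata", {}).get(k) == v for k, v in required.items())
    ]


results = [
    {"content": "The API supports REST and GraphQL endpoints...", "score": 0.89,
     "metadata": {"department": "engineering", "version": "2.1",
                  "category": "api-docs", "source": "api-reference.pdf"}},
    # Hypothetical second hit, added only to show the filter excluding something.
    {"content": "Expense reports are due monthly...", "score": 0.74,
     "metadata": {"department": "finance", "source": "policy.pdf"}},
]

engineering = filter_by_metadata(results, department="engineering", version="2.1")
print([r["metadata"]["source"] for r in engineering])
# prints ['api-reference.pdf']
```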

File Preparation Tips

Do
  • Use text-extractable PDFs (not scanned images)
  • Name files descriptively (e.g., product-manual-v2.pdf)
  • Remove password protection before uploading
  • Split very large documents into logical sections

Don't
  • Upload corrupted or password-protected files
  • Mix unrelated content in one document
  • Upload duplicates without checking first
  • Use generic names like document.pdf
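
Two of the tips above, checking for duplicates and avoiding generic names, are easy to automate before an upload batch. This is a minimal local sketch: it treats files as duplicates only when their bytes are identical, and the set of generic names is an illustrative assumption rather than a list the service enforces.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

GENERIC_STEMS = {"document", "file", "untitled", "new", "scan"}  # illustrative list


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group filenames whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def has_generic_name(filename: str) -> bool:
    """Return True when the name without its extension is in GENERIC_STEMS."""
    return Path(filename).stem.lower() in GENERIC_STEMS


files = {
    "product-manual-v2.pdf": b"manual contents",
    "copy-of-manual.pdf": b"manual contents",
    "document.pdf": b"something else",
}
print(find_duplicates(files))
# prints [['copy-of-manual.pdf', 'product-manual-v2.pdf']]
print([name for name in files if has_generic_name(name)])
# prints ['document.pdf']
```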

Managing Documents

After upload, you can:

  • View document status and processing progress
  • Delete documents that are no longer needed
  • Re-process with different chunking settings
  • Check metadata attached to each document

Next Step

After uploading your documents, move to Document Processing to configure chunking and text processing options.