
Upload and organize documents for your RAG system in Step 2 of the wizard.

The Data Sources step is where you upload the documents that power your RAG system. You can upload files directly in the wizard or use the API for programmatic uploads. The following file formats are supported:

| Format | Extension | Max Size | Notes |
|---|---|---|---|
| PDF | .pdf | 25MB | Text-extractable; OCR available for scanned PDFs |
| Plain Text | .txt | 10MB | UTF-8 encoding recommended |
| Markdown | .md | 10MB | Preserves headers and formatting |
| Word | .docx | 25MB | Extracts text and basic structure |
| Excel | .xlsx, .xls | 25MB | Each sheet processed separately |
| CSV | .csv | 10MB | Rows become individual chunks |
| HTML | .html, .htm | 10MB | Tags stripped, content extracted |
| JSON | .json | 10MB | Must be valid JSON structure |

Scanned PDFs require OCR, which increases processing time. Enable OCR in the Document Processing step if your PDFs contain images of text.
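
The limits in the table above can be checked locally before a file is sent. The following is a minimal sketch, not part of the product: the `check_upload` helper and the `MAX_SIZE_MB` mapping are hypothetical names, and it assumes 1 MB means 1024 × 1024 bytes, which the service may count differently.

```python
from pathlib import Path

# Size limits in megabytes, copied from the supported-formats table.
MAX_SIZE_MB = {
    ".pdf": 25, ".txt": 10, ".md": 10, ".docx": 25,
    ".xlsx": 25, ".xls": 25, ".csv": 10,
    ".html": 10, ".htm": 10, ".json": 10,
}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    limit_mb = MAX_SIZE_MB.get(ext)
    if limit_mb is None:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    # Assumption: 1 MB = 1024 * 1024 bytes; the service may count differently.
    elif size_bytes > limit_mb * 1024 * 1024:
        problems.append(f"{ext} files are limited to {limit_mb}MB")
    return problems
```

Calling `check_upload("product-manual.pdf", 2456789)` returns an empty list, since the file is a PDF well under 25MB.
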

Upload documents programmatically using the `POST /rag/upload` endpoint:

```bash
curl -X POST "https://api.guidedmind.ai/rag/upload" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf"
```

Upload and immediately process a document (chunking + embedding) in one request:

```bash
curl -X POST "https://api.guidedmind.ai/rag/upload-and-process" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf" \
  -F 'config={"chunking":{"chunk_size":512,"chunk_overlap":50}}'
```

Use `upload-and-process` when you want the document searchable immediately. Use `upload` when you want to configure chunking settings first.
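
When the `config` form field is generated by a script rather than typed by hand, building it from a dictionary avoids quoting mistakes. This is a rough sketch: `build_config_field` is a hypothetical helper, and the rule that the overlap must be smaller than the chunk size is an assumption, not something the documentation states.

```python
import json


def build_config_field(chunk_size: int = 512, chunk_overlap: int = 50) -> str:
    """Serialize chunking settings for the `config` form field."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Assumption: overlap must be non-negative and smaller than the chunk size.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and < chunk_size")
    config = {"chunking": {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap}}
    # Compact separators reproduce the form used in the curl example.
    return json.dumps(config, separators=(",", ":"))
```

With the default arguments, the returned string matches the `config` value in the curl command for `upload-and-process`.
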

The same operations are available from the Python client:

```python
from guidedmind import Client

client = Client()

# Upload only
response = client.documents.upload(file_path="document.pdf")
print(f"Uploaded: {response.document_id}")

# Upload and process
response = client.documents.upload_and_process(file_path="document.pdf")
print(f"Processed: {response.chunks_created} chunks created")
```

A successful upload returns a response like this:

```json
{
  "document_id": "doc_abc123",
  "filename": "product-manual.pdf",
  "size_bytes": 2456789,
  "status": "uploaded",
  "uploaded_at": "2026-05-20T00:00:00Z"
}
```

Attach custom metadata to documents for filtering and organization:

```bash
curl -X POST "https://api.guidedmind.ai/rag/upload" \
  -H "X-API-Key: rk_your_key_here" \
  -F "file=@/path/to/document.pdf" \
  -F 'metadata={"department":"engineering","version":"2.1","category":"api-docs"}'
```

Metadata is included in search results when `include_metadata: true` is set:

```json
{
  "content": "The API supports REST and GraphQL endpoints...",
  "score": 0.89,
  "metadata": {
    "department": "engineering",
    "version": "2.1",
    "category": "api-docs",
    "source": "api-reference.pdf"
  }
}
```

Use descriptive filenames (for example, product-manual-v2.pdf) rather than generic ones such as document.pdf.

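
Once documents carry metadata, search results can also be narrowed on the client side. The sketch below assumes each result is a dictionary shaped like the search result example (`content`, `score`, `metadata`). `filter_results` is a hypothetical helper, not an SDK function, and server-side filtering may be available but is not covered here.

```python
from typing import Any


def filter_results(
    results: list[dict[str, Any]],
    min_score: float = 0.0,
    **required: str,
) -> list[dict[str, Any]]:
    """Keep results whose score and metadata match the given constraints."""
    kept = []
    for result in results:
        metadata = result.get("metadata", {})
        # Drop results that fall below the relevance threshold.
        if result.get("score", 0.0) < min_score:
            continue
        # Every required key must be present with exactly the requested value.
        if all(metadata.get(key) == value for key, value in required.items()):
            kept.append(result)
    return kept
```

For instance, `filter_results(results, min_score=0.5, department="engineering")` keeps only sufficiently relevant results from documents tagged with that department.
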
After uploading your documents, move to Document Processing to configure chunking and text processing options.