Python SDK

Document Management

Upload and process files for immediate RAG indexing via the SDK.

client.documents gives you programmatic control over document ingestion — useful for automated pipelines, CI/CD workflows, and batch processing scripts.

Upload and Process

The most common operation: upload a file and trigger immediate chunking, embedding, and indexing so it's searchable straight away.

from guidedmind import Client
 
client = Client()
 
response = client.documents.upload_and_process(
    file_path="docs/technical-spec.pdf",
    metadata={"team": "engineering", "version": "2.1", "env": "production"},
)
 
print(f"Document ID   : {response.document_id}")
print(f"Chunks created: {response.chunks_created}")
print(f"Status        : {response.status}")

Upload Without Processing

Use upload() when you want to stage a file and trigger processing separately (e.g. during a maintenance window).

upload = client.documents.upload(file_path="handbook.pdf")
print(f"Uploaded: {upload.document_id} — not yet indexed")

Batch Upload

Process multiple files concurrently by calling the async variant of the upload method and awaiting all calls together with asyncio.gather:

import asyncio
from guidedmind import Client
 
FILES = [
    "docs/api-reference.pdf",
    "docs/user-guide.pdf",
    "docs/changelog.md",
    "docs/faq.txt",
]
 
async def ingest_all(files: list[str]):
    async with Client() as client:
        tasks = [
            client.documents.aupload_and_process(file_path=f)
            for f in files
        ]
        results = await asyncio.gather(*tasks)
 
    for r in results:
        print(f"  {r.document_id}  {r.chunks_created} chunks  [{r.status}]")
 
asyncio.run(ingest_all(FILES))
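For large batches it may be worth capping the number of uploads in flight and keeping one failed file from discarding the other results. The sketch below is one way to do that with standard asyncio primitives. The helper name ingest_bounded, the max_concurrency parameter, and the upload callable are assumptions made for illustration and are not part of the SDK.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def ingest_bounded(
    files: list[str],
    upload: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 4,
) -> tuple[dict[str, Any], dict[str, BaseException]]:
    """Run upload(path) for every path, at most max_concurrency at a time.

    Hypothetical helper: `upload` is any coroutine function that takes a path.
    Returns (succeeded, failed), both keyed by file path.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: str) -> Any:
        # Wait for a free slot before starting this upload.
        async with semaphore:
            return await upload(path)

    # return_exceptions=True returns exceptions as values instead of raising,
    # so one failing file does not hide the results of the others.
    outcomes = await asyncio.gather(
        *(guarded(p) for p in files), return_exceptions=True
    )

    succeeded: dict[str, Any] = {}
    failed: dict[str, BaseException] = {}
    for path, outcome in zip(files, outcomes):
        if isinstance(outcome, BaseException):
            failed[path] = outcome
        else:
            succeeded[path] = outcome
    return succeeded, failed
```

With the SDK, the upload argument would presumably be something like lambda p: client.documents.aupload_and_process(file_path=p), passed from inside the async with Client() as client block shown above. A suitable value for max_concurrency depends on rate limits this page does not state.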

Supported File Types

| Format | Notes |
| --- | --- |
| .pdf | Text extraction including tables and structured content |
| .docx | Microsoft Word documents |
| .md | Markdown; headings preserved as metadata |
| .txt | Plain text |
| .csv | Each row treated as a separate chunk |

Maximum file size is 50 MB per upload. For larger files, split them before uploading or use the Data Sources connector to stream directly from S3 or GCS.
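A local pre-flight check can catch unsupported or oversized files before any network call is made. The following is a minimal sketch using only the standard library. The function name preflight and both constants are illustrative. It assumes that "50 MB" means 50 * 1024 * 1024 bytes, which this page does not confirm.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".md", ".txt", ".csv"}

# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes. The service may
# count decimal megabytes instead, so treat this value as approximate.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def preflight(path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> list[str]:
    """Return the problems likely to block an upload (empty list if none)."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: not a file"]

    problems: list[str] = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"{path}: unsupported extension {p.suffix!r}")

    size = p.stat().st_size
    if size > max_bytes:
        problems.append(f"{path}: {size} bytes exceeds the {max_bytes}-byte limit")
    return problems
```

Files that fail the size check can then be split or routed to the Data Sources connector rather than being sent to the upload methods.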

Methods

| Method | Returns | Description |
| --- | --- | --- |
| upload(file_path, metadata?) | UploadResponse | Upload without processing |
| upload_and_process(file_path, metadata?) | ProcessResponse | Upload, chunk, embed, and index |
| aupload(...) | Awaitable[UploadResponse] | Async upload |
| aupload_and_process(...) | Awaitable[ProcessResponse] | Async upload and process |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | str | (required) | Absolute or relative path to the file on disk. |
| metadata | dict \| None | None | Arbitrary key-value metadata attached to every chunk produced from this document. Useful for filtering or attribution. |
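Because metadata is attached to every chunk, a batch job may want to combine tags shared by the whole batch with tags derived from each file. The sketch below shows one possible way to build such a dictionary. The helper build_metadata and the keys source_file and format are hypothetical, and it keeps all values as strings because this page does not say whether other value types are accepted.

```python
from pathlib import Path


def build_metadata(path: str, shared: dict[str, str]) -> dict[str, str]:
    """Merge batch-wide tags with tags derived from the file path.

    Hypothetical helper: the key names below are illustrative, not SDK-defined.
    """
    p = Path(path)
    per_file = {
        "source_file": p.name,                    # e.g. "user-guide.pdf"
        "format": p.suffix.lstrip(".").lower(),   # e.g. "pdf"
    }
    # Per-file keys are applied last, so they win if a shared key collides.
    return {**shared, **per_file}
```

The result can be passed as the metadata argument of upload_and_process, for example metadata=build_metadata("docs/user-guide.pdf", {"team": "engineering"}).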