Python SDK

Document Management

Upload and process files for immediate RAG indexing via the SDK.

client.documents gives you programmatic control over document ingestion — useful for automated pipelines, CI/CD workflows, and batch processing scripts.

Upload and Process

The most common operation: upload a file and trigger immediate chunking, embedding, and indexing so it's searchable straight away.

from guidedmind import Client
 
client = Client()
 
response = client.documents.upload_and_process(
    file_path="docs/technical-spec.pdf",
    metadata={"team": "engineering", "version": "2.1", "env": "production"},
)
 
print(f"Document ID   : {response.document_id}")
print(f"Chunks created: {response.chunks_created}")
print(f"Status        : {response.status}")

Upload Without Processing

Use upload() when you want to stage a file and trigger processing separately (e.g. during a maintenance window).

upload = client.documents.upload(file_path="handbook.pdf")
print(f"Uploaded: {upload.document_id} — not yet indexed")

Batch Upload

Process multiple files concurrently with asyncio and the async client methods:

import asyncio
from guidedmind import Client
 
FILES = [
    "docs/api-reference.pdf",
    "docs/user-guide.pdf",
    "docs/changelog.md",
    "docs/faq.txt",
]
 
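async def ingest_all(files: list[str]):
    # One client, scoped to the async with block, is shared by every upload.
    async with Client() as client:
        # One awaitable per file; gather() runs them concurrently.
        tasks = [
            client.documents.aupload_and_process(file_path=f)
            for f in files
        ]
        # Results come back in the same order as `files`. By default,
        # gather() raises the first exception if any upload fails.
        results = await asyncio.gather(*tasks)

    for r in results:
        print(f"  {r.document_id}  {r.chunks_created} chunks  [{r.status}]")

asyncio.run(ingest_all(FILES))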
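
For larger batches it can help to cap how many uploads run at once and to collect failures per file instead of stopping at the first error. The sketch below is one possible way to do that with the standard library only. The helper name `ingest_bounded`, its `upload` parameter, and the default limit of 4 are hypothetical and not part of the SDK; the helper accepts any async callable that takes a path.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def ingest_bounded(
    files: list[str],
    upload: Callable[[str], Awaitable[Any]],  # hypothetical: any async callable taking a path
    max_concurrency: int = 4,                 # assumed default, tune for your workload
) -> tuple[dict[str, Any], dict[str, BaseException]]:
    """Run `upload` on every path, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: str) -> Any:
        # Wait for a free slot before starting this upload.
        async with semaphore:
            return await upload(path)

    # return_exceptions=True hands failures back as values instead of
    # raising the first one, so every file gets an outcome.
    outcomes = await asyncio.gather(
        *(guarded(p) for p in files), return_exceptions=True
    )

    succeeded: dict[str, Any] = {}
    failed: dict[str, BaseException] = {}
    for path, outcome in zip(files, outcomes):
        if isinstance(outcome, BaseException):
            failed[path] = outcome
        else:
            succeeded[path] = outcome
    return succeeded, failed
```

Inside an `async with Client() as client:` block, this could be called as `await ingest_bounded(FILES, lambda p: client.documents.aupload_and_process(file_path=p))`. Results are keyed by path, so duplicate paths in the input would overwrite each other.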

Supported File Types

Format	Notes
`.pdf`	Text extraction including tables and structured content
`.docx`	Microsoft Word documents
`.md`	Markdown — headings preserved as metadata
`.txt`	Plain text
`.csv`	Each row treated as a separate chunk

Maximum file size is 50 MB per upload. For larger files, split them before uploading or use the Data Sources connector to stream directly from S3 or GCS.
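
A local pre-flight check can catch unsupported formats and oversized files before any network call is made. The following is a minimal sketch, not part of the SDK: the function name `preflight` is hypothetical, the suffix set is copied from the table above, and the size limit is interpreted as 50,000,000 bytes, which is the conservative reading of "50 MB". Case-insensitive extension matching is also an assumption.

```python
from pathlib import Path

# Copied from the supported-formats table above.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".md", ".txt", ".csv"}
# Assumption: "50 MB" read as decimal megabytes (the stricter interpretation).
MAX_UPLOAD_BYTES = 50 * 1000 * 1000


def preflight(file_path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    path = Path(file_path)
    if not path.is_file():
        return [f"{file_path}: not a file"]

    problems: list[str] = []
    # Assumption: extensions are matched case-insensitively.
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"{file_path}: unsupported type {path.suffix!r}")
    size = path.stat().st_size
    if size > max_bytes:
        problems.append(f"{file_path}: {size} bytes exceeds limit of {max_bytes}")
    return problems
```

Running `preflight()` over a batch and uploading only the paths that return an empty list keeps a single bad file from failing a whole ingestion run.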

Methods

Method	Returns	Description
`upload(file_path, metadata?)`	`UploadResponse`	Upload without processing
`upload_and_process(file_path, metadata?)`	`ProcessResponse`	Upload, chunk, embed, and index
`aupload(...)`	`Awaitable[UploadResponse]`	Async upload
`aupload_and_process(...)`	`Awaitable[ProcessResponse]`	Async upload and process

Parameters

Parameter	Type	Default	Description
`file_path` (required)	`str`	—	Absolute or relative path to the file on disk.
`metadata`	`dict | None`	`None`	Arbitrary key-value metadata attached to every chunk produced from this document. Useful for filtering or attribution.
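
Because metadata is attached to every chunk, it is a convenient place to record where a chunk came from. The sketch below shows one way to combine shared tags with per-file fields before passing the result as `metadata`. The helper `build_metadata` and the keys `source_file` and `sha256` are illustrative choices, not names the SDK defines, and string-only values are an assumption based on the example earlier on this page.

```python
import hashlib
from pathlib import Path
from typing import Optional


def build_metadata(
    file_path: str, base: Optional[dict[str, str]] = None
) -> dict[str, str]:
    """Merge shared tags with per-file attribution fields (hypothetical keys)."""
    path = Path(file_path)
    # A content hash lets you tell later which version of a file a chunk came from.
    digest = hashlib.sha256(path.read_bytes()).hexdigest()

    metadata = dict(base or {})  # copy so the caller's dict is not mutated
    metadata.update({"source_file": path.name, "sha256": digest})
    return metadata
```

For example, `build_metadata("docs/faq.txt", {"team": "engineering"})` returns the shared `team` tag plus the file name and hash, ready to pass to `upload_and_process(file_path=..., metadata=...)`.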
