
Upload and process files for immediate RAG indexing via the SDK.
client.documents gives you programmatic control over document ingestion — useful for
automated pipelines, CI/CD workflows, and batch processing scripts.
The most common operation: upload a file and trigger immediate chunking, embedding, and indexing so it's searchable straight away.

    from guidedmind import Client

    client = Client()

    # Upload the file and trigger chunking, embedding, and indexing in one call.
    response = client.documents.upload_and_process(
        file_path="docs/technical-spec.pdf",
        metadata={"team": "engineering", "version": "2.1", "env": "production"},
    )

    print(f"Document ID : {response.document_id}")
    print(f"Chunks created: {response.chunks_created}")
    print(f"Status : {response.status}")

Use upload() when you want to stage a file and trigger processing separately
(e.g. during a maintenance window).

    # Stage the file only; it is not searchable until it has been processed.
    upload = client.documents.upload(file_path="handbook.pdf")
    print(f"Uploaded: {upload.document_id} (not yet indexed)")

Process multiple files concurrently with the async methods:

    import asyncio

    from guidedmind import Client

    FILES = [
        "docs/api-reference.pdf",
        "docs/user-guide.pdf",
        "docs/changelog.md",
        "docs/faq.txt",
    ]


    async def ingest_all(files: list[str]):
        async with Client() as client:
            # Create one upload-and-process coroutine per file, then await them together.
            tasks = [
                client.documents.aupload_and_process(file_path=f)
                for f in files
            ]
            results = await asyncio.gather(*tasks)
            for r in results:
                print(f" {r.document_id} {r.chunks_created} chunks [{r.status}]")


    asyncio.run(ingest_all(FILES))

Supported file formats:

| Format | Notes |
|---|---|
| .pdf | Text extraction including tables and structured content |
| .docx | Microsoft Word documents |
| .md | Markdown; headings preserved as metadata |
| .txt | Plain text |
| .csv | Each row treated as a separate chunk |

Maximum file size is 50 MB per upload. For larger files, split them before uploading or use the Data Sources connector to stream directly from S3 or GCS.
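
A local check against the format list and the size limit can catch unsuitable files before any request is made. The following is a minimal sketch using only the Python standard library. The names `check_uploadable`, `SUPPORTED_EXTENSIONS`, and `MAX_UPLOAD_BYTES` are hypothetical and not part of the SDK, and the sketch assumes that "50 MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Values copied from the format table and size limit above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".md", ".txt", ".csv"}
# Assumption: 50 MB is interpreted as 50 MiB; adjust if the service counts 50 * 10**6.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def check_uploadable(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: not a file"]

    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"{path}: unsupported format {p.suffix!r}")

    size = p.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"{path}: {size} bytes exceeds the 50 MB limit")
    return problems
```

A batch script could call `check_uploadable()` on each path and pass only the files with no reported problems to `upload_and_process()`.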

Method reference:

| Method | Returns | Description |
|---|---|---|
| upload(file_path, metadata?) | UploadResponse | Upload without processing |
| upload_and_process(file_path, metadata?) | ProcessResponse | Upload, chunk, embed, and index |
| aupload(...) | Awaitable[UploadResponse] | Async upload |
| aupload_and_process(...) | Awaitable[ProcessResponse] | Async upload and process |

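
Because the async variants return awaitables, a large batch started with `asyncio.gather` launches every upload at once. One possible way to cap the number of uploads in flight is an `asyncio.Semaphore`, sketched below. The helper `gather_bounded` is hypothetical, and `fake_upload_and_process` is a stand-in for `client.documents.aupload_and_process` so the example runs without the SDK. How the real SDK reports failures is not specified here, so the `ValueError` is purely illustrative.

```python
import asyncio


async def gather_bounded(coros, limit: int = 4):
    """Await the given coroutines with at most `limit` running at once.

    Results come back in input order. Failures are returned as exception
    objects instead of being raised, so one bad file does not abort the batch.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros), return_exceptions=True)


# Stand-in for client.documents.aupload_and_process (assumption, not the SDK).
async def fake_upload_and_process(file_path: str) -> str:
    await asyncio.sleep(0.01)
    if file_path.endswith(".bad"):
        raise ValueError(f"cannot process {file_path}")
    return f"processed {file_path}"


async def main():
    files = ["a.pdf", "b.md", "c.bad"]
    results = await gather_bounded(
        [fake_upload_and_process(f) for f in files], limit=2
    )
    for path, result in zip(files, results):
        status = "failed" if isinstance(result, Exception) else "ok"
        print(f"{path}: {status}")
    return results


if __name__ == "__main__":
    asyncio.run(main())
```

Returning exceptions rather than raising them is a design choice for batch jobs: the script can log the failed paths and retry them later while the successful uploads stand.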

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| file_path (required) | str | n/a | Absolute or relative path to the file on disk. |
| metadata | dict \| None | None | Arbitrary key-value metadata attached to every chunk produced from this document. Useful for filtering or attribution. |
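
Since the metadata is attached to every chunk, a batch script may want to build it the same way for each file. The sketch below shows one possible approach. `build_metadata` and the derived keys `source_file` and `format` are hypothetical, not SDK conventions, and it assumes plain string values are acceptable, as in the upload example earlier.

```python
from pathlib import Path


def build_metadata(file_path: str, **extra: str) -> dict[str, str]:
    """Derive a small metadata dict from the path, then merge caller-supplied keys."""
    p = Path(file_path)
    metadata = {
        "source_file": p.name,                   # hypothetical key
        "format": p.suffix.lstrip(".").lower(),  # hypothetical key
    }
    metadata.update(extra)  # explicitly passed keys override derived ones
    return metadata


meta = build_metadata("docs/technical-spec.pdf", team="engineering", version="2.1")
print(meta)
# {'source_file': 'technical-spec.pdf', 'format': 'pdf', 'team': 'engineering', 'version': '2.1'}
```

The resulting dict can be passed as the `metadata` argument of `upload()` or `upload_and_process()`.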