Bulk ingest via the REST API

For programmatic ingest — watch-folder daemons, ETL jobs, scanners, anything server-to-server — skip the multi-step presigned-PUT dance and POST the bytes directly:

    POST /api/v1/documents
    Authorization: Bearer k_<prefix>_<secret>
    Content-Type: <mime type of bytes>
    X-Kodori-Display-Name: <user-facing name; required, max 512 chars>
    X-Kodori-Sensitivity: internal | confidential | restricted | regulated | public (optional; default internal)
    X-Kodori-Collection-Id: <uuid> (optional; pin to this collection on create)
    X-Kodori-Metadata: <JSON object> (optional; merged into document metadata)
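A client can check the documented header constraints before it sends anything. The sketch below assembles the request headers in Python and enforces the 512-character display-name limit and the listed sensitivity values. The helper name `build_ingest_headers` and its parameters are hypothetical, not part of any Kodori SDK, and the compact JSON encoding of the metadata header is a choice made here rather than a documented requirement.

```python
import json

# Sensitivity values listed in the header reference above.
ALLOWED_SENSITIVITY = {"internal", "confidential", "restricted", "regulated", "public"}
MAX_DISPLAY_NAME_CHARS = 512


def build_ingest_headers(api_key, mime_type, display_name,
                         sensitivity=None, collection_id=None, metadata=None):
    """Return the header dict for POST /api/v1/documents (hypothetical helper)."""
    if not display_name:
        raise ValueError("X-Kodori-Display-Name is required")
    if len(display_name) > MAX_DISPLAY_NAME_CHARS:
        raise ValueError("display name exceeds 512 characters")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": mime_type,
        "X-Kodori-Display-Name": display_name,
    }
    # Optional headers are only sent when the caller supplies a value,
    # so the server-side defaults (e.g. sensitivity "internal") apply otherwise.
    if sensitivity is not None:
        if sensitivity not in ALLOWED_SENSITIVITY:
            raise ValueError(f"unknown sensitivity: {sensitivity!r}")
        headers["X-Kodori-Sensitivity"] = sensitivity
    if collection_id is not None:
        headers["X-Kodori-Collection-Id"] = str(collection_id)
    if metadata is not None:
        if not isinstance(metadata, dict):
            raise ValueError("metadata must be a JSON object")
        # json.dumps escapes non-ASCII by default, keeping the header value ASCII.
        headers["X-Kodori-Metadata"] = json.dumps(metadata, separators=(",", ":"))
    return headers
```

Failing early on the client avoids spending a full upload on a request that the header rules would reject anyway.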

The endpoint hashes the body, deduplicates it against your content-addressable store (an identical blob you uploaded last week costs no new storage), creates the DocumentObject, and fires the same extraction + auto-classify pipeline that a UI upload triggers. On success it returns 201 with documentId, versionHash, sizeBytes, mimeType, displayName, and a deduped boolean.
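To make the dedup behaviour concrete, here is a toy in-memory content-addressable store. It only illustrates the general technique and is not Kodori's implementation: the class name `ContentStore` is hypothetical, and SHA-256 is an assumption, since the article does not name the hash function.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store: blobs are keyed by their hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes):
        """Store data and return (version_hash, deduped)."""
        # Assumption: SHA-256. The real service may use a different hash.
        version_hash = hashlib.sha256(data).hexdigest()
        deduped = version_hash in self._blobs
        if not deduped:
            # Only previously unseen content consumes new storage.
            self._blobs[version_hash] = data
        return version_hash, deduped

    def stored_bytes(self):
        """Total bytes actually held, counting each distinct blob once."""
        return sum(len(blob) for blob in self._blobs.values())
```

Uploading the same bytes twice yields the same hash both times, the second call reports `deduped` as true, and `stored_bytes()` does not grow.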

Example:

    curl -X POST https://kodori.ai/api/v1/documents \
      -H "Authorization: Bearer $KODORI_KEY" \
      -H "Content-Type: application/pdf" \
      -H "X-Kodori-Display-Name: 2024-Q3 BigCo Master Service Agreement.pdf" \
      -H "X-Kodori-Sensitivity: confidential" \
      -H "X-Kodori-Metadata: {\"keywords\":[\"BigCo\",\"MSA\"],\"docType\":\"contract\"}" \
      --data-binary "@./msa.pdf"
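For a daemon or ETL job, the same request can be made from Python's standard library. This is a minimal sketch that mirrors the curl command above. The function names are hypothetical, and it assumes the 201 response body is JSON, which the article implies but does not state.

```python
import json
import urllib.request

API_URL = "https://kodori.ai/api/v1/documents"


def build_upload_request(api_key, body, mime_type, display_name,
                         sensitivity=None, metadata=None):
    """Build (but do not send) the POST request for one document."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": mime_type,
        "X-Kodori-Display-Name": display_name,
    }
    if sensitivity is not None:
        headers["X-Kodori-Sensitivity"] = sensitivity
    if metadata is not None:
        headers["X-Kodori-Metadata"] = json.dumps(metadata)
    # The raw file bytes are the request body, as with curl --data-binary.
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def upload(request):
    """Send the request and decode the response body (assumed to be JSON).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as 413.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller would read the file in binary mode, pass the bytes to `build_upload_request` along with the key from the `KODORI_KEY` environment variable, and hand the result to `upload`. Keeping request construction separate from sending makes the headers easy to inspect in tests without touching the network.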

Hard cap: 50 MB per request. Larger files need the multi-step presigned-PUT flow (the one browser uploads use); the route returns 413 (Payload Too Large) if the body exceeds the cap. Most accounting / legal / AEC documents fit comfortably.
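An ingest job can check the file size up front and route oversized files to the presigned-PUT flow instead of waiting for a 413. The sketch below is hypothetical client code. It assumes the cap can be treated as 50,000,000 bytes, the conservative reading, since the article does not say whether "50 MB" is decimal or binary, and it assumes a body exactly at the cap is accepted.

```python
import os

# Assumption: treat the cap as decimal megabytes (50,000,000 bytes).
# If the real cap is 50 MiB, this threshold is merely slightly conservative.
DIRECT_POST_LIMIT_BYTES = 50_000_000


def choose_upload_flow(size_bytes, limit=DIRECT_POST_LIMIT_BYTES):
    """Return "direct" for a single POST, or "presigned" for the multi-step flow."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    return "direct" if size_bytes <= limit else "presigned"


def choose_flow_for_file(path):
    """Pick the flow for a file on disk based on its size."""
    return choose_upload_flow(os.path.getsize(path))
```

The server remains the authority on the limit, so a client should still handle a 413 response even after this check.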

Required scope: documents:write. The endpoint never bypasses the standard pipeline — same content-hash identity, same audit-event emission ("document.created"), same extraction + embedding + auto-classify cascade. Webhook subscribers see new uploads exactly as they would for UI ingest.
