Capture from your phone — photos and voice notes

Open /capture on your phone, take a photo OR record a voice note, and Kodori files what you captured — no app install, no scan-to-email.

Updated 2026-04-27

/capture turns the phone in your pocket into a single-document scanner AND a dictation recorder. It works in any modern mobile browser — no Kodori app, no driver, no print-and-rescan workflow.

**Photos.**

1. Open https://kodori.ai/capture on your phone (or tap "Capture" in the sidebar / mobile rail).
2. Optionally type a short filename prefix, e.g. "Smith matter - courthouse", so the resulting documents are easy to find later.
3. Tap "Take photo." Your phone's native rear-facing camera opens; the first time, you'll see a permission prompt.
4. Snap one shot or many. Each appears as a thumbnail tile and starts uploading immediately, so by the time you capture the last shot, the earlier ones have usually already filed.
5. When every tile reads "filed", tap the dashboard link at the bottom. The agent proposes filing details (sensitivity, collection, keywords, doc type) within ~30 seconds, and the proposals appear on the dashboard.

Works on desktop too. The button becomes a file picker instead of a camera, but the rest of the flow is identical — useful when a paralegal would rather drag pre-existing scans here than into /upload.
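For readers curious how the tile behavior in the photo steps fits together, here is a minimal sketch of the status tracking. It is not Kodori's code: the `Tile` shape, the function names, and the "shot N" naming scheme are all assumptions made for illustration. Only the "filed" label and the idea that each shot starts uploading as soon as it is taken come from the steps above.

```typescript
// Hypothetical model of the capture tiles; names and naming scheme are illustrative.
type TileStatus = "uploading" | "filed" | "failed";

interface Tile {
  id: number;
  name: string;
  status: TileStatus;
}

// Build a display name from the optional filename prefix (assumed scheme).
function tileName(prefix: string, shotNumber: number): string {
  const trimmed = prefix.trim();
  const label = "shot " + shotNumber;
  return trimmed === "" ? label : trimmed + " " + label;
}

// A new shot becomes a tile and is marked "uploading" right away,
// without waiting for the rest of the session to finish.
function addShot(tiles: Tile[], prefix: string): Tile[] {
  const id = tiles.length + 1;
  const tile: Tile = { id: id, name: tileName(prefix, id), status: "uploading" };
  return tiles.concat([tile]);
}

// Record the outcome of one tile's upload, leaving the other tiles untouched.
function setStatus(tiles: Tile[], id: number, status: TileStatus): Tile[] {
  return tiles.map(function (t) {
    return t.id === id ? { id: t.id, name: t.name, status: status } : t;
  });
}

// The dashboard link is worth tapping once every tile reads "filed".
function allFiled(tiles: Tile[]): boolean {
  return tiles.length > 0 && tiles.every(function (t) { return t.status === "filed"; });
}
```

Because each tile carries its own status, one slow or failed upload does not hold up the others. In this model, retrying a failed tile would simply set it back to "uploading".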

**Voice notes (NEW).** Sitting next to the photo tile is a voice-note recorder.

1. Tap "Record voice note." Your browser shows the mic-permission prompt the first time.
2. Dictate, narrate a job site, or record a meeting recap. A live duration counter tracks recording time.
3. Tap Stop. Review the duration and tap "File + transcribe" to upload, or "Discard" to throw the recording away.
4. Once filed, the voice note shows up as a Kodori document. OpenAI Whisper transcribes it in the background; the transcript becomes the document's searchable text. Auto-classify produces a 3-sentence aiSummary, a sensitivity label, a collection suggestion, keyword tags, and a docType, the same pipeline as a typed memo or scanned PDF.
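The record, stop, review, and file-or-discard steps above can be pictured as a small state machine. The sketch below is an illustration under assumptions, not Kodori's implementation: the phase names, action names, and the `m:ss` counter format are invented here. In a real browser the audio itself would typically come from the standard `MediaRecorder` API, which this sketch leaves out so that the logic stays runnable anywhere.

```typescript
// Hypothetical recorder phases; only the user-facing steps come from the article.
type RecorderState =
  | { phase: "idle" }
  | { phase: "recording"; startedAtMs: number }
  | { phase: "review"; durationMs: number }
  | { phase: "filed"; durationMs: number };

type RecorderAction =
  | { type: "record"; nowMs: number }
  | { type: "stop"; nowMs: number }
  | { type: "file" }
  | { type: "discard" };

// Apply one user action; actions that make no sense in the current phase are ignored.
function step(state: RecorderState, action: RecorderAction): RecorderState {
  if (action.type === "record" && state.phase === "idle") {
    return { phase: "recording", startedAtMs: action.nowMs };
  }
  if (action.type === "stop" && state.phase === "recording") {
    return { phase: "review", durationMs: action.nowMs - state.startedAtMs };
  }
  if (action.type === "file" && state.phase === "review") {
    return { phase: "filed", durationMs: state.durationMs };
  }
  if (action.type === "discard" && state.phase === "review") {
    return { phase: "idle" };
  }
  return state;
}

// Format elapsed milliseconds as m:ss for the live duration counter (assumed format).
function formatDuration(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes + ":" + (seconds < 10 ? "0" + seconds : String(seconds));
}
```

Nothing is uploaded until the "file" action, which matches the review step: a discarded recording goes back to idle without ever leaving the device in this model.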

The audio file IS the document — same SHA-256 dedup, retention class, and audit-log coverage as any other upload. Whisper transcription runs at $0.006/min and counts against the same monthly extract quota that PDFs do (the "pdf.extract" cap covers all expensive AI extractions).
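To get a feel for the $0.006/min transcription rate, the arithmetic can be sketched as follows. This assumes cost is prorated by the second, which is an assumption of this example and not a statement about how Kodori bills or how the "pdf.extract" quota is counted. Amounts are kept as whole microdollars (millionths of a dollar) to avoid floating-point rounding surprises; the function names are hypothetical.

```typescript
// $0.006 per minute = 6000 microdollars per minute = 100 microdollars per second.
const MICRODOLLARS_PER_SECOND = 100;

// Estimated transcription cost for one voice note, in microdollars.
// Assumes per-second proration; partial seconds are rounded up.
function estimateTranscriptionCost(durationSeconds: number): number {
  return Math.ceil(durationSeconds) * MICRODOLLARS_PER_SECOND;
}

// Render microdollars as a dollar string with four decimal places.
function formatUsd(microdollars: number): string {
  return "$" + (microdollars / 1000000).toFixed(4);
}

// A 10-minute dictation: 600 s * 100 = 60000 microdollars.
const tenMinuteNote = estimateTranscriptionCost(600);
// A 30-minute meeting recap: 1800 s * 100 = 180000 microdollars.
const meetingRecap = estimateTranscriptionCost(1800);
```

Under these assumptions a 10-minute dictation comes to about six cents and a 30-minute meeting recap to about eighteen cents.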

**Install Kodori as an app on your phone (recommended).** When you're on /capture (or any Kodori page), open the Share menu in iOS Safari and tap "Add to Home Screen", or in Android Chrome use the menu's "Install app" option. Kodori then launches as a standalone window — no address bar, full-height capture viewport. Long-pressing the home-screen icon offers a "Capture" shortcut that deep-links straight into the camera flow, which is what you want when you're filing thirty exhibits a day at a courthouse.
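For the technically curious: standalone launch and home-screen shortcuts of this kind are typically declared in a web app manifest. The sketch below shows what such a manifest might contain, written as a TypeScript object that serializes to JSON. The field values are assumptions for illustration (Kodori's real manifest is not shown in this article); `display`, `start_url`, and `shortcuts` are standard manifest members, and `/capture` is the page described above.

```typescript
// Hypothetical manifest contents; values are illustrative, not Kodori's actual file.
interface ManifestShortcut {
  name: string;
  url: string;
}

interface WebAppManifest {
  name: string;
  start_url: string;
  display: "standalone" | "fullscreen" | "minimal-ui" | "browser";
  shortcuts: ManifestShortcut[];
}

const manifest: WebAppManifest = {
  name: "Kodori",
  start_url: "/", // assumed landing page
  display: "standalone", // launch without the browser's address bar
  shortcuts: [
    // Surfaced by the OS when the user long-presses the home-screen icon.
    { name: "Capture", url: "/capture" },
  ],
};

// The manifest is served to the browser as JSON.
const manifestJson: string = JSON.stringify(manifest, null, 2);
```

Support for individual manifest members varies by browser and operating system, so treat this as a picture of the mechanism, not a guarantee of identical behavior everywhere.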

Failure modes:

- **Camera permission denied:** iOS / Android show a tiny camera icon in the address bar; tap it to re-grant. On a borrowed device, retap "Take photo", since the OS may have remembered a previous denial.
- **Microphone permission denied:** same recipe. The voice-note tile shows an inline message with a Reset button when the OS denies the mic.
- **Upload fails on a single tile:** tap the red "Retry" button on the failed tile. The captures that already filed are safe.
- **No network:** v1 captures upload on-network only. Offline-first capture (hold shots until reconnect) is on the roadmap.
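As background on the microphone case: when a page requests the mic through the standard `getUserMedia` call, browsers report failures as errors with well-known names such as `NotAllowedError` (permission denied), `NotFoundError` (no microphone found), and `NotReadableError` (the device could not be read, for example because of a hardware or OS-level problem). A minimal sketch of turning those names into an inline message might look like this. The category names and message wording are invented for illustration and are not Kodori's actual UI text.

```typescript
// Illustrative mapping only; categories and wording are assumptions.
type MicFailure = "denied" | "no-device" | "unreadable" | "unknown";

// Classify the `name` of the error thrown by getUserMedia.
function classifyMicError(errorName: string): MicFailure {
  if (errorName === "NotAllowedError") return "denied";
  if (errorName === "NotFoundError") return "no-device";
  if (errorName === "NotReadableError") return "unreadable";
  return "unknown";
}

// Pick the inline message a voice-note tile could show for each failure.
function micErrorMessage(failure: MicFailure): string {
  if (failure === "denied") {
    return "Microphone access was denied. Tap Reset, then allow the mic when prompted.";
  }
  if (failure === "no-device") {
    return "No microphone was found on this device.";
  }
  if (failure === "unreadable") {
    return "The microphone could not be read. Close other apps using it and try again.";
  }
  return "Recording could not start. Please try again.";
}

// In a browser this would wrap the real call, for example:
//   navigator.mediaDevices.getUserMedia({ audio: true })
//     .catch((e) => showInlineMessage(micErrorMessage(classifyMicError(e.name))));
// where showInlineMessage is a hypothetical UI helper.
```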

A note on privacy: photos and voice notes go directly from your device to Kodori's object storage as private bytes; they don't pass through any third-party photo service or analytics provider. The only outside service that receives voice-note audio is OpenAI Whisper, for transcription (subject to OpenAI's enterprise data-handling commitments); the stored audio file itself stays in your tenant's R2 bucket. Sensitivity and collection auto-classification runs on the same private extraction pipeline as desktop uploads.