Changelog · 254 releases
Every release.
What shipped, what improved, what’s on deck. We update this page in lockstep with the marketing surface and the help center — if it’s here, it’s real.
v0.7.120 Scheduled deletion paired with manual delete in History tab (D336)
- Improved
Both deletion paths now live on the History tab
D335 had Scheduled deletion under Holds and Danger zone (manual delete) under History — splitting two paths of the same lifecycle-end action. Now both sit together on the History tab. Operators thinking "I want to get rid of this doc" find both options in one place, with the audit log right above so they see the full event history before pulling the trigger. Holds reverts to being purely "what restricts this doc" — the opposite posture from triggering its end.
v0.7.119 Tab regrouping on /doc/[id] right pane — Notes promoted to its own tab; sections grouped by mental model (D335)
- Shipped
New "Notes" tab — annotation threads get their own slot
Annotations / notes are conversational and high-engagement (operators read + write often), so promoting them to a top-level tab makes them faster to reach than mixing them into Overview. Tab strip order is now: Overview / Content / Notes / Access / Versions / Holds / History — Notes positioned 3rd because notes annotate the content. URL state already handles deep-links: /doc/[id]?tab=notes works the same way as every other tab.
- Improved
Sections regrouped to match operator mental model
After D334's 6-tab cut left some sections under tabs that didn't match their semantics, swept the page: Citations + Drawings moved from History → Content (they're extracted FROM the content); Collections moved from Versions → Overview ("where does this doc live" is identity info); Template marker moved from History → Overview (lifecycle role marker, belongs with the identity card); Scheduled deletion moved from History → Holds (future retention action, pairs with active legal holds as lifecycle gates). Danger zone (delete) stays in History — irreversible action, lives with the audit trail.
v0.7.118 Right pane on /doc/[id] is now real tabs (Overview / Content / Access / Versions / Holds / History) — D332 anchor nav was still too long (D334)
- Shipped
Real tabs on the doc detail right pane — non-active sections hidden
Click a tab, see only that section. Other sections aren't there until you click their tab. Tab state lives in the URL (`?tab=holds` etc.) so you can deep-link straight to the Holds tab on a doc, share that link with a teammate, and browser back/forward works the way you'd expect. The "is one click away" affordance is the tab strip itself — operators don't have to remember a section exists, the strip enumerates them.
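A minimal sketch of how URL-driven tab state like this can be resolved, assuming an allowlist of tab ids and a fallback to Overview. The names `DOC_TABS`, `resolveDocTab`, and `docTabHref` are hypothetical and not taken from the Kodori codebase; only the `?tab=` parameter and the six tab names come from the entry above.

```typescript
// Hypothetical tab ids, mirroring the tab strip named in this release.
const DOC_TABS = ["overview", "content", "access", "versions", "holds", "history"] as const;
type DocTab = (typeof DOC_TABS)[number];

// Resolve the active tab from a query string.
// Assumption: unknown or missing values fall back to "overview".
function resolveDocTab(search: string, fallback: DocTab = "overview"): DocTab {
  const raw = new URLSearchParams(search).get("tab");
  if (raw === null) return fallback;
  const match = DOC_TABS.find((t) => t === raw);
  return match ?? fallback;
}

// Build a shareable deep link for a given doc and tab.
function docTabHref(docId: string, tab: DocTab): string {
  return `/doc/${encodeURIComponent(docId)}?tab=${tab}`;
}
```

Because the tab lives in the query string rather than component state, browser back/forward and shared links resolve to the same view without extra bookkeeping.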
- Improved
Pivoted from D332's sticky anchor nav after Sam tried it live
D332 had argued AGAINST tabs on the basis that hiding action-bearing sections risks operators missing them. After living with the anchor-nav for a session, the page was still "forever long" — the trade flipped: shortening the right pane wins, and the always-visible tab strip mitigates the original risk. The anchor-jump component is gone; section IDs preserved as DOM anchors for any future deep-link work that wants them.
v0.7.117 Bulk-apply retention class to a collection — with optional content-type filter (D333, closes Roy task #166)
- Shipped
Apply retention class to every doc in a collection
New admin section on /collections/[id]: pick a retention class, optionally narrow to specific content types (PDFs only, Word docs only, etc.), click Apply. Backfills the chosen class onto every readable member of the collection — overrides any existing per-doc class. Each affected document emits its own audit event (one event per doc, hash-chained as usual), and every override is reversible from the document's history. Documents on legal hold are unaffected.
- Improved
Bulk MCP tools now accept an optional MIME filter on collection sources
bulkAddDocumentsToCollection / bulkSetDocumentRetentionClass / bulkSetDocumentSensitivity all accept `source.mimeTypeIn[]` when `source.kind === "collection"`. Lets the agent (or operators in the new UI) say "every PDF in this matter" without having to assemble a saved search first. Five new eval fixtures pin the schema (208 tests pass).
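As an illustration of the filter semantics, here is a sketch of how an optional `mimeTypeIn` list could narrow a collection's members. Only `source.kind === "collection"` and `source.mimeTypeIn` come from the entry above; the `collectionId` field, the `DocRow` shape, the `selectMembers` helper, and the case-insensitive comparison are assumptions.

```typescript
// Hypothetical shapes for illustration only.
type CollectionSource = {
  kind: "collection";
  collectionId: string;
  mimeTypeIn?: string[];
};
type DocRow = { id: string; mimeType: string };

// Return the members the bulk operation would touch.
// Assumption: an absent or empty filter means "every member".
function selectMembers(source: CollectionSource, members: DocRow[]): DocRow[] {
  const filter = source.mimeTypeIn;
  if (!filter || filter.length === 0) return members;
  const allowed = new Set(filter.map((m) => m.toLowerCase()));
  return members.filter((d) => allowed.has(d.mimeType.toLowerCase()));
}
```

With `mimeTypeIn: ["application/pdf"]`, the "every PDF in this matter" request reduces to a single source object instead of a saved search.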
v0.7.116 Sticky section nav on /doc/[id] right pane — jump-to-section without hiding content (D332)
- Shipped
Six-jump anchor nav at the top of the right pane
Overview / Content / Access / Versions / Holds / History. Sticky during scroll; IntersectionObserver highlights the currently-visible section. Click-jumps land at the section heading (with scroll-mt offset so the sticky nav doesn't cover it). Pivoted from D311.2's original tab plan because real tabs hide content behind clicks — every section here has actions (forms, buttons), so hiding any of them risks operators missing things. Anchor nav keeps everything visible while still solving the "this page is too long to scan" pain.
v0.7.115 User groups — named buckets of users for batched permission grants (D331)
- Shipped
New /groups page — create + manage groups
Owner/admin only. Groups are tenant-scoped buckets of users; the most common use cases are "litigation team", "outside counsel", "AP clerks", or any other recurring access pattern. /groups index shows every live group with member counts; /groups/[id] is the detail page with rename + member-list + soft-delete affordances. Add members from a dropdown of tenant users not yet in the group.
- Shipped
Share-with-group on /doc/[id]
New form on the Access panel: "Share with group" dropdown next to the existing email-share form. Every member of the group inherits read access transitively via the canReadDocument resolver's new group-membership branch. Group grants render in their own row on the Access panel with chips + a single Revoke-all button, parallel to user grants.
- Shipped
Permission resolver honors group membership
canReadDocument now matches `principal_kind='user'+actor` OR `principal_kind='group'+caller's-group-ids` at every EXISTS subquery (doc-level deny, collection-level deny, doc-level allow, collection-level allow). Soft-deleted groups and cross-tenant groups are excluded. Deny-wins still applies — a user-level deny beats any group-level allow.
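The deny-wins rule with group membership can be sketched in memory as follows. This is not the real resolver, which runs as SQL EXISTS subqueries; the `Grant` shape and the `canRead` function are hypothetical, and doc-level and collection-level grants are flattened into one list for brevity. Soft-deleted and cross-tenant groups are assumed to be filtered out before `liveGroupIds` is built.

```typescript
type Grant = {
  principalKind: "user" | "group";
  principalId: string;
  effect: "allow" | "deny";
};

// In-memory stand-in for the resolver's four EXISTS checks.
function canRead(userId: string, liveGroupIds: string[], grants: Grant[]): boolean {
  const groups = new Set(liveGroupIds);
  const applies = (g: Grant): boolean =>
    (g.principalKind === "user" && g.principalId === userId) ||
    (g.principalKind === "group" && groups.has(g.principalId));

  const matching = grants.filter(applies);
  // Deny wins: any matching deny beats every allow, user-level or group-level.
  if (matching.some((g) => g.effect === "deny")) return false;
  return matching.some((g) => g.effect === "allow");
}
```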
- Shipped
8 new MCP tools, 5 new event types, 7 new eval tests
createGroup / renameGroup / deleteGroup / addGroupMember / removeGroupMember / listGroups / listGroupMembers / grantGroupPermission. All audit-logged. Eval count 196 → 203 — static-source-grep tests pin the resolver's architectural invariants so a future refactor can't accidentally drop the group-membership branch.
v0.7.114 DocAI provisioning loop closed — final root cause was a trailing newline on GOOGLE_DOCAI_PROJECT_ID (D330)
- Fixed
Stray `\n` on the project ID env var was masquerading as PERMISSION_DENIED
After D329's serviceusage IAM fix and the raster-PDF-wrap fix, all DocAI calls still failed with "Permission denied on resource project kodori-prod". Direct API test from local gcloud as the SA confirmed auth + perms were correct, so the issue had to be in production. A temporary diag endpoint at /api/admin/docai-diag (owner/admin only, no secret leakage) revealed that GOOGLE_DOCAI_PROJECT_ID was `kodori-prod\n` — a trailing newline that Vercel had captured when the value was pasted with Enter at the end. The SDK constructed API URLs as `projects/kodori-prod\n/locations/us/...` which Google rejected. After re-saving as exactly `kodori-prod`, DocAI calls succeeded.
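One way to guard against this class of bug is to normalize environment values before they reach an SDK. The sketch below is an assumption about how such a guard could look, not the fix that shipped (the shipped fix was re-saving the variable); `cleanEnvValue` and `processorParent` are hypothetical helpers.

```typescript
// Trim surrounding whitespace and reject values that still contain any.
function cleanEnvValue(name: string, raw: string | undefined): string {
  if (raw === undefined) throw new Error(`${name} is not set`);
  const value = raw.trim();
  if (value.length === 0) throw new Error(`${name} is empty`);
  if (/\s/.test(value)) throw new Error(`${name} contains embedded whitespace`);
  return value;
}

// Build the resource path prefix the way the entry above describes it.
function processorParent(projectId: string, location: string): string {
  return `projects/${projectId}/locations/${location}`;
}
```

For example, `processorParent(cleanEnvValue("GOOGLE_DOCAI_PROJECT_ID", "kodori-prod\n"), "us")` yields a path with no stray newline in the project segment.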
- Shipped
13,953 of 14,021 documents extracted (99.52%)
After the long DocAI provisioning arc, the corpus is fully extracted save for ~70 docs that fall into three buckets: (1) 23 multi-page TIFFs/JPEGs that DocAI silently stalls on, likely needing the batchProcess async API (left for a future shipment); (2) 16 old failures from 2026-04-25 in a different tenant, which "Re-run for all" doesn't reach from the workspace where the click happened; (3) ~26 genuinely unsupported MIMEs (.one, video, postscript) that operators can mark "Won't fix" via D327 to suppress them.
v0.7.113 DocAI serviceusage IAM fix + raster-convert always wraps as PDF (D329)
- Fixed
Permission denied on every DocAI call — `serviceusage.services.use` was missing
After D328 went live, every DocAI call still failed with `PERMISSION_DENIED: Permission denied on resource project kodori-prod`. Root cause: the project-level error is a `serviceusage.services.use` permission check, which neither `documentai.apiUser` nor `documentai.editor` includes. Added `roles/serviceusage.serviceUsageConsumer` to the kodori-docai SA via gcloud — no code change, applies in-flight.
- Fixed
raster-convert single-page output busted Claude's 5 MB image-block cap
The single-page raster path sent PNG via Claude's image content block (5 MB cap), but a downscaled 1568px PNG of a high-detail legal scan can land at 6+ MB. Fix: removed the single-page-as-PNG branch entirely. Everything now goes through the PDF-wrap path which uses Claude's document content block (32 MB cap — ~6× more headroom). Costs one extra encoding step but eliminates the 5 MB cliff.
v0.7.112 Size-aware extractor cascade + Google DocAI provisioned in production (D328)
- Shipped
Google Document AI now extracts PDFs + rasters at $1.50 per 1,000 pages
Operator-provisioned the kodori-prod Google Cloud project + Document AI processor + service account. Wired GOOGLE_DOCAI_PROJECT_ID + GOOGLE_DOCAI_PROCESSOR_ID + GOOGLE_APPLICATION_CREDENTIALS_JSON into Vercel. The extractor cascade now routes PDFs and rasters through DocAI first (cheap, accurate OCR) and falls back to Claude vision only when DocAI declines. Estimated 3-30× cost reduction on extraction depending on which Claude model was being compared against.
- Shipped
Size-aware extractor cascade (D328)
DocumentExtractor.supports() now accepts an optional sizeBytes parameter so an extractor can gracefully decline files outside its operational range. google-docai now declines files >20 MB (its sync API cap) so the cascade falls through to raster-convert, which downscales to 1568 px and routes through Claude vision. Without this, the 20 stuck >20 MB TIFFs in Roy's queue had no fallback path — DocAI threw, no other extractor was tried, doc terminally failed.
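A sketch of the size-aware cascade idea: each extractor may decline based on size, and the first one that accepts wins. The `Extractor` interface, `pickExtractor`, and the MIME checks are hypothetical; the 20 MB DocAI limit and the 30 MB raster-convert cap come from this release, and `MB = 1024 * 1024` is an assumption about the unit.

```typescript
interface Extractor {
  name: string;
  // sizeBytes is optional so callers that do not know the size still work.
  supports(mimeType: string, sizeBytes?: number): boolean;
}

const MB = 1024 * 1024;

const googleDocai: Extractor = {
  name: "google-docai",
  supports: (mime, size) =>
    (mime === "application/pdf" || mime.startsWith("image/")) &&
    (size === undefined || size <= 20 * MB), // sync API cap
};

const rasterConvert: Extractor = {
  name: "raster-convert",
  supports: (mime, size) =>
    mime.startsWith("image/") && (size === undefined || size <= 30 * MB),
};

// First extractor that accepts the file wins; undefined means no route.
function pickExtractor(cascade: Extractor[], mime: string, size?: number): Extractor | undefined {
  return cascade.find((e) => e.supports(mime, size));
}
```

Under these assumptions a 26 MB TIFF skips `google-docai` and lands on `raster-convert` instead of failing terminally.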
- Improved
raster-convert input cap bumped from 10 MB → 30 MB
D325 dropped the cap to 10 MB to bound memory before the limitInputPixels=64MP guard was in place. Now that the pixel-count guard caps decoded buffer at ~256 MB regardless of input file size, we can accept the typical 26 MB legal-scan TIFF without OOM. Combined with concurrency=3 per tenant + 1568 px output cap (D324), peak memory stays under 1 GB Lambda.
v0.7.111 Acknowledge-failed-extraction + better error visibility (D327)
- Shipped
"Won't fix" button on /extraction-issues — skip known-bad docs on future re-runs
For docs that fail extraction for a stable reason (content-policy refusal, corrupt source, format we don't plan to support), re-running burns API cost with no chance of success. New per-row "Won't fix" button on the Unsupported and Failed tabs flags the doc as operator-acknowledged; the "Re-run for all" workflow now excludes dismissed rows. Owner / admin only. Auto-fills the dismissal reason from the existing error message so the audit log captures why.
- Shipped
New "Dismissed" tab
Fourth tab in the /extraction-issues nav. Lists every doc you've marked Won't-fix with the reason. Per-row "Re-enable" button puts the doc back into the re-run set (e.g. when a new extractor lands that handles the format).
- Improved
Full error messages on hover
The Detail column was truncated at 80 chars with no way to see the rest. Hovering now shows the full error via the title attribute (and the truncated cell shows a "…" so it's obvious there's more).
- Improved
Tab blurbs explain re-run cost
Re-running an Unsupported doc is FREE — it short-circuits at the mark-unsupported step before any extractor call. Re-running a Failed doc CAN cost money since the extractor (Claude vision, etc.) gets re-invoked. The Failed-tab blurb now points at "Won't fix" as the cost-control affordance for stable failures.
v0.7.110 Real fix for stuck extractions — collapse fetch-blob + run-extractor into one Inngest step (D326)
- Fixed
Inngest 4 MB step output cap was silently killing every blob extraction
D323-D325 defended against an OOM that wasn't the actual bottleneck. The real problem: the `fetch-blob` step returned `Array.from(bytes)` so the next step could re-instantiate the Uint8Array. Inngest JSON-serializes step outputs and caps them at 4 MB; a 3 MB blob → `[123,45,67,...]` → ~10-20 MB of text → silently truncated or rejected. The workflow stalled between fetch-blob and run-extractor, every doc landed at status='running' forever (no success event, no fail event, no cost event — just stuck). Diag query against live DB confirmed: 0 cost events in 30 min, 0 succeeded events in 6 hours, only 8 failed. Fix: collapse fetch-blob + run-extractor into one `fetch-and-extract` step. Bytes never leave function memory; trade-off (re-fetch on retry) is cheap (R2 GET) vs the alternative (every blob >2 MB silently kills the workflow).
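The size blow-up is easy to reproduce in isolation. The sketch below measures how large a byte buffer becomes once it is turned into a JSON number array, and compares that with a 4 MB cap. `jsonSizeOfByteArray`, `fitsInStepOutput`, and the exact cap value in bytes are illustrative assumptions, not Inngest APIs.

```typescript
// Length of the JSON text produced by serializing bytes as a number array.
function jsonSizeOfByteArray(bytes: Uint8Array): number {
  return JSON.stringify(Array.from(bytes)).length;
}

// Assumption: the 4 MB cap is treated as 4 * 1024 * 1024 characters of JSON.
const STEP_OUTPUT_CAP = 4 * 1024 * 1024;

function fitsInStepOutput(bytes: Uint8Array): boolean {
  return jsonSizeOfByteArray(bytes) <= STEP_OUTPUT_CAP;
}
```

Each byte costs between two and four characters (digits plus a comma), so the serialized form is roughly 2x to 4x the binary size. Keeping the bytes inside one step avoids the serialization entirely.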
v0.7.109 Make raster extraction actually fit in 1 GB Lambda memory (D325)
- Fixed
Per-tenant extraction concurrency dropped from 10 → 3
A diag query against the stuck queue revealed 1358 of 1407 stuck docs were image/tiff (avg 3 MB, max 26 MB). Sharp has to decode the FULL source TIFF into memory BEFORE applying the resize: 10 parallel extractions × ~300 MB peak each = ~3 GB aggregate, which blew past Vercel's 1 GB Lambda memory cap. Three parallel extractions × ~300 MB = ~900 MB, with headroom for Lambda overhead.
- Fixed
Raster MAX_INPUT_BYTES dropped from 20 MB → 10 MB
Files larger than 10 MB now land at status='unsupported' with a clear "configure Azure or Google DocAI for larger raster files" error instead of OOM-killing the Lambda worker. The 26 MB TIFF in the stuck queue routes here cleanly.
- Fixed
Sharp memory tuning — cache(false), concurrency(1), 64 MP input cap
Disabled libvips's internal 50 MB page cache (pure overhead for one-shot decode) and forced single-threaded decode so a multi-page TIFF doesn't spawn N parallel decoders each with their own pixel buffer. limitInputPixels = 64 MP (8000×8000) refuses anything larger BEFORE any pixel buffer allocation — catches a maliciously- or accidentally-huge TIFF marker that would otherwise allocate >3 GB and instantly OOM.
v0.7.108 Audit-page join cast fix + tighter raster cap + vercel.json at monorepo root (D324)
- Fixed
/audit page no longer crashes on non-UUID actor IDs
The leftJoin to users used `eq(users.id, sql`${actorId}::uuid`)`, but events.actorId is a text column that holds non-UUID actors — `system:dlp-scanner` for system events, `apikey:abc12345` for agents. Postgres errored with "invalid input syntax for type uuid" the moment any non-UUID actorId was in the result set. Same class of bug as D306. Fix: text-cast users.id instead, gated by actorKind = 'user'. Non-UUID actors get NULL email column (which is what we want).
- Fixed
Raster long-edge cap tightened from 2048 → 1568 px
2048 was still letting some multi-page legal scans OOM the Lambda. 1568 is Claude vision's documented sweet spot for OCR — anything larger costs tokens without accuracy gain. Pixel buffer drops from ~16 MB to ~9.6 MB per page worst case, which keeps memory budget under 1 GB even if the D323 vercel.json memory bump didn't take effect.
- Fixed
vercel.json moved to monorepo root with both path globs
D323 placed vercel.json at apps/web/vercel.json but the build log gave no signal that Vercel detected it. Now also at the monorepo root with both `apps/web/app/api/inngest/route.ts` and `app/api/inngest/route.ts` globs so the memory + maxDuration config works regardless of which rootDirectory Vercel resolved.
v0.7.107 Stop the extraction OOM bleed — raster downscale + Vercel memory bump + content-filter graceful failure (D323)
- Fixed
Raster-convert now downscales TIFF / HEIC to 2048px max long-edge before PNG encoding
Sharp was decoding each TIFF page into a full-resolution pixel buffer — an 8000×10000 4-channel image = ~320 MB per page; multi-page legal scans blew past 1 GB and the Vercel Lambda OOM-killed the extraction. Claude vision's documented sweet spot is ~1568px on the long edge, so going bigger cost tokens with no accuracy gain. Applied to both single-page and multi-page paths; decoded buffer now ~16 MB per page worst case.
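The memory figures above follow from simple arithmetic, sketched here without Sharp itself. `decodedBytes` and `fitLongEdge` are hypothetical helpers; the rounding behavior is an assumption and may differ from what an image library does internally.

```typescript
// Raw decoded buffer size: one byte per channel per pixel.
function decodedBytes(width: number, height: number, channels: number = 4): number {
  return width * height * channels;
}

// Scale dimensions down so the long edge is at most maxLongEdge, preserving aspect ratio.
function fitLongEdge(
  width: number,
  height: number,
  maxLongEdge: number,
): { width: number; height: number } {
  const longEdge = Math.max(width, height);
  if (longEdge <= maxLongEdge) return { width, height };
  const scale = maxLongEdge / longEdge;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

An 8000×10000 four-channel page is 320,000,000 bytes decoded; a page capped at a 2048 px long edge can never exceed 2048 × 2048 × 4 bytes, which is about 16 MB.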
- Fixed
Vercel function memory for /api/inngest bumped from 1024 MB → 3008 MB
The default 1024 MB was the OOM trigger before D323's downscale. Even with the smaller buffers, claude-pdf on multi-page PDFs and office-pptx on image-heavy decks need headroom. Configured via apps/web/vercel.json `functions` field; `export const maxDuration = 300` in the route file pins the budget against future Vercel default changes.
- Fixed
Anthropic "Output blocked by content filter" lands at status='failed' instead of looping forever
Anthropic refuses output for sensitive content (some criminal records, medical imagery, etc.) — retrying never succeeds because the same bytes hit the same filter. Now we catch AI_APICallError messages containing "Output blocked by content filter" or "content policy" and return a structured non-throwing terminal result (extractor='claude-pdf-blocked', structured.blocked.reason='anthropic-content-filter'). Inngest stops retrying, the doc lands at status='failed' with a clear reason the operator can act on (re-upload as text, redact and re-upload, or accept the gap).
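A sketch of the classification step, assuming the check is a substring match on the error message. The result fields (`extractor`, `structured.blocked.reason`) mirror the values named above; `toBlockedResult` is a hypothetical helper, the case-insensitive match is an assumption, and unlike the shipped code it does not check that the error is an AI_APICallError.

```typescript
type BlockedResult = {
  extractor: "claude-pdf-blocked";
  structured: { blocked: { reason: "anthropic-content-filter" } };
};

// Return a terminal, non-throwing result for content-filter refusals,
// or null so the caller can rethrow and let normal retries happen.
function toBlockedResult(err: unknown): BlockedResult | null {
  const message = err instanceof Error ? err.message : String(err);
  const lower = message.toLowerCase();
  const blocked =
    lower.includes("output blocked by content filter") || lower.includes("content policy");
  if (!blocked) return null;
  return {
    extractor: "claude-pdf-blocked",
    structured: { blocked: { reason: "anthropic-content-filter" } },
  };
}
```

Returning a value instead of throwing is what stops the retry loop: the workflow sees a completed step and can record the doc as failed with a reason.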
- Improved
Click "Re-run for all" after this lands
The OOM'd documents will retry through the smaller decoded buffers and complete; the content-filtered ones will resolve to failed immediately instead of staying in running.
v0.7.106 Filename-based MIME inference fixes .msg / .doc / .zip uploads + backfills existing rows (D322)
- Fixed
Browsers don't label .msg / .doc / .zip — Kodori now does
Browser File objects return empty `File.type` for formats without a registered MIME (.msg, .doc, .zip, .heic, .tiff). The dropzone fell back to `application/octet-stream`, which the extractor registry has no route for, so files landed at `status='unsupported'` even after D317 shipped the outlook-msg / office-doc / zip-archive extractors. New `resolveMimeType(browserMime, filename)` helper prefers the browser MIME but falls back to extension-based inference when the browser sends nothing useful. Wired into both upload paths (bulk dropzone + new-version) plus server-side as defense-in-depth.
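A possible shape for such a helper is sketched below. The signature `resolveMimeType(browserMime, filename)` comes from the entry above; the body, the extension table, and the choice to treat `application/octet-stream` as "nothing useful" are assumptions rather than the shipped implementation.

```typescript
// Extension-to-MIME table for the formats named in this release.
const EXTENSION_MIME = new Map<string, string>([
  ["msg", "application/vnd.ms-outlook"],
  ["doc", "application/msword"],
  ["zip", "application/zip"],
  ["heic", "image/heic"],
  ["heif", "image/heif"],
  ["tiff", "image/tiff"],
  ["tif", "image/tiff"],
]);

function resolveMimeType(browserMime: string | null | undefined, filename: string): string {
  const mime = (browserMime ?? "").trim().toLowerCase();
  // Prefer the browser's answer when it is specific.
  if (mime !== "" && mime !== "application/octet-stream") return mime;
  // Otherwise infer from the file extension.
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : "";
  return EXTENSION_MIME.get(ext) ?? "application/octet-stream";
}
```

Running the same function on the server means a client that skips the inference step still ends up with a routable MIME type.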
- Fixed
Backfill — existing octet-stream uploads corrected
Migration 0104 updates rows whose `display_name` ends in `.msg` / `.doc` / `.zip` / `.heic` / `.heif` / `.tiff` / `.tif` and whose `mime_type` is `application/octet-stream` to the right MIME. Idempotent — only matches rows that still have octet-stream. After this lands, click "Re-run for all" on the dashboard: the previously-unsupported docs will route to their D317 extractors and land at `status='succeeded'`.
v0.7.105 /doc/[id] Access panel grouped by principal — one row per user, actions as chips (D321)
- Improved
Five rows per user collapsed into one
Previously each (principal, action) explicit grant rendered as its own row, so a user with read + write + share + delete + change-permission allow showed up as five nearly-identical rows — each with a "Revoke all" button doing the same thing (the underlying revoke tool already revokes by principal, not per-action). Now the panel groups grants by principal: one row per user shows their email, an "Allow" line with green action chips, a "Deny" line with red chips (when present), and a single "Revoke all" button. Sorted by email for stable ordering.
- Improved
Inherited grants get the same treatment
Collection-derived access rows are now grouped by (principal, collection) so a user who has inherited grants from two different collections sees both lines distinguished, but the per-action duplication within each collection is collapsed into chips. The "via collection X" link still surfaces.
v0.7.104 InfoTooltip — click/hover ? icons explain domain terms in context (D320)
- Shipped
Reusable <InfoTooltip> component
Small ? icon next to a label. Hover-on-icon-only opens the tooltip (intentional — won't fire as you move around the page); click toggles for touch devices; Esc closes via blur. Accessible: <button> with aria-describedby + aria-expanded; keyboard-focusable. Default position is below + right-of-icon (grows leftward, won't clip at viewport right edge); align="left" swaps direction for icons near the page's left edge.
- Shipped
Dashboard metric cards explain themselves
Four StatCard tooltips wired: Documents (live count, tombstones excluded), Collections (six-kind taxonomy + link to /browse), Extraction (pipeline overview + per-status definitions + drill-in pointer), Events (hash-chained audit log + 7d window). Closes the "what does Extraction even mean?" question for new operators landing on the dashboard.
- Shipped
/browse collection-kind tooltips
Each kind section header in the /browse left pane (Cabinets / Drawers / Matters / Projects / Folders / Custom) gains a tooltip explaining what that kind is and how it nests. Operators who don't already understand the taxonomy can hover to learn it without leaving the page. align="left" so the tooltips grow rightward from the narrow left pane.
- Shipped
Glossary help article
New /help/glossary article — plain-English definitions of every domain term (Collections, Cabinets, Drawers, Matters, Projects, Sensitivity, Retention, Legal Hold, Tombstone, DLP, Anomaly, Extraction with per-status meaning, Bates, Hybrid search, Quota, Cap warning, AI budget, BYO key, Override, Event, Stream, Audit-chain verification, etc.). Same definitions the in-app tooltips show; the article is the longer-form reference for operators who want depth.
v0.7.103 Per-user default landing page after sign-in (D319)
- Shipped
Pick where you land — Dashboard, Browse, Search, Upload, or Extraction issues
New "Default landing page" section on /settings/account. Sign-in now redirects through `/home`, which reads each user's preference and bounces to their chosen page (or /dashboard when unset). Per-user — your team-mates pick what fits their day. The dropdown is constrained to a fixed allowlist (Dashboard / Browse / Search / Upload / Extraction issues) so a crafted form post can't smuggle an open-redirect. Existing /dashboard bookmarks still work; only the post-sign-in landing changes.
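The allowlist pattern can be sketched like this. The stored preference keys, the `/upload` path, and the `resolveLanding` helper are hypothetical; the other paths appear elsewhere in this changelog. The key point is that the redirect target is looked up from a fixed table and never taken from user input directly.

```typescript
// Fixed allowlist: preference key -> internal path.
const LANDING_PAGES = new Map<string, string>([
  ["dashboard", "/dashboard"],
  ["browse", "/browse"],
  ["search", "/search"],
  ["upload", "/upload"], // assumed path
  ["extraction-issues", "/extraction-issues"],
]);

// Unknown, empty, or crafted values fall back to /dashboard.
function resolveLanding(preference: string | null | undefined): string {
  if (!preference) return "/dashboard";
  return LANDING_PAGES.get(preference) ?? "/dashboard";
}
```

A crafted value such as `https://evil.example` is simply not a key in the table, so it resolves to `/dashboard`.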
v0.7.102 /browse polish — readable tree indentation + sub-collections shown in the center pane (D318)
- Improved
Tree indentation now reads like a tree (D318)
Per-depth indent bumped from 0.85rem to 1.5rem and a VS-Code-style vertical guide line renders for every ancestor level — children share a visible vertical rule with their siblings under the same parent. Leaf rows (no chevron) get a 1rem spacer that matches the chevron column so they align with their parent's link instead of creeping left. A drawer nested under a cabinet now actually looks nested.
- Shipped
Sub-collections shown in the center pane (Windows-Explorer behavior)
Selecting a parent collection now surfaces its children as a "Sub-collections" section above the document list. Two-column grid on sm+ breakpoints; each child renders as a bordered card showing the child name + kind label and links back into /browse with the child selected. Operators can drill in via the tree on the left OR by clicking the row in the center, matching Windows Explorer + every incumbent DMS. Empty-doc-list copy now distinguishes "no documents directly here, drill into a sub-collection" from the truly-empty case. Header line prefixes "N sub-collections ·" when any exist so the shape is visible at a glance.
v0.7.101 Three new extractors close common unsupported-format gaps — legacy Word .doc, Outlook .msg, ZIP archive (D317)
- Shipped
Legacy Word .doc extractor (D317.1)
`.doc` is the 1997-2003 Word binary format (Compound File / OLE container — magic bytes 0xD0CF). It is NOT the same format as `.docx` (OOXML zip — magic bytes 0x504B "PK"). mammoth handles `.docx` but not `.doc`; new `office-doc.ts` extractor closes that gap with the `word-extractor` package, which is pure-JS and runs cleanly in Inngest's serverless Node runtime (no Pandoc / LibreOffice install). Extracts main body + footnotes + endnotes + annotations; headers / footers / textboxes excluded as boilerplate that dilutes search relevance for legal contracts.
- Shipped
Outlook email .msg extractor (D317.2)
New `outlook-msg.ts` parses Outlook's saved-email Compound File container via `@kenjiuno/msgreader`. Output: a transcript-style text (Subject / From / To / Cc / Bcc / Date / Body) so the email reads like a memo when surfaced in /search snippets, plus a structured email-metadata block carrying typed sender/recipients + the attachment list. Common in legal discovery (clients drop a folder of saved emails) and in AEC RFI/submittal workflows where contracts route over email. Attachments are listed by name only — full auto-fanout (each attachment becomes its own document) shares the same primitive as .zip auto-fanout below.
- Shipped
ZIP archive manifest extractor (D317.3)
New `zip-archive.ts` uses JSZip to walk the archive and emit a text manifest listing every inner file (path + uncompressed bytes). The archive becomes searchable by inner file name — an operator who zipped 50 invoices into one archive can search "invoice-2024-04.pdf" and find the parent .zip. Refuses > 200 MB / > 5,000 entries to bound per-row memory. **Auto-fanout** (each inner file becomes its own document via the upload pipeline) is on the roadmap as a follow-on; today the manifest closes the unsupported-status gap.
- Improved
Extraction-issues page blurb updated for the new extractors
/extraction-issues "Unsupported" tab no longer lists .zip as a common cause (it's now supported). The blurb now surfaces the remaining gaps with per-format "how to fix" hints: .one (convert in OneNote → Print to PDF), .tar.gz / .7z / .rar (unzip locally), raw camera (.NEF / .CR2 — convert to JPEG), CAD (.dwg / .dxf — export to PDF), encrypted PDFs (remove password), video, and `application/octet-stream` (re-upload with correct extension).
- Improved
9 new eval-suite tests pin the new extractor wiring (eval count 187 → 196)
New `packages/evals/src/extractor-registry.test.ts` asserts: each of the three new MIMEs routes to the right extractor; .docx still routes to office-docx (not the new office-doc); the case-insensitive + charset-suffix MIME normalization works; an end-to-end manifest test on a synthetic JSZip archive verifies the structured output shape; the > 200 MB rejection fires cleanly. A future refactor that drops one of the new extractors fails loud here instead of silently surfacing as "still unsupported".
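The normalization and routing behavior these tests pin can be illustrated with a small sketch. The extractor names are the ones used in this release; the `ROUTES` table, `normalizeMime`, and `routeExtractor` are hypothetical stand-ins for the real registry.

```typescript
// Drop any parameter suffix (e.g. "; charset=binary") and lowercase the type.
function normalizeMime(raw: string): string {
  const semi = raw.indexOf(";");
  const base = semi >= 0 ? raw.slice(0, semi) : raw;
  return base.trim().toLowerCase();
}

// Hypothetical routing table covering the formats discussed above.
const ROUTES = new Map<string, string>([
  ["application/msword", "office-doc"],
  ["application/vnd.openxmlformats-officedocument.wordprocessingml.document", "office-docx"],
  ["application/vnd.ms-outlook", "outlook-msg"],
  ["application/zip", "zip-archive"],
]);

function routeExtractor(mime: string): string | undefined {
  return ROUTES.get(normalizeMime(mime));
}
```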
v0.7.100 /browse first-view discoverability — per-kind create affordances, parent picker, tree rendering, welcome empty state (D316)
- Shipped
/collections/new — accept ?kind= query + parent picker (D316.1)
createCollectionAction now reads parentId from FormData and threads it through to the MCP tool (which already supported the column). The page reads ?kind= and ?parent= searchParams so per-kind "+ New" affordances elsewhere can pre-fill the form. The client form gains a kind-aware parent dropdown — drawers see cabinets only; matters and projects can nest under cabinets; folders can nest anywhere; cabinets are top-level (picker hidden). Auto-clears the parent selection when a kind switch invalidates it. Inline "no cabinets exist yet" callout when a drawer-kinded form has no eligible parents.
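The nesting rules described above can be written down as a small lookup. This is a sketch of the rules as stated, not the form's actual code: `eligibleParentKinds` and `parentStillValid` are hypothetical, the "custom" kind is left out because the entry does not say how it nests, and "folders can nest anywhere" is read as "under any of the listed kinds".

```typescript
type Kind = "cabinet" | "drawer" | "matter" | "project" | "folder";

// Which kinds may appear in the parent dropdown for a given child kind.
function eligibleParentKinds(kind: Kind): Kind[] {
  switch (kind) {
    case "cabinet":
      return []; // top-level only, picker hidden
    case "drawer":
      return ["cabinet"];
    case "matter":
    case "project":
      return ["cabinet"];
    case "folder":
      return ["cabinet", "drawer", "matter", "project", "folder"];
  }
}

// After a kind switch, keep the selected parent only if it is still eligible.
function parentStillValid(newKind: Kind, parentKind: Kind | null): boolean {
  if (parentKind === null) return true;
  return eligibleParentKinds(newKind).includes(parentKind);
}
```

When `parentStillValid` returns false the form would clear the selection, matching the auto-clear behavior described above.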
- Shipped
/browse — per-kind create affordances + tree rendering (D316.2-3)
Each kind section header now carries a "+ new" link prefilled with /collections/new?kind=<kind>. A "+ New" button next to the "N collections" count opens the unfiltered form. Native <details>/<summary> elements render parent → child nesting with default-expanded state — a drawer with a cabinet parent appears nested under its cabinet rather than duplicated in the DRAWERS section. Empty kinds (no top-level collections) collapse out of the sidebar until you have one.
- Shipped
/browse — first-time welcome empty state (D316.4)
When the tenant has zero collections, the center pane swaps to a welcome card explaining all six kinds with a per-kind "Create <kind> →" CTA. Closes the "where do I even start?" question Sam hit on first view: two folders, no obvious path to create a cabinet or drawer.
v0.7.99 Layout redesign — 2-pane /doc/[id], /browse 3-pane explorer, sidebar quick-access, drill-in for stuck files (D311–D315)
- Fixed
Hotfix — /dashboard 500 from the v0.7.98 cost rollup
getTenantUsage used `sql`${kind} = ANY(${PAID_API_KINDS}::text[])`` — Drizzle parameterizes a JS array as a Postgres `record` type, not text[]. The runtime cast threw "cannot cast type record to text[]" on every /dashboard render and /api/inngest call. Fix: switched to Drizzle's `inArray(column, values)`, which knows how to format the list correctly. Inline comment now warns future contributors against the trap. Closed end-to-end by D315 below.
- Shipped
/doc/[id] 2-pane layout — preview sticky left, info flows right (D311.1)
Inspired by FileHold's Windows-Explorer-style layout where metadata sits next to the file, not below it. The /doc/[id] body grew from a max-w-3xl single-column scroll into an lg:grid-cols-12 two-pane layout: preview sticks at lg:col-span-7 with `lg:sticky lg:top-6 lg:self-start` so it stays visible while scrolling the right pane (col-span-5) full of access, metadata, retention, audit, and links. Below lg breakpoint it falls back to single-column so mobile + tablet still work. Tabbed refactor of the right pane (15+ sections collapsed into 6 tabs) deferred until we see how operators use the 2-pane in the wild.
- Shipped
Unsupported / failed / stuck drill-in — clickable extraction hint (D312)
The dashboard's "X stuck files" hint was a static number; you couldn't click through to see which files. New /extraction-issues page with three filter tabs (Unsupported / Failed / Stuck) showing every doc in the current state, permission-trimmed via `canReadDocument` + `aclSql`. Clickable rows → /doc/[id]. The dashboard counts now link to the matching tab. The earlier "what is unsupported?" question is now self-service.
- Shipped
Sidebar quick-access — Recent uploads + Saved searches (D313)
New "Recent uploads" NavGroup at the top of the (app) sidebar shows your top 8 docs (createdBy=me, ordered by createdAt desc) so you can jump back to whatever you just uploaded without /search. New "Saved searches" NavGroup shows your top 5 saved searches. Both lazy-load on layout render and don't block the rest of the nav. New /browse link added to the main nav.
- Shipped
/browse 3-pane explorer view — collections tree, doc list, metadata pane (D314)
FileHold-inspired tri-pane layout that preserves Kodori's metadata-first architecture (D2 — collections-as-views, no physical folder tree). Left pane (lg:col-span-3): collections grouped by kind (drawer / cabinet / matter / project / smart). Center pane (lg:col-span-6): docs in the selected collection with sortable columns. Right pane (lg:col-span-3): metadata for the selected doc. URL state: `?collection=<id>&doc=<id>` so links and back/forward work. Familiarity-with-incumbents win without giving up the architectural difference that makes Kodori better at retention + agent transactions + cross-collection search.
- Shipped
Regression test for the v0.7.98 record-cast SQL bug (D315)
Closed the testing gap that let v0.7.98 ship a query that broke at the postgres-driver layer (parameterized JS arrays sent as record, not text[]). New static-source-grep test asserts: no `${arr}::text[]` / `::uuid[]` / `= ANY(${arr}::type[])` in the billing files; usage.ts uses Drizzle's `inArray`; the inline "cannot cast type record" warning comment is present; every PAID_API_KINDS literal also appears in cost-tracker.ts (write path). Comments stripped before pattern matching so the warning that documents the trap doesn't self-trigger. Sub-second test, zero infra dependency.
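A static-source-grep check of this kind can be sketched as follows. The regex and the helper names are assumptions, and the comment stripping is deliberately naive (it would also strip `//` inside string literals), so treat it as an illustration of the approach rather than the real test.

```typescript
// Remove block and line comments so documentation of the trap does not match.
function stripComments(source: string): string {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, "") // block comments
    .replace(/\/\/[^\n]*/g, ""); // line comments
}

// Matches an interpolated value cast to an array type, e.g. ${arr}::text[].
const ARRAY_CAST = /\$\{[^}]*\}::[a-z]+\[\]/;

function findsArrayCast(source: string): boolean {
  return ARRAY_CAST.test(stripComments(source));
}
```

A test built on this reads each billing source file as text and asserts `findsArrayCast` is false, which needs no database or network.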
v0.7.98 D309 deferred-items shipment — per-tenant cost override + cost-cap warning + owner-dollar dashboard + BYO-key escape + customer-units UI (D310)
- Shipped
Per-tenant cost-budget override (D310.1)
New `cost_budget_microcents_override` integer column on tenants (migration 0101). NULL falls back to plan default; -1 means unlimited; positive integer is the hard ceiling. Encodes Enterprise contractual ceilings + early-access bumps without touching plan tier. Override precedence wired through computeCostQuotaDecision with 6 new test cases pinning the rules (override -1 unlimits even Free; override 0 hard-caps; override-tighter-than-plan wins; override-looser-than-plan wins; null-override matches no-override; override on Enterprise encodes the contractual ceiling).
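The precedence rules above reduce to "a non-null override always wins". A small sketch under that reading; `effectiveCostCap` and `isUnlimited` are hypothetical names, and this is not the body of `computeCostQuotaDecision`.

```typescript
// -1 is the "unlimited" sentinel for both plan caps and overrides.
const UNLIMITED_CAP = -1;

// planCapMicrocents: the plan default; overrideMicrocents: the tenant column (null = not set).
function effectiveCostCap(planCapMicrocents: number, overrideMicrocents: number | null): number {
  // NULL falls back to the plan; any other value (including 0 and -1) replaces it.
  return overrideMicrocents === null ? planCapMicrocents : overrideMicrocents;
}

function isUnlimited(capMicrocents: number): boolean {
  return capMicrocents === UNLIMITED_CAP;
}
```

With this shape an override of 0 is a hard stop, while -1 lifts the cap even on a plan with a small default.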
- Shipped
Cost-cap warning banner (D310.2)
AI budget cap surfaces in the existing dashboard CapWarningBanner — no new component, just a new "cost.budget" value on the QuotaKind enum that the banner picks up automatically. Customer sees label "AI budget" + utilization% only — never the raw dollar number. Honors per-tenant override precedence.
- Shipped
Owner-facing /admin/cost dashboard (D310.3)
Bookmark-only, owner+admin gated, follows the existing /admin/* pattern. Cards: this-month spent, effective monthly cap (honors override), utilization. Plan + override panel. Spend-by-vector table broken down per cost-event kind for the current month. 30-day trailing total. Override is read-only on the page (letting a tenant's own role-holders raise their override would defeat the gate); changes happen via direct SQL until a super-admin role lands.
- Shipped
BYO-key escape hatch — Business+ tenants paste their own Anthropic API key (D310.4)
Foundation: encrypted `byo_anthropic_key_encrypted` text column (migration 0102), AES-256-GCM via byo-key-vault (mirrors token-vault with distinct salt), `createTenantModelProvider(tenantId)` factory with 1-min in-memory cache, owner-only set/clear actions gated to Business+. Settings UI at /settings/byo-key with shape validation (`sk-ant-` prefix). requireCostQuota short-circuits to allowed when tenant has BYO key. Call-site migration: every Anthropic call now honors the tenant's BYO key — extract-document (claude-pdf, raster-convert), auto-classify, six typed extractors, agent loop, 5 server actions + the agent chat route. Embed-document deliberately not threaded (calls OpenAI). Operator pastes key once; every Kodori AI call from their workspace lands on their Anthropic account; cost cap becomes a no-op for those calls.
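Two pieces of this entry are easy to illustrate in isolation: the `sk-ant-` shape check and the 1-minute in-memory cache in front of the provider factory. The sketch below is an assumption-laden stand-in (class and function names are hypothetical, and the clock is injectable only to make the expiry testable); it does not show the encryption path.

```typescript
const BYO_KEY_PREFIX = "sk-ant-";
const PROVIDER_CACHE_TTL_MS = 60_000; // the 1-minute cache window from the release note

// Shape validation only: checks the prefix, not whether the key actually works.
function looksLikeAnthropicKey(candidate: string): boolean {
  const trimmed = candidate.trim();
  return trimmed.startsWith(BYO_KEY_PREFIX) && trimmed.length > BYO_KEY_PREFIX.length;
}

interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

// Tiny per-key TTL cache, e.g. keyed by tenantId.
class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise rebuild and re-cache it.
  getOrCreate(key: string, create: () => T): T {
    const current = this.now();
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > current) return hit.value;
    const value = create();
    this.entries.set(key, { value, expiresAt: current + this.ttlMs });
    return value;
  }
}
```

One consequence of a time-based cache like this is that a newly pasted or cleared key can take up to the TTL to be picked up by a warm process, unless the set/clear actions also evict the entry.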
- Shipped
Customer-facing AI-budget UI on /settings/billing (D310.5)
New AI-budget UsageBar in the existing usage grid. Shows "75% used" / "Unlimited" / "BYO key — your Anthropic account" — never the raw dollar amount. UsageBar gained a `hideCap` prop so the right-side text honors `displayCurrent` verbatim. BYO-key callout under the usage grid (Business+ only) with link to /settings/byo-key. Per Sam's framing: customer sees units, owner sees dollars on /admin/cost.
- Roadmap
Resend email gate still deferred — bounded by upstream gates
Per-send cost ~$0.0004 makes per-call gating low-leverage relative to the 20+ sendXxxEmail refactor cost. Highest-volume email surfaces are gated upstream (alerts via the automation cost gate, invites via the seat cap, onboarding once-per-user). Revisit triggers: Sentry shows email cost trending unexpectedly upward, OR a customer ships an automation that fans out to 10k recipients, OR we add a non-Anthropic cost vector (Twilio, Slack) that justifies the same refactor pattern.
v0.7.97 Structural cost-revenue invariant — every paid-API call site gated on a per-plan cost budget (D309)
- Shipped
14 of 15 audited paid-API call sites now gated
D308 made the existing extraction count-gate actually work. D309 closes the rest of the audit. New `requireCostQuota` cost-microcent budget gate runs alongside the existing count caps; refuses when month-to-date paid-API spend would exceed `plan.monthlyCostBudgetMicrocentsPerSeat × seats`. Wired into auto-classify, embed, six typed extractors (AP invoice / receipt / change-order / inspection / RFI / submittal), template doc generation, automation NL compile, automation runner, privilege log builder (up-front against batch total), redaction suggestions, and canvas auto-plan. Each surface degrades cleanly on refusal — user-initiated actions return an error message; cron-driven actions skip silently; per-doc Inngest pipelines fall through with sensible defaults (auto-classify returns null fields, embed leaves the doc FTS-only, typed extractors persist a manual-fill row).
- Shipped
Plan-tier cost budgets at 50% margin floor
New `monthlyCostBudgetMicrocentsPerSeat` field on PlanLimits. Free $0.50/seat (loss leader, tight), Team $15/seat (50% of $30), Business $40/seat (50% of $80), Enterprise -1 (unlimited; per-tenant contractual override deferred). Numbers are tunables; revisit when actual usage shape lands. Pure decision logic in cost-quota-decision.ts (no server-only boundary) with 12 unit tests covering Enterprise short-circuit, seats clamp, negative-estimate clamp, upgrade-pointer phrasing, linear-scale-with-seats, and the 50%-of-price margin-floor invariant.
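A sketch of what a pure decision function over these budgets could look like. The per-seat dollar figures come from this entry; the unit conversion (1 dollar = 100,000,000 microcents), the floor of one seat, and every name below are assumptions rather than the contents of cost-quota-decision.ts.

```typescript
// Assumption: a microcent is one millionth of a cent, so $1 = 100_000_000 microcents.
const MICROCENTS_PER_DOLLAR = 100_000_000;

type PlanTier = "free" | "team" | "business" | "enterprise";

// Per-seat monthly budgets from the release note; -1 means unlimited.
const BUDGET_PER_SEAT_MICROCENTS: Record<PlanTier, number> = {
  free: 0.5 * MICROCENTS_PER_DOLLAR,
  team: 15 * MICROCENTS_PER_DOLLAR,
  business: 40 * MICROCENTS_PER_DOLLAR,
  enterprise: -1,
};

interface CostQuotaDecision {
  allowed: boolean;
  capMicrocents: number; // -1 when unlimited
}

function decideCostQuota(
  plan: PlanTier,
  seats: number,
  spentMicrocents: number,
  estimateMicrocents: number,
): CostQuotaDecision {
  const perSeat = BUDGET_PER_SEAT_MICROCENTS[plan];
  // Enterprise short-circuit: unlimited plans never refuse.
  if (perSeat === -1) return { allowed: true, capMicrocents: -1 };
  // Seats clamp (assumed floor of 1) and negative-estimate clamp.
  const effectiveSeats = Math.max(1, Math.floor(seats));
  const estimate = Math.max(0, estimateMicrocents);
  const cap = perSeat * effectiveSeats;
  // Refuse only when month-to-date spend plus this call would exceed the cap.
  return { allowed: spentMicrocents + estimate <= cap, capMicrocents: cap };
}
```

Keeping the function free of I/O is what lets the clamps and the linear scale-with-seats rule be pinned by plain unit tests.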
- Shipped
Customer sees units, owner sees dollars
Per Sam: "the cost is more valuable to us the owners than the customer." Existing count caps stay as the visible plan promises ("200 questions/seat/mo") — that's what /pricing markets and what the dashboard surfaces. The new cost cap is the structural floor that catches what counts can't see (Opus vs Haiku price ratio, prompt-cache hit rate, document size variance). Both gates run; either can refuse first; both honor the same plan tier.
- Roadmap
D310+ deferred — UI surface + BYO-key escape
Customer-facing units UI ("287 of 2000 used this month"), owner-facing dollar dashboard, BYO-key escape hatch (Business+ tenants paste their own Anthropic / OpenAI keys at /settings/billing — their costs hit their accounts, Kodori's gate becomes a no-op), per-tenant cost_budget_microcents_override for Enterprise contracts, Resend email gate (deferred from this shipment because per-send cost is ~$0.0004 and upstream entrypoints are already gated).
v0.7.96 Cost-leak fix — `trackExtractionCost` was never plumbed into the main extract path (D308)
- Fixed
Per-tenant extraction quota gate now actually counts
Roy's 13k bulk re-extract on 2026-05-08 burned ~$272 in Anthropic spend on a non-paying tenant, exposing that the `requireQuota("pdf.extract")` gate was running but its inputs were structurally always zero. `extract-document.ts` defined a `checkPdfQuota` dependency but never wired the matching `trackExtractionCost` callback that writes the `cost_events` rows (`kind = "extract.pdf"`) the gate counts. The cost-tracker plumbing existed only on the external-connector extract path; the main upload-driven path silently skipped it. Fix: new optional `trackExtractionCost` dependency, called inside a dedicated `step.run` block after `persist-and-emit` succeeds. apps/web wires it to the existing `trackPdfExtraction` cost-tracker. Tokens pulled from `result.structured.usage` (claude-pdf already exposes them); non-LLM extractors fall through to 0 tokens and skip the row, correctly excluding free extractions from the AI-extraction cap.
- Fixed
Roy's test tenant moved off `enterprise` plan
Tenant `r3mcnett's workspace` was on `plan = "enterprise"` from early seeding. `plans.ts` defines Enterprise as `pdfExtractionsPerSeatMonthly: -1` (unlimited), and the gate's `if (cap === -1) return { allowed: true }` short-circuit auto-allowed every call. Tenant moved to `business` via SQL UPDATE so the 2000/seat/month cap applies (4000 cap with 2 seats; ~$80 max spend at current Claude vision pricing → 50% margin floor on $160/mo plan revenue). KumoKodo internal org stays on Enterprise — we're not gating ourselves.
- Roadmap
D309+ — cost-based budget refactor coming
D308 makes the existing count-based gate actually work; D309+ ships the broader architecture: cost-microcent budgets per plan tier (not extraction counts), BYO-key escape for Business+ tenants, customer-facing units UI ("287 of 2000 doc-actions used"), owner-facing dollar dashboard, expanded gate coverage to the 15 unprotected paid-API call sites identified in the audit (auto-classify, AI summary, DLP scan, AP invoice / receipt / change-order / RFI / submittal / inspection extractors, OpenAI embeddings, automation runner, privilege log, redaction suggestions, canvas auto-plan, Resend email).
v0.7.95 /doc/[id] trust-surface fixes — Access panel no longer Cartesian-inflates + Metadata is editable + image lightbox drag works (D306 + D307)
- Fixed
Access panel was Cartesian-inflated by a `users.id = users.id` leftJoin
Sam screenshotted Roy's /doc/[id] showing 6× read / 5× write / 4× share rows for the same email. Root cause: `listPermissionsTool` had a leftJoin to `users` with the condition `eq(users.id, users.id)` — a never-finished placeholder for the `principal_id::uuid = users.id` cast. `users.id = users.id` is true for every users row, so the join produced a Cartesian product and the panel showed N copies per real grant. Fix: drop the leftJoin entirely; rely on the existing second-pass email lookup (already correct, already in the handler). Aligns with `listCollectionPermissionsTool` which already used the right pattern. The "Revoke all" button kept working through the bug because revoke deletes by `(principal, action, resource)` and would clear all the cartesian-shadow copies in one call — but the trust signal "who actually has access" was wrong. See D306.
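The inflation is easy to reproduce with an in-memory left join. The sketch below is illustrative only (types, data, and the `leftJoin` helper are made up); it contrasts the always-true predicate with the predicate the placeholder stood for. The shipped fix dropped the join rather than correcting it.

```typescript
interface GrantRow { principalId: string; action: string }
interface UserRow { id: string; email: string }

// Minimal nested-loop left join: each left row pairs with every right row the predicate accepts.
function leftJoin<L, R>(
  left: L[],
  right: R[],
  on: (l: L, r: R) => boolean,
): Array<{ left: L; right: R | null }> {
  const out: Array<{ left: L; right: R | null }> = [];
  for (const l of left) {
    const matches = right.filter((r) => on(l, r));
    if (matches.length === 0) out.push({ left: l, right: null });
    for (const r of matches) out.push({ left: l, right: r });
  }
  return out;
}

const grantRows: GrantRow[] = [{ principalId: "u1", action: "read" }];
const userRows: UserRow[] = [
  { id: "u1", email: "a@example.com" },
  { id: "u2", email: "b@example.com" },
  { id: "u3", email: "c@example.com" },
];

// Buggy shape: the condition compares the users row with itself, so it is true for every row.
const inflatedRows = leftJoin(grantRows, userRows, (_g, u) => u.id === u.id);
// Intended shape: match the grant's principal to the user.
const matchedRows = leftJoin(grantRows, userRows, (g, u) => g.principalId === u.id);
```

Here one real grant yields `inflatedRows.length` equal to the number of users in the table, while `matchedRows` holds a single row, which is the "N copies per real grant" effect from the screenshot.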
- Shipped
New unique index on `permissions` — DB-level enforcement of grant idempotency
Migration 0100_permissions_unique_grant.sql ships alongside the display fix. Step 1: dedupe existing rows via `ROW_NUMBER() OVER (PARTITION BY tenant_id, principal_kind, principal_id, action, resource_pattern, effect)`, keeping the oldest and deleting the rest, so any real dupes from a non-idempotent insert path get cleaned up. Step 2: `CREATE UNIQUE INDEX IF NOT EXISTS permissions_unique_grant_idx`, so the idempotency-by-convention of the four insertion paths (`createDocument`, `grantCollectionPermission`, `grantPermission`, `anomaly-sweep`) is enforced by Postgres, not just by application code. A future fifth path that forgets the existence check surfaces as a 23505 unique_violation labelled with the tenant via D304's Sentry tags rather than silently inflating the Access panel. Raw-SQL migration: needs `db:apply-pending` per the CLAUDE.md drizzle-kit gotcha. See D306.
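For readers who think in application code rather than window functions, the dedupe step is equivalent to the in-memory sketch below. It is an illustration, not the migration: the row shape is simplified, and using a `createdAt` value to decide which row is "oldest" is an assumption.

```typescript
interface PermissionRow {
  id: number;
  tenantId: string;
  principalKind: string;
  principalId: string;
  action: string;
  resourcePattern: string;
  effect: string;
  createdAt: number; // assumed ordering column for "oldest"
}

// Same six columns as the PARTITION BY clause, serialized into one grouping key.
function grantKey(r: PermissionRow): string {
  return JSON.stringify([r.tenantId, r.principalKind, r.principalId, r.action, r.resourcePattern, r.effect]);
}

// Returns the ids to delete: every row except the oldest in each partition.
function duplicateIdsToDelete(rows: PermissionRow[]): number[] {
  const oldest = new Map<string, PermissionRow>();
  for (const row of rows) {
    const key = grantKey(row);
    const current = oldest.get(key);
    if (!current || row.createdAt < current.createdAt) oldest.set(key, row);
  }
  return rows.filter((r) => oldest.get(grantKey(r)) !== r).map((r) => r.id);
}
```

Once the survivors are unique on that key, creating the unique index over the same columns cannot fail on existing data.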
- Shipped
Metadata panel — view + add + update + delete the `metadata` jsonb directly from /doc/[id]
Sam noted: "I see no metadata for the file when looking at it. No way to manually add metadata either." `setDocumentMetadataTool` had shipped since Phase 0 (wired into agent + REST + bulk + migration + revert) but the page surfaced ONLY the typed first-class fields (sensitivity, retention class, dates, hash, mime). New `<MetadataPanel>` renders every top-level key in a sortable key/value table with type-aware value rendering — arrays of scalars as chips, objects in collapsible `<details>`, scalars plain. New `setDocumentMetadataKeyAction` server action wraps the MCP tool with per-key form posts (matches the per-key audit shape — one `document.metadata-set` event per changed key). Value input parses as JSON first, falls back to literal string — `["Smith","Jones LLC"]` lands as a real array but typing `Smith Holdings LLC` without quotes still works. Empty value deletes the key. Permission gate stays in the tool (creator OR tenant admin/owner) — one source of truth. Banner pattern mirrors the existing accessGranted / accessRevoked / accessError searchParams flow (D299). See D307.
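The "JSON first, literal string second, empty deletes" rule for the value input can be sketched as a small pure function. The names and the whitespace trimming are assumptions; the three outcomes follow the behavior described above.

```typescript
type MetadataEdit =
  | { kind: "delete" }
  | { kind: "set"; value: unknown };

function parseMetadataInput(raw: string): MetadataEdit {
  const trimmed = raw.trim(); // trimming is an assumption of this sketch
  // Empty value deletes the key.
  if (trimmed === "") return { kind: "delete" };
  try {
    // Valid JSON lands as a real array / object / number / boolean.
    return { kind: "set", value: JSON.parse(trimmed) };
  } catch {
    // Anything else is stored as the literal string the operator typed.
    return { kind: "set", value: trimmed };
  }
}
```

One side effect of parsing JSON first is that inputs such as `42` or `true` become a number and a boolean; an operator who wants the string form would need to type it with quotes.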
- Fixed
Image lightbox drag finally works — synthesized click no longer closes the modal
The click-to-zoom lightbox from v0.7.94 had `onClick={onBackdropClick}` AND `onMouseDown / onMouseMove / onMouseUp` on the same div — after any mousedown→mouseup, a synthesized `click` fired on the same div and (because the image had `pointer-events-none`) `e.target === e.currentTarget` was true, so the modal closed on every drag-release. Sam's symptom: "you zoom in the middle and now you can't see the top or bottom" — drag-to-pan appeared not to work. Fix: track `wasDragRef` across the down→move→up window (movement past 4px = drag); suppress the synthesized click when `wasDragRef.current` is true. Also dropped the `if (zoom <= 1) return` gate so pan works at 100% zoom — a 2,000px-wide TIFF overflows the viewport even at 1×.
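The drag-versus-click decision can be isolated from React as a small state machine. This is a sketch of the idea, not the component code: the class name is hypothetical, and measuring the 4px threshold as straight-line distance is an assumption.

```typescript
const DRAG_THRESHOLD_PX = 4;

// Tracks one mousedown → mousemove → mouseup window and decides whether the
// click that the browser synthesizes afterwards should close the modal.
class DragClickGuard {
  private startX = 0;
  private startY = 0;
  private pressed = false;
  private wasDrag = false;

  onMouseDown(x: number, y: number): void {
    this.pressed = true;
    this.wasDrag = false;
    this.startX = x;
    this.startY = y;
  }

  onMouseMove(x: number, y: number): void {
    if (!this.pressed) return;
    // Movement past the threshold marks the whole gesture as a drag.
    if (Math.hypot(x - this.startX, y - this.startY) > DRAG_THRESHOLD_PX) {
      this.wasDrag = true;
    }
  }

  onMouseUp(): void {
    this.pressed = false;
  }

  // Call from the click handler: false means "this click ended a drag, ignore it".
  shouldHandleClick(): boolean {
    const handle = !this.wasDrag;
    this.wasDrag = false;
    return handle;
  }
}
```

In a component the same flag would typically live in a ref so that updating it does not trigger a re-render.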
v0.7.94 Bulk re-extract handed off to a durable Inngest workflow — survives any tenant size (D305)
- Fixed
Dashboard "Re-run for all" no longer silently no-ops on large tenants
Roy was added as an admin to a sister tenant and clicked the dashboard's "Re-run for all" button — spinner stopped after a while, no toast, stuck-count unchanged. Two-layer root cause. (1) `enqueueExtractAllPendingAction` did N sequential per-row Drizzle upserts before the single `inngest.send([])` call — at ~50ms per round trip on Roy's ~14k candidates the loop alone exceeded Vercel's 300s function timeout, so Inngest never received the events. (2) The Server Action threw on timeout instead of returning `{ ok: false }`; the BulkExtractButton had no try/catch around its `await`, so React's transition silently stopped the spinner without ever calling `setError`. The button stayed at "Re-run for all (N)" and the user saw "looks like nothing happened."
- Shipped
New Inngest function `extract-all-pending` does the bulk durably
New `document/extract.all-pending.requested` event + `buildExtractAllPendingFunction` (`packages/workflow/src/functions/extract-all-pending.ts`). The Server Action now does only auth + role-check + `count(*)` + single-event send + return — sub-second on any tenant size. The Inngest function re-queries the candidate set, bulk-upserts `document_content` to `pending` in 1,000-row chunks (Postgres caps statements at 65,535 bind params; 5 cols × 1,000 rows = 5,000 with safe margin), then fans out per-doc events in 1,000-event chunks (Inngest's 4 MB per-send cap; ~250 bytes per event = ~250 KB per chunk). Each chunk is its own `step.run` so Inngest's retry layer can replay an individual chunk without re-emitting earlier work.
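The chunk sizes above follow from two limits, and the arithmetic is small enough to write down. In this sketch the constant names mirror the ones used in the chunking test, the numeric values come from this entry, and the body of `chunkArray` is an assumed implementation.

```typescript
const PG_MAX_BIND_PARAMS = 65_535;
const UPSERT_COLUMNS = 5;
const EXTRACT_ALL_PENDING_UPSERT_CHUNK = 1_000;
const EXTRACT_ALL_PENDING_FANOUT_CHUNK = 1_000;
const APPROX_EVENT_BYTES = 250; // rough per-event size quoted in the release note
const INNGEST_MAX_SEND_BYTES = 4 * 1024 * 1024;

// Split a list into consecutive slices of at most `size` items.
function chunkArray<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Bind params used by one upsert statement and bytes sent by one fan-out call.
const paramsPerUpsert = EXTRACT_ALL_PENDING_UPSERT_CHUNK * UPSERT_COLUMNS; // 5,000
const bytesPerFanout = EXTRACT_ALL_PENDING_FANOUT_CHUNK * APPROX_EVENT_BYTES; // 250,000
```

A 14,000-document tenant therefore becomes 14 upsert statements and 14 event sends, each well inside its limit and each small enough to retry on its own.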
- Shipped
Per-tenant concurrency-key 1 prevents double-fanout
`concurrency: { limit: 1, key: "event.data.tenantId" }` follows the §15.2 100M-doc enterprise-volume concurrency-key pattern (D279) — one tenant's bulk re-extract doesn't block another's, and rapid double-clicks from the same admin can't double-fan. Per-doc duplicate prevention is already handled inside `extract-document` itself, so the limit-1 here is about avoiding wasted work, not correctness.
- Improved
BulkExtractButton hardened against thrown server actions
`await` now wrapped in try/catch so a thrown action surfaces "Couldn't schedule the bulk extraction" instead of silently spinning then idling. Success copy updated to "Queued ~N — running in background" so users immediately see the work is in flight; the dashboard's 15-second auto-refresh ticks the count down as workers finish.
- Shipped
Eval count: 179 → 187
New `extract-all-pending-chunking.test.ts` pins the chunking math — `chunkArray` correctness on Roy's 14k-doc shape, plus assertions that `EXTRACT_ALL_PENDING_UPSERT_CHUNK × 5 < 65,535` and `EXTRACT_ALL_PENDING_FANOUT_CHUNK × 300 bytes < 4 MB / 4`. A future schema addition that bumps column count or a future Inngest cap change surfaces as a failing test instead of a silent prod regression.
- Improved
Image previews — click-to-zoom lightbox + larger inline view
The /doc/[id] inline image preview was capped at 640px tall with no way to enlarge — court-filing TIFFs and 200-300dpi scans were unreadable without right-click → "Open in new tab." Inline cap raised to 80vh, and a click on the preview opens a full-window lightbox with mouse-wheel zoom (toward cursor), click-drag pan, +/− buttons, "Fit" reset, and Esc / click-backdrop to close. Keyboard: `+` / `−` zoom, `0` resets. Applies to every image MIME including TIFF / HEIC / BMP that route through the server-side raster-convert pipeline (D297) — the lightbox sees the converted PNG regardless of the source format.
v0.7.93 Sentry tenant + user tag enrichment — captured events ship with tenant.id + user.id + user.role tags (D304)
- Shipped
Sentry events now ship with tenant.id + user.id + user.role tags
Closes the deferred-trigger from D301. New apps/web/lib/sentry-server.ts exposes enrichSentryScope() — sets the current Sentry async-local-storage scope's user (id only — NO email or PII) and tenant.id / user.role tags. Wired into the (app) layout (covers every authenticated page render via Sentry's scope-propagation through async-local-storage), the permissions Server Action toolCtx (the surface that caught Roy's Task #159), and the connector-migration requireAdmin helper (the surface where third-party REST API errors at scale need tenant context to triage).
- Shipped
PII-conscious posture by default
setUser ships id only — NOT email or display name. Database UUID is enough for cohort analysis without ever leaving Kodori's identity system; ops can look up email from id when triaging. Skips the need for a per-customer DPA conversation about Sentry's sendDefaultPii setting.
- Shipped
No-op when DSN unset
Same conditional-init posture as D301. The dynamic import("@sentry/nextjs") inside enrichSentryScope short-circuits before loading the SDK when NEXT_PUBLIC_SENTRY_DSN is unset, so deploys without observability wired up still work end-to-end with zero runtime cost.
- Roadmap
Future Sentry tag layers
feature.flag and plan.tier tags deferred until those primitives stabilize. Client-side scope enrichment (browser hydration errors, unhandled promise rejections) deferred — most triagable errors fire server-side and ARE tagged correctly. Other Server Action surfaces opt into enrichment as they ship throw-prone code paths.
v0.7.92 NetDocuments migration importer flipped to beta — closes the "big three legal DMS" trio (D303)
- Shipped
NetDocuments connector goes from skeleton to beta
packages/migration/src/connectors/netdocuments.ts flipped from status: "planned" (throwing not-implemented) to status: "beta" with a real REST implementation against /v1. Probe POSTs /v1/OAuth (form-encoded grant_type=refresh_token — different from iManage's client-credentials flow) to validate creds, GETs /v1/User/Repositories to count visible repos, warns when an operator-supplied repositoryId doesn't match. Discover POSTs /v1/Search with cursor-based pagination, scopes to a specific repository when configured. Download streams /v1/Document/{id}/Content.
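The cursor-following part of discover can be sketched independently of the NetDocuments API. Everything here is a generic illustration: the page shape, the `nextCursor` field name, and the `fetchPage` callback are assumptions, and no real endpoint is called.

```typescript
interface SearchPage<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

// The caller supplies how a page is fetched (e.g. a POST to the search endpoint).
type FetchPage<T> = (cursor: string | null) => Promise<SearchPage<T>>;

// Walk pages until the cursor runs out or the cap is reached.
async function discoverAll<T>(
  fetchPage: FetchPage<T>,
  maxDocuments: number = Infinity,
): Promise<T[]> {
  const found: T[] = [];
  if (maxDocuments <= 0) return found;
  let cursor: string | null = null;
  do {
    const page: SearchPage<T> = await fetchPage(cursor);
    for (const item of page.items) {
      if (found.length >= maxDocuments) return found;
      found.push(item);
    }
    cursor = page.nextCursor;
  } while (cursor !== null && found.length < maxDocuments);
  return found;
}
```

Injecting `fetchPage` is also what makes a two-page mock practical: the test double returns a cursor on the first call and `null` on the second.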
- Shipped
Big-three legal DMS coverage achieved
Combined with iManage (beta — D258) and FileHold (beta — D302), every major legal-DMS displacement target now has a first-class importer. The /migrate/connectors surface lists three beta REST integrations + the production-ready S3-bucket connector — every common shopping path into Kodori has a clear answer.
- Shipped
Regional pod routing built into the credentials shape
NetDocs runs on regional pods (us / eu / au / ca) and the wrong region returns 401 even with valid creds. The region field is enum-typed, with optional apiHost override for private-cloud / reseller deployments. Operators pick correctly upfront rather than debugging "valid creds but rejected" later.
- Shipped
NetDocs profile fields round-trip into Kodori metadata
mapDocument() preserves NetDocs-native fields as namespaced metadata keys (netdocsId / netdocsCabinetName / netdocsDocType / netdocsRegion) AND spreads the per-customer profile dictionary so schema-defined fields (MatterNumber, ClientName, ResponsibleAttorney, DealCode, etc.) land as queryable Kodori metadata without operator pre-mapping.
- Shipped
11 mock-API end-to-end tests with cursor-pagination coverage
New describe block "NetDocuments connector — mock-API end-to-end (D303)" covers status, probe success + warning, discover-full (cursor-paginated walk across 2 pages), discover-scoped, discover-capped, discover-metadata-preservation, download-success, download-not-found, download-rejects-malformed-ref, AND auth-failure-clean (mock returns 401 on /OAuth, probe surfaces error cleanly). buildNetDocumentsMockFetch() simulates real two-page cursor behavior so the connector's next-following logic gets real exercise. Eval count: 170 → 179.
- Roadmap
Real-tenant pilot gates the flip from beta → ready
Same posture as iManage and FileHold — beta status badges the source as "tested on mock fixtures, awaiting real-tenant validation." The migration importer is feature-complete for the big three legal DMS displacement targets; the only remaining incumbent surface without a first-class importer is SharePoint, which is partially covered by the S3-bucket path via Microsoft's export tooling.
v0.7.91 FileHold migration importer flipped to beta — REST connector with full probe / discover / download (D302)
- Shipped
FileHold connector goes from skeleton to beta
packages/migration/src/connectors/filehold.ts flipped from status: "planned" (throwing not-implemented) to status: "beta" with a real REST implementation against /FH/api/v1. Probe POSTs /Token to validate creds, GETs /Cabinets to count the discoverable surface, warns when an operator-supplied cabinetId doesn't match. Discover() POSTs /Documents/Search paginated at PageSize: 100, scopes to a specific cabinet when configured, honors maxDocuments. Download() GETs /Documents/{id}/Content with Bearer auth.
- Shipped
FileHold hierarchy preserved end-to-end
Library → Cabinet → Drawer → Folder → File maps to Kodori's collections-as-views via path: [CabinetName, DrawerName, ...FolderPath.split(/[\\/]/)] — so deeply nested folders ("Acme Corp/Contracts/2024 Renewals") become individual collections in the imported tenant rather than a flattened single-segment name.
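The path expression above is short enough to show as a runnable sketch. The field names match the ones in this entry; the function name and the decision to drop empty segments (for example from a trailing slash) are assumptions.

```typescript
interface FileHoldLocation {
  CabinetName: string;
  DrawerName: string;
  FolderPath: string; // may use either "/" or "\" as the separator
}

// Cabinet and drawer come first, then one segment per nested folder.
function toCollectionPath(loc: FileHoldLocation): string[] {
  return [loc.CabinetName, loc.DrawerName, ...loc.FolderPath.split(/[\\/]/)]
    .filter((segment) => segment.length > 0); // assumption: ignore empty segments
}
```

For a file in cabinet "Legal", drawer "Clients", folder "Acme Corp/Contracts/2024 Renewals", this yields five segments, so each nesting level can become its own collection.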
- Shipped
FileHold-native metadata round-trips into Kodori
mapDocument() preserves fileholdId / fileholdCabinetId / fileholdSchemaName as namespaced metadata keys AND spreads the document's MetaData dictionary so schema-defined fields (MatterNumber, ClientName, DocumentDate, etc.) land as queryable Kodori metadata without operator pre-mapping. Future "metadata mapping" UI can translate FileHold field names to Kodori field names with the source-of-truth still on the document.
- Shipped
Cloud + on-prem URL variants supported via override
apiBasePath credentials field defaults to /FH/api/v1 (cloud + on-prem 17+) but stays overridable for legacy /FileHold/api/v1 installs. Single connector handles every FileHold deployment shape we've seen so far without per-tenant code branches.
- Shipped
10 mock-API end-to-end tests under packages/evals
New describe block "FileHold connector — mock-API end-to-end (D302)" covers status, probe success, probe-warning-on-unknown-cabinet, discover-full, discover-scoped, discover-capped, discover-metadata-preservation, download-success, download-not-found, download-rejects-malformed-ref. buildFileHoldMockFetch() overrides globalThis.fetch with regex-keyed handlers (3 cabinets, 4 documents) so the suite runs offline. Eval count: 160 → 170.
- Roadmap
Real-tenant pilot gates the flip from beta → ready
beta status badges the source as "tested on mock fixtures, awaiting real-tenant validation." Operators kicking off a beta import opt into being the first real run. Flip to ready requires Kevin's outreach to land a pilot — outreach is in flight. Strategic context: KeyMark's rebuild of FileHold (likely on Hyland Nuxeo) is forcing FileHold customers into a re-platform window — Kodori's hash-chained audit + reversible agent transactions + AI-native UX is a stronger landing zone than another vendor port.
v0.7.90 Sentry observability wired to D300 error boundaries — graceful no-op when DSN unset (D301)
- Shipped
@sentry/nextjs 9.x with conditional init across server / edge / client
New apps/web/instrumentation.ts initializes Sentry on Node.js + Edge runtimes; instrumentation-client.ts handles the browser. Both gate every init call on NEXT_PUBLIC_SENTRY_DSN — when unset, the SDK becomes a complete no-op (zero runtime calls, zero bundle bloat, zero build errors). Lets us ship the integration before the Sentry account is wired AND lets self-host customers opt out cleanly.
- Shipped
D300 error.tsx boundaries now call Sentry.captureException with tag breadcrumbs
All three boundaries (app / marketing / global) tag captured events with boundary: 'app' | 'marketing' | 'global' so ops alerts can fire at different severities (global = fatal). Digest tag correlates Sentry events to Vercel runtime log lines. The console.error fallback is preserved for grep-based triage when the Sentry dashboard isn't open.
- Shipped
next.config.ts wrapped with withSentryConfig for source-map upload
Source maps upload to Sentry on each build IF SENTRY_AUTH_TOKEN is set; without it, builds still succeed and ship minified (Sentry events still capture, just with less-readable stack traces). tunnelRoute: '/monitoring/sentry' routes client-side Sentry traffic through our own domain so ad-blockers don't drop captures — uBlock / Brave shields / corporate firewalls routinely block sentry.io directly.
- Shipped
Sample rates: 10% performance traces, 100% error capture, replay disabled
Free Developer tier (5K errors/month + 1 user) covers projected error volume comfortably. Replay capture (Session Replay) explicitly OFF — privacy + cost-control posture for legal-privileged content. Revisit when a customer requests it AND a privacy review clears the DOM-recording approach.
- Shipped
4 new env vars in apps/web/.env.example
NEXT_PUBLIC_SENTRY_DSN (load-bearing: required for any capture), SENTRY_AUTH_TOKEN (build-time, source-map upload), SENTRY_ORG (build-time), SENTRY_PROJECT (defaults to kodori-web). Set NEXT_PUBLIC_SENTRY_DSN in Vercel Project Settings → Environment Variables once the Sentry account is created; observability starts working on the next deploy.
- Roadmap
D299 permission-action catch blocks NOT instrumented — by design
Caught errors there (wrong email, pending invite, hold-deny-wins refusal) are user-recoverable, not bugs. Capturing them in Sentry would generate noise without operational value. If we ever need validation-error rate tracking (e.g. detecting attacker probing) PostHog events or a custom log channel is the right primitive — not Sentry.
v0.7.89 Three-layer error-boundary safety net — no more generic Server Components render 500s (D300)
- Shipped
apps/web/app/(app)/error.tsx — friendly fallback for the authenticated route group
Triggers on any uncaught throw inside a Server Component or Server Action under (app)/** (dashboard, search, agent, /doc, /collections, /audit, /retention, etc.). Replaces Next.js's generic "An error occurred in the Server Components render" 500 with a friendly page that keeps the sidebar / nav visible and offers Try Again (calls reset() to re-render the failed segment in place), Back to Dashboard, and Help Center CTAs. Error message + digest visible behind a <details> disclosure for support correlation — operators copy the digest into a support ticket and we grep Vercel runtime logs.
- Shipped
apps/web/app/(marketing)/error.tsx — different visual treatment for anonymous prospects
Different audience needs a different error UX. Marketing visitors hitting an uncaught throw on /, /features, /pricing, /help/*, /compare/*, etc. get a confidence-preserving "we recorded this and you can keep browsing" message with Homepage + Help + mailto:hello@kodori.ai recovery paths. Digest visible inline at the bottom for support correlation.
- Shipped
apps/web/app/global-error.tsx — top-level fallback for crashes in the root layout itself
Triggers ONLY when the root app/layout.tsx itself throws (rare but real — font loading, providers, theme detection). REPLACES the root layout when it fires, so it ships its own <html> + <body> with inline styles to avoid throwing on the same broken path that caused the original failure. Uses <Link> instead of <a> to satisfy @next/next/no-html-link-for-pages lint.
- Improved
Console-side breadcrumbs with { message, digest, stack } shape
All three boundaries log to console.error with a structured shape Vercel runtime logs capture. Easy grep target for ops triage. Sentry SDK hookup deferred — current shape will swap cleanly to Sentry.captureException(error, { tags: { digest, route } }) when observability tier ships, no boundary-side code changes needed.
- Fixed
Closes the class of bug Sam reported on /dashboard
Sam noted seeing the same opaque "Server Components render error" on the dashboard. Vercel runtime logs over 3 days showed only Roy's /collections 500 (D299) — the dashboard event may have been a client-side render error that didn't surface as a server 500. Either way, the new error boundaries catch BOTH server and client render errors throughout the (app) tree, so even latent throws we haven't found yet now produce a friendly fallback instead of the framework default.
v0.7.88 Friendly error banners on permission-grant flows — closes Task #159 (D299)
- Fixed
Roy hit a 500 on /collections trying to grant Kevin access — closes the bug
Two-layer root cause. (1) grantCollectionPermissionTool correctly throws when the principal email doesn't resolve to an existing tenant user, but it conflated "never invited" with "invited but pending acceptance" — Roy had invited Kevin 21 hours earlier so the "Invite them first" message was misleading. (2) The Server Action wrapper didn't catch the throw, so it bubbled back to React as a "Server Components render error" 500 that crashed the page tree entirely. Fix: shared _principal-lookup helper distinguishes the two cases ("pending invite as admin, wait for acceptance" vs "no invite, invite from /members first"), and the four permission actions (grant + revoke × document + collection) now wrap the tool call in try/catch + redirect with ?accessError=... on failure or ?accessGranted=... on success.
- Shipped
Inline emerald (success) / red (error) banners on /collections/[id] and /doc/[id]
Pages read the new ?accessError / ?accessGranted / ?accessRevoked search params and render contextual banners above the access form instead of the page crashing. Operator gets an immediate, actionable signal — "kevin@... has a pending invite as admin but hasn't accepted yet. Once they sign in and accept, you can grant access" — with a /members link to resend the invite if needed.
- Improved
New shared _principal-lookup helper used by single-doc + collection grant paths
packages/mcp/src/_principal-lookup.ts queries users first, then on miss queries invites for a non-accepted non-revoked row, and throws the appropriate friendly error. Used by both grantPermission (single-doc) and grantCollectionPermission so a teammate who would have hit the bug on either form gets the same actionable error. Future permission paths (bulk grants, share-link permissions, dashboard-level access) reuse the helper.
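The lookup order (users first, then open invites) can be sketched against in-memory arrays. This is not the helper itself: it returns a tagged result instead of throwing, and the row shapes, the case-insensitive email match, and all names are assumptions.

```typescript
interface TenantUser { id: string; email: string }
interface InviteRow {
  email: string;
  role: string;
  acceptedAt: Date | null;
  revokedAt: Date | null;
}

type PrincipalLookup =
  | { kind: "user"; userId: string }
  | { kind: "pending-invite"; role: string }
  | { kind: "no-invite" };

function lookupPrincipal(email: string, users: TenantUser[], invites: InviteRow[]): PrincipalLookup {
  const wanted = email.trim().toLowerCase();
  // 1. An existing tenant user wins outright.
  const user = users.find((u) => u.email.toLowerCase() === wanted);
  if (user) return { kind: "user", userId: user.id };
  // 2. On a miss, look for an invite that is neither accepted nor revoked.
  const pending = invites.find(
    (i) => i.email.toLowerCase() === wanted && i.acceptedAt === null && i.revokedAt === null,
  );
  // 3. Distinguish "invited but pending" from "never invited".
  return pending ? { kind: "pending-invite", role: pending.role } : { kind: "no-invite" };
}
```

The two non-user outcomes map onto the two messages from D299: wait for acceptance versus invite from /members first.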
- Improved
Symmetric coverage across all four permission actions
Same try/catch + redirect-with-search-params shape applied to grantDocumentReadAction, revokeDocumentPermissionsAction, grantCollectionReadAction, revokeCollectionPermissionsAction. One PR closes the same throw-and-crash class across the four paths instead of leaving three latent crash modes for the next user to hit.
v0.7.87 claude-pdf extractor image content-block routing fix — every JPEG / PNG / GIF / WebP upload now actually extracts (D298)
- Fixed
Silent production bug: every JPEG / PNG / GIF / WebP upload was failing extraction with "Non-PDF files in user messages" error from Anthropic
The claude-pdf extractor was sending all four image MIMEs AND PDFs through the AI SDK's type: 'file' content block. That block maps to Anthropic's document block which only accepts application/pdf. The AI SDK's image path uses type: 'image' (with field name `image`, not `data`) — Anthropic's separate `image` block. Bug had been live since whenever the AI SDK changed its content-block routing semantics; surfaced when Roy's tenant query showed 3,932 JPEGs all failing with the same error. Fix: branch on MIME — images use { type: 'image', image: bytes, mimeType }, PDFs keep { type: 'file', data: bytes, mimeType }.
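The MIME branch described in the fix can be sketched as a pure function that builds the content part. The two object shapes are the ones quoted above; the function name, the local type, and the error thrown for other MIMEs are assumptions, and nothing here imports the AI SDK.

```typescript
// The four image types the entry lists as natively supported.
const NATIVE_IMAGE_MIMES = new Set(["image/jpeg", "image/png", "image/gif", "image/webp"]);

type ContentPart =
  | { type: "image"; image: Uint8Array; mimeType: string }
  | { type: "file"; data: Uint8Array; mimeType: string };

function buildContentPart(bytes: Uint8Array, mimeType: string): ContentPart {
  // Images use the image part, whose payload field is `image`, not `data`.
  if (NATIVE_IMAGE_MIMES.has(mimeType)) {
    return { type: "image", image: bytes, mimeType };
  }
  // PDFs keep the file part, which maps to the PDF-only document block.
  if (mimeType === "application/pdf") {
    return { type: "file", data: bytes, mimeType };
  }
  // Assumption: anything else is rejected here and handled by another extractor.
  throw new Error(`Unsupported MIME for this extractor: ${mimeType}`);
}
```

A regression test in this style only needs to assert the returned shape per MIME, which is why mocking `generateText` is enough to pin the behavior.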
- Shipped
New regression test packages/workflow/test/claude-pdf.test.ts
Mocks generateText to assert the exact content-block shape per MIME — pre-D298 the JPEG case would have failed the test for "field name is image, not data." 9 new tests covering supports() routing, PDF vs image content-block construction, oversize rejection, and upstream usage payload preservation. Workflow test count: 18 → 27. Future regressions (if the AI SDK schema changes again, or if someone "simplifies" the branch back to a uniform shape) get caught at test time.
- Improved
Combined with D296 (raster-convert) every image MIME now routes correctly
Native-supported images (PNG / JPEG / GIF / WebP) hit claude-pdf's now-fixed image branch. TIFF / BMP / HEIC / HEIF go through raster-convert which decodes via sharp and delegates to claude-pdf with a known-good MIME (image/png for single-page, application/pdf for multi-page). Both paths exercise content-block construction correctly post-D298.
- Roadmap
Per-extractor failure-rate alerting deferred
We caught this bug because a customer noticed thousands of failures, not because of an ops alert. A periodic Sentry / cost-tracker alert on document_content.status='failed' rate exceeding a tenant baseline would have surfaced this within hours of upload instead of days. Adding when extraction-failure observability becomes a real customer-driven need.
v0.7.86TIFF / HEIC inline preview rendering — server-side conversion at /api/doc/[id]/preview (D297, closes Task #158)- Fixed
Roy uploaded a TIFF, saw a broken preview on /doc/[id] — closes the bug
D296 closed the EXTRACTION gap (search + agent see TIFF text via the raster-convert extractor). It did NOT close the PREVIEW gap. Chrome / Firefox / Edge all refuse to render TIFF / BMP / HEIC / HEIF inline in <img> tags (Safari is the only mainstream exception); the preview endpoint was serving raw TIFF bytes with Content-Type: image/tiff and the browser silently failed to decode them. Roy filed Task #158 with a Network error on /api/doc/.../preview as the giveaway.
- Shipped
Server-side conversion in apps/web/app/api/doc/[id]/preview/route.ts
Detects TIFF / BMP / HEIC / HEIF source MIME, fetches bytes from R2 in-process, decodes via sharp (libvips + libheif — already a dep for OG-image generation), bounded to 2048px on the long edge, returns inline as image/png with Cache-Control: private, max-age=300. The original TIFF stays untouched in storage; /api/doc/[id]/download continues to serve the source bytes for users who need the authoritative format.
- Shipped
Multi-page TIFFs preview the first page only
Legal fax-scan TIFFs are typically 5-50 pages. <img> tags display exactly one image, so v1 surfaces page 1 and lets operators see something. The agent can still see every page (D296 wraps multi-page TIFFs into a PDF for Claude vision); operators who need to see all pages can Download. Future iteration can add a ?page=N query param + a page-picker UI.
- Shipped
X-Kodori-Preview-Converted-From debug header announces conversion source
Every conversion-path response carries the original MIME in the response header so operators investigating "why is this a PNG when I uploaded a TIFF?" find the answer in Chrome devtools without code spelunking. 50 MB cap on source bytes returns 413 with a download-link fallback message; conversion failures fall through to the legacy presigned-URL redirect path (browser still won't render but network resolves cleanly). Explicit runtime = 'nodejs' on the route so the native sharp binary loads.
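The decision logic of the preview route (which MIMEs convert, the 50 MB cap, the 2048px long-edge bound, the debug header) can be sketched as a pure function. `planPreview` and `PreviewPlan` are hypothetical, reading "50 MB" as 50 MiB is an assumption, and the real route decodes with sharp rather than taking dimensions as arguments.

```typescript
const CONVERT_MIMES = new Set(["image/tiff", "image/bmp", "image/heic", "image/heif"]);
const MAX_SOURCE_BYTES = 50 * 1024 * 1024; // assumption: "50 MB" read as 50 MiB
const MAX_LONG_EDGE = 2048;

type PreviewPlan =
  | { kind: "redirect" } // legacy presigned-URL path for browser-renderable formats
  | { kind: "too-large"; status: 413 }
  | { kind: "convert"; width: number; height: number; headers: Record<string, string> };

function planPreview(mime: string, sizeBytes: number, width: number, height: number): PreviewPlan {
  if (!CONVERT_MIMES.has(mime)) return { kind: "redirect" };
  if (sizeBytes > MAX_SOURCE_BYTES) return { kind: "too-large", status: 413 };

  // Scale down so the long edge is at most 2048px; never upscale.
  const scale = Math.min(1, MAX_LONG_EDGE / Math.max(width, height));
  return {
    kind: "convert",
    width: Math.round(width * scale),
    height: Math.round(height * scale),
    headers: {
      "Content-Type": "image/png",
      "Cache-Control": "private, max-age=300",
      "X-Kodori-Preview-Converted-From": mime,
    },
  };
}
```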
- Roadmap
R2 derivative cache deferred — trigger is sustained preview-conversion volume
Caching the converted PNG in R2 keyed on <hash>.preview.png would shave per-request CPU but adds cache-invalidation complexity (versions, sharp upgrades, preview-cap changes, R2 lifecycle rules). The 5-minute browser cache header (max-age=300) already absorbs repeat views of the same doc at zero infra cost. Add the derivative cache when conversion volume drives a meaningful line item on the Vercel function-execution bill.
v0.7.85TIFF / BMP / HEIC / HEIF support — raster-convert extractor closes the legal-scan + iPhone-camera gap (D296)- Shipped
New raster-convert extractor in the extraction cascade
Claude vision rejects TIFF / BMP / HEIC / HEIF bytes directly. Without cloud OCR (Azure / Google DocAI) configured, every TIFF and every HEIC fell past every extractor and landed at status='unsupported'. TIFF is the dominant scanned-evidence format in U.S. court filings: CM/ECF still emits TIFFs in older districts, and legal fax-scan deliveries are usually multi-page TIFF. HEIC has been the iOS camera default since iOS 11, so it covers every /capture upload from an iPhone unless the user switched to "Most Compatible". The new raster-convert extractor decodes via sharp (libvips + libheif) and routes the converted bytes through claude-pdf. Single-page sources convert to PNG; multi-page sources (legal fax-scan TIFFs are typically 5-50 pages) convert page-by-page and are wrapped via pdf-lib into a single PDF that Claude reads in one vision call. The 20 MB input cap and 50-page cap match Claude's per-request limits.
- Improved
Cloud OCR still wins when configured — raster-convert is the no-config fallback
Registry slot is between Google DocAI and Claude. When Azure or Google DocAI is configured (env vars set), those still handle TIFF / BMP natively at lower per-page cost + higher accuracy on dense scans. The new extractor only fires when no cloud OCR is configured for the tenant. That closes the silent dead-letter path for fresh tenants that have not provisioned cloud OCR, while preserving the cheaper / better path for high-volume scanned-record workloads.
- Improved
Cost-bearing — wired into the existing PDF-quota gate
Added to the checkPdfQuota allowlist in extract-document.ts alongside claude-pdf, illustrator-ai, whisper-transcribe. A Free-tier tenant uploading 100 TIFFs hits the monthly extraction cap before R2 egress on the quota-exceeded doc, same as direct claude-pdf — no bypass path through the conversion route.
- Shipped
Audit-event lineage records the conversion path
documentContent.extractor records the source format, the wrap shape, and the downstream extractor: e.g. "raster-convert:image/tiff->pdf->claude" or "raster-convert:image/heic->png->claude". Structured payload (structured.rasterConvert.{sourceMime, sourcePages, convertedPages, wrappedAs, truncated}) preserves the same data in machine-parseable form so operators investigating extraction cost or behavior have the full lineage on /doc/[id] and /audit.
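A small sketch of how the lineage string and structured payload could be derived together. The field names and the two example strings come from the entry above; `describeConversion` is a hypothetical helper, and treating pages beyond the 50-page cap as truncated (rather than rejected) is an assumption inferred from the `truncated` field.

```typescript
const MAX_PAGES = 50; // the 50-page cap from D296

type RasterConvertLineage = {
  sourceMime: string;
  sourcePages: number;
  convertedPages: number;
  wrappedAs: "png" | "pdf";
  truncated: boolean;
};

// Hypothetical helper: derive the extractor string and structured payload.
function describeConversion(sourceMime: string, sourcePages: number) {
  // Single-page sources become a PNG; multi-page sources are wrapped into a PDF.
  const wrappedAs = sourcePages > 1 ? "pdf" : "png";
  const convertedPages = Math.min(sourcePages, MAX_PAGES);
  const rasterConvert: RasterConvertLineage = {
    sourceMime,
    sourcePages,
    convertedPages,
    wrappedAs,
    truncated: sourcePages > MAX_PAGES, // assumption: extra pages are dropped, not rejected
  };
  return {
    extractor: `raster-convert:${sourceMime}->${wrappedAs}->claude`,
    structured: { rasterConvert },
  };
}
```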
- Roadmap
Raw-camera formats (RAW / NEF / CR2 / DNG) — deferred
Per-format quirks (NEF vs CR2 vs ARW vs DNG bit-depth + color-space variation) live in sharp's pipeline:dcraw build which is not enabled by default. Defer until a customer actually uploads one and asks. Most legal + AEC + accounting workloads don't see raw-camera bytes.
v0.7.84Stale-proposal expiry cron + rule-matched collection inheritance (D295)- Shipped
metadata-suggestion-expiry-weekly cron flips stale low-confidence proposals to 'expired'
Auto-classify proposes sensitivity / collection / keywords / doc-type / retention-class on every uploaded doc. When confidence is below 0.5 and the operator hasn't acted in 30 days, the proposal sits in the review queue forever and drowns the high-confidence proposals that actually need attention. New cron at Sundays 07:00 UTC runs a single atomic UPDATE that flips proposed → expired for stale low-confidence rows. Migration 0099 adds 'expired' to metadata_suggestion_status. Rows are NOT deleted — provenance preserved as audit-trail evidence. Re-classifications on a new version of the doc upsert on (documentId, kind) and clobber the expired row, so storage doesn't grow indefinitely.
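The expiry rule (proposed, confidence below 0.5, untouched for 30 days) can be sketched as a pure function over rows. The production path is a single SQL UPDATE; the `Suggestion` shape, the use of `createdAt` as the staleness clock, and the function name are assumptions for illustration.

```typescript
// Hypothetical row shape; the real table is metadata_suggestions.
type Suggestion = {
  status: "proposed" | "accepted" | "expired";
  confidence: number;
  createdAt: Date;
};

const LOW_CONFIDENCE = 0.5;
const STALE_AFTER_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Flips stale low-confidence proposals to 'expired'; every other row is returned unchanged.
function expireStaleSuggestions(rows: Suggestion[], now: Date): Suggestion[] {
  return rows.map((row): Suggestion => {
    const isStale = now.getTime() - row.createdAt.getTime() >= STALE_AFTER_MS;
    const shouldExpire =
      row.status === "proposed" && row.confidence < LOW_CONFIDENCE && isStale;
    // Rows are updated, never deleted, so provenance is preserved.
    return shouldExpire ? { ...row, status: "expired" } : row;
  });
}
```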
- Shipped
applyCollectionInheritance gains includeRuleMatched flag — backfill rule-matched docs in addition to pinned
D294 shipped collection inheritance for pinned members only — a "matter X" rule that matches docs by docType + project ref couldn't apply the matter's sensitivity / retention to rule-matched docs that weren't explicitly pinned. New includeRuleMatched: boolean (default false) on applyCollectionInheritance walks the UNION of pinned members AND every live tenant doc that the collection's rule predicate matches via the existing compileCollectionRule helper. Same idempotent highest-tier-wins / no-override semantics as the pinned-only path. Rule-matched scan capped at 5,000 IDs per call; pagination via cursor for collections with rule matches beyond that.
- Roadmap
Lowest-wins / strict-equality inheritance modes deliberately deferred
Deferred because no customer has asked for these modes and the footgun is real. Lowest-wins would silently DEMOTE sensitivity (a regulated PII doc dropped into a confidential matter loses its label). Every incumbent DMS that ships highest-tier-wins-only does so because the SOC 2 narrative can't defend silent demotion. We'd add these modes only behind an acknowledgeFootgun: true flag, and only if a customer's contract demanded them.
v0.7.83Collection-driven metadata inheritance + bulk JSONB metadata tool + bulk-tool description tweaks (D294)- Shipped
collections.default_sensitivity_label + default_retention_class_id — opt-in per-collection inheritance
Migration 0098 adds two opt-in nullable columns. Sensitivity inheritance is highest-tier-wins (escalates lower-tier members on collection-add, never demotes — a doc moved into a regulated matter becomes regulated; a doc that's already restricted moved into a confidential folder stays restricted). Retention inheritance is no-override (applies only when the member has no retention class yet — never overwrites an existing assignment because disposal cost compounds). Both writes commit/rollback atomically with the membership row insert so the audit chain stays consistent.
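The two inheritance rules reduce to two small pure functions. The tier names `confidential`, `restricted`, and `regulated` appear in this changelog; the lower tiers and the exact ordering (in particular `restricted` below `regulated`) are assumptions made for this sketch, as are the function names.

```typescript
// Assumed tier order, lowest to highest. Only the last three names are confirmed
// by the changelog; the rest of the ordering is illustrative.
const TIERS = ["public", "internal", "confidential", "restricted", "regulated"] as const;
type Tier = (typeof TIERS)[number];

// Highest-tier-wins: escalate to the collection default, never demote.
function inheritSensitivity(member: Tier, collectionDefault: Tier | null): Tier {
  if (collectionDefault === null) return member; // inheritance not opted in
  return TIERS.indexOf(collectionDefault) > TIERS.indexOf(member) ? collectionDefault : member;
}

// No-override: the default applies only when the member has no retention class yet.
function inheritRetention(
  memberClassId: string | null,
  collectionDefaultId: string | null,
): string | null {
  return memberClassId ?? collectionDefaultId;
}
```

Both functions are idempotent: applying them a second time to their own output changes nothing, which is what makes re-running a backfill safe.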
- Shipped
Three new MCP tools: setCollectionInheritance, applyCollectionInheritance, applyRetentionRuleToMatchingDocs
setCollectionInheritance configures the defaults (tenant owner / admin only) with optional applyToExisting=true for same-call backfill on small collections. applyCollectionInheritance is the paginated backfill for large collections — re-run with cursor until nextCursor === null. applyRetentionRuleToMatchingDocs backfills a retention auto-apply rule over existing docs whose proposed-or-accepted doc-type matches the rule's pattern; defaults to dryRun=true for preview-first UX (proposals land in metadata_suggestions for human review, mirroring the normal first-ingest path).
- Shipped
bulkSetDocumentMetadata MCP tool — bulk-patch metadata jsonb across a collection / saved-search / uncollected source
Closes the freeform-metadata gap in the existing bulk-tool family. Patches matter number, client code, parties, custom keys across up to 500 docs per call (paginated, hold-deny-wins where applicable). Idempotent on already-matched values. Rejects sensitivityLabel keys at the top of the handler — sensitivity has its own deny-wins gate that lives in bulkSetDocumentSensitivity; routing it through the generic patcher would bypass that gate.
- Improved
Bulk-tool descriptions lead with explicit "BULK / MANY-DOC TOOL — use this whenever the user asks to apply X to many documents" framing
Roy's testing surfaced an agent-discoverability problem: when asked to set metadata across all docs in a matter, the agent reached for single-doc tools instead of the bulk equivalents. Tool descriptions now lead with explicit phrasings ("set every doc in HR Records to restricted", "apply 7-year retention to all contracts") plus a "do NOT call setX in a loop" anti-pattern callout. Pure-string change, zero risk; should materially improve agent tool selection on collection-wide asks.
- Shipped
New collection.inheritance-set audit event captures every config change
Payload: { previousSensitivityLabel, nextSensitivityLabel, previousRetentionClassId, nextRetentionClassId, applyToExisting }. Emitted on the collection/<id> stream so the operator-facing audit timeline shows when inheritance changed + by whom. Committed atomically with the column update.
v0.7.82Inngest cost-optimization sweep — webhook fanout source-gate, cron slowdowns, step collapsing, per-tenant tracking (D293)- Improved
event/appended fanout suppressed at the source for tenants with no active consumer
apps/web/lib/event-store.ts now checks (with a 60-second per-tenant cache) whether the tenant has any active webhook subscription / automation / saved-search alert / citation alert before firing event/appended. When none exist, the fanout is suppressed entirely; an always-fire allowlist preserves legal-hold.applied + share-link.accessed for their type-filtered listeners. For trial / pilot tenants with zero subscriptions, this skips ~6 function executions per events.append call, which was the dominant Inngest billing line on idle tenants.
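A minimal sketch of a source gate with a 60-second per-tenant cache and an always-fire allowlist. `createFanoutGate` is a hypothetical name, the consumer check is synchronous here for brevity (the real check presumably queries the database), and the clock is injected so the TTL is testable.

```typescript
const CACHE_TTL_MS = 60_000; // 60-second per-tenant cache
const ALWAYS_FIRE = new Set(["legal-hold.applied", "share-link.accessed"]);

type CacheEntry = { hasConsumer: boolean; expiresAt: number };

// Hypothetical factory: `hasActiveConsumer` stands in for the subscription lookup.
function createFanoutGate(
  hasActiveConsumer: (tenantId: string) => boolean,
  now: () => number = Date.now,
) {
  const cache = new Map<string, CacheEntry>();

  return function shouldFire(tenantId: string, eventType: string): boolean {
    // Allowlisted event types bypass the gate entirely.
    if (ALWAYS_FIRE.has(eventType)) return true;

    const t = now();
    const hit = cache.get(tenantId);
    if (hit && hit.expiresAt > t) return hit.hasConsumer;

    // Cache miss or expired entry: re-check and remember the answer for 60 seconds.
    const hasConsumer = hasActiveConsumer(tenantId);
    cache.set(tenantId, { hasConsumer, expiresAt: t + CACHE_TTL_MS });
    return hasConsumer;
  };
}
```

One trade-off of this shape is that a tenant who creates their first subscription may wait up to the cache TTL before fanout resumes.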
- Improved
anomaly-sweep cron */15 → hourly + cedar-divergence-cron 0 * → 0 */4
anomaly-sweep's default scan window is 60 minutes, so the prior every-15-minutes cadence scanned each minute of activity roughly 4 times. Hourly preserves coverage exactly while cutting cron volume by 75%. cedar-divergence's typical divergence rate is ~0% (the whole point of shadow-mode); every-4h with a widened 4h replay window keeps coverage gap-free at lower cadence. Both crons gained early-return guards when tenantList is empty so zero-tenant ticks no longer burn step iterations.
- Improved
step.run collapsing in the doc cascade — extract-document and auto-classify
extract-document merges enqueue-embed + enqueue-auto-classify into one batched enqueue-followups step (-1 step per success). auto-classify collapses six per-vertical type-cascade enqueue-* steps (invoice/receipt/RFI/submittal/change-order/inspection) into one enqueue-vertical-cascades with a single batched inngest.send([…]) (-up to 5 steps per classified doc). Regex narrowness for vertical matching preserved exactly; replay granularity preserved on the load-bearing steps (mark-running, fetch-blob, run-extractor, persist-and-emit).
- Shipped
workflow.invocations cost_event_kind — per-tenant Inngest invocation attribution
Migration 0097 adds the new enum value. trackWorkflowInvocations helper in apps/web/lib/cost-tracker.ts priced at the Pro included-quota rate ($75 / 1M = 7,500 microcents per invocation). Instrumented at the two highest-volume entry points: event-store.ts post-gate fire + apps/web/app/actions/upload.ts per-doc cascade root. Operators see per-tenant Inngest spend on /costs alongside Anthropic / OpenAI / R2 — single coherent "who's expensive this month" answer.
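The per-invocation price works out as follows, assuming a microcent is one millionth of a cent (which is what makes the 7,500 figure above come out). The constant and function names are illustrative, not the ones in cost-tracker.ts.

```typescript
// Assumption: 1 microcent = one millionth of a cent, so $1 = 100,000,000 microcents.
const MICROCENTS_PER_DOLLAR = 100 * 1_000_000;
const PRO_RATE_DOLLARS_PER_MILLION = 75; // $75 per 1M invocations

// $75 / 1M invocations = 7,500 microcents per invocation.
const MICROCENTS_PER_INVOCATION =
  (PRO_RATE_DOLLARS_PER_MILLION * MICROCENTS_PER_DOLLAR) / 1_000_000;

function invocationCostMicrocents(invocations: number): number {
  return invocations * MICROCENTS_PER_INVOCATION;
}

function microcentsToDollars(microcents: number): number {
  return microcents / MICROCENTS_PER_DOLLAR;
}
```

Storing integer microcents keeps per-invocation amounts exact even though each one is a tiny fraction of a cent.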
- Roadmap
Hybrid migration off Inngest deferred — trigger is monthly spend > $400-500
Building durable step memoization + per-key concurrency + retry/backoff + replay UI in-house is 2-4 weeks of eng investment. The four optimizations above should drop the bill 50-70%, leaving comfortable headroom under the 1M/month included quota for the next 5-10 small / medium customers. Hybrid migration (keep Inngest for document cascade, move crons + simple sends to Vercel Cron + Postgres job table) becomes the right next step IF Inngest costs cross ~$400-500/mo.
v0.7.81pgvector partial HNSW + recency filter on search (D291) — closes the within-Postgres half of §4- Shipped
Partial HNSW index over the recent-chunks working set
Migration 0096 adds document_chunks_embedding_recent_idx — a partial HNSW index WHERE created_at >= '2025-05-02'::date (one year before D291 ship date) covering the recent-chunks working set. Static cutoff required because Postgres partial indexes only accept IMMUTABLE expressions; operator runbook at docs/runbooks/pgvector-partial-index.md documents the quarterly rebuild via CREATE INDEX CONCURRENTLY + atomic rename. The global HNSW index stays in place to cover unfiltered queries.
- Shipped
recencyWindowDays filter on semanticSearch + hybridSearch
Optional input narrows the WHERE clause to chunks within the last N days, letting Postgres pick the partial HNSW index. Compose with sensitivityLabels (D285) for "confidential docs from last quarter" — both filters narrow the semantic leg. Most workloads are recent-content (matter / project queries, "what changed?", compliance-current); unfiltered path stays available for full-history retrieval.
- Roadmap
Native Postgres partitioning + external vector DB — deferred per §4 plan
Full per-tenant + per-quarter table partitioning (~10-15d) + pivot to Pinecone / Qdrant / Weaviate (parallel pipeline at $2-5k/mo for 1B vectors) stay deferred until a real prospect signs an LOI conditioned on > 50M doc semantic search. The runbook documents both escalation paths so the choice tree is one place to look when needed.
v0.7.80Per-tenant search latency P95 surface on /admin/queue-depth (D290)- Shipped
query_latency_samples table + 10%-sampled instrumentation on the search tools
Closes §3 #3 from the enterprise-volume plan. Migration 0095 adds a query_latency_samples table; new withQueryLatency wrapper around searchKeyword / hybridSearch / semanticSearch records latency samples at a 10% probabilistic rate (configurable via QUERY_LATENCY_SAMPLE_RATE env). Fire-and-forget — instrumentation never blocks the query return.
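A sketch of a sampling wrapper in the spirit of withQueryLatency. The signature, the `LatencySample` shape, and the injected `record` / `random` parameters are assumptions for illustration; only the 10% default rate and the fire-and-forget behaviour come from the entry above.

```typescript
type LatencySample = { queryKind: string; latencyMs: number };

// Hypothetical signature: wraps an async search function and records a latency
// sample for roughly `sampleRate` of calls without delaying the result.
function withQueryLatency<A extends unknown[], R>(
  queryKind: string,
  fn: (...args: A) => Promise<R>,
  record: (sample: LatencySample) => Promise<void>,
  sampleRate = 0.1,
  random: () => number = Math.random,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const started = Date.now();
    const result = await fn(...args);
    if (random() < sampleRate) {
      // Fire-and-forget: not awaited, and recorder failures are swallowed
      // so instrumentation can never fail or block the query.
      void record({ queryKind, latencyMs: Date.now() - started }).catch(() => {});
    }
    return result;
  };
}
```

In this sketch only successful queries are sampled; whether the real wrapper also samples failures is not stated.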
- Shipped
Search latency · last 7 days · sampled section on /admin/queue-depth
Per-(query_kind) P95 latency (Postgres percentile_cont) over a 7-day window with amber/red banding at 500ms / 2000ms. Sustained red P95 means search is over the comfort zone for your tenant volume — consider tier upgrade or narrow queries with sensitivity / MIME filters (D285). Renders only when at least one sample exists.
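For intuition, the P95 and banding can be reproduced in a few lines. `percentileCont` mirrors the linear-interpolation behaviour of Postgres percentile_cont; the band names and the inclusive treatment of the 500ms / 2000ms thresholds are assumptions.

```typescript
// Continuous percentile with linear interpolation between adjacent sorted
// samples, mirroring how Postgres percentile_cont behaves.
function percentileCont(samples: number[], p: number): number | null {
  if (samples.length === 0) return null; // the section renders only when samples exist
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo);
}

type Band = "green" | "amber" | "red";

// Assumption: thresholds are inclusive (500ms is already amber, 2000ms already red).
function latencyBand(p95Ms: number): Band {
  if (p95Ms >= 2000) return "red";
  if (p95Ms >= 500) return "amber";
  return "green";
}
```

For the samples 100, 200, 300, 400, 500 ms the P95 lands at about 480 ms, just inside the green band.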
- Roadmap
Per-tenant rate enforcement — deliberately deferred
The harder version of §3 (a per-tenant query-rate hard cap) would block legitimate use cases like auditor evidence packets and retention sweeps. The soft P95 surface is a nudge, not a gate. Per-tenant config overrides land if a customer disputes the static bands; sample-table prune cron lands if growth becomes a real concern.
v0.7.79/audit "Verify chain integrity" button — per-partition results inline (D289)- Shipped
verifyAuditChainAction returns per-partition results
Mirrors the D288 cron — calls verifyAuditChainPartition per partition and returns a per-partition result list instead of one whole-chain pass/fail. Operators clicking "Verify chain" on /audit see WHICH partition broke if anything fails, instead of a single ambiguous "mismatch detected."
- Improved
Per-partition 50k cap replaces the legacy global 50k cap
A 200k-event tenant spread across 8 partitions of 25k each verifies cleanly without truncation. When a single partition exceeds 50k events, that partition's row shows truncated: true so operators see the cap explicitly. The weekly cron walks every partition without a cap so full-tenant verification stays preserved on the cron path.
- Improved
AuditVerifyButton UI surfaces per-partition status list
Compact list with green/red badge per row. Failed partitions get a separate red panel with first-mismatch detail. Empty-tenant case still renders a heartbeat '(pre-partition)' row so a fresh tenant gets explicit "0 events, all clean" instead of looking like a broken button.
v0.7.78Audit chain verification cron walks per-partition (D288)- Shipped
audit-chain-verify-weekly cron decomposed by partition
Companion to D287. The weekly Sunday 02:00 UTC verify cron now walks per-partition via verifyAuditChainPartition instead of the whole-chain verifyAuditChain. Per (tenant, partition) it emits one audit.verification.completed event with partitionKey in the payload — operators investigating a failure can pinpoint WHICH partition broke instead of "the whole chain failed somewhere."
- Improved
Empty-tenant verification heartbeat
Tenants with no events still emit a single empty verification event (partitionKey: '(pre-partition)') so the per-tenant verification stream stays continuous — auditors checking "does Kodori verify the chain weekly for every tenant?" see entries even for tenants whose chain happens to be empty.
- Roadmap
Cursor-paginated cron + /audit partition-aware UI — deferred
cron_checkpoints cursor pagination (D282 primitive) activates as a follow-on if a future tenant's partition count exceeds what one weekly run can handle. On-demand /audit "Verify chain" button gaining "last verified: <partition>" status is the natural next web shipment.
v0.7.77Audit chain partitioning — chain-of-chains for 100M-doc-tenant scale (D287)- Shipped
chain_partition_key on events + per-partition verifier
Closes the schema + append + verifier sub-items of §5 from the enterprise-volume plan. Migration 0094 adds a NULLABLE chain_partition_key text column on events (no backfill; soft cutover by design). runAppend looks up prev_hash in the current partition first, then falls back to any-partition for the cutover + quarter-rollover paths. The pure function verifyChainRows gains an allowNonNullGenesis option for chain-of-chains semantics. verifyAuditChainPartition + listAuditChainPartitions are newly exported.
- Shipped
quarterKey + previousQuarterKey pure helpers
New packages/events/src/partition.ts exports UTC-anchored YYYY-Q<n> derivation. UTC keeps partition boundaries deterministic across operator zones; previousQuarterKey returns null on malformed input rather than throwing so the future cron cursor can reset on corruption. 12 vitest fixtures cover the boundaries.
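A minimal sketch of the two helpers as described: UTC-anchored YYYY-Q<n> derivation, and a previous-quarter function that returns null on malformed input. The exact signatures in packages/events/src/partition.ts are not shown here, so these are assumptions.

```typescript
// Derive the partition key from the UTC calendar quarter of the given instant.
function quarterKey(date: Date): string {
  const quarter = Math.floor(date.getUTCMonth() / 3) + 1; // months 0-2 -> Q1, and so on
  return `${date.getUTCFullYear()}-Q${quarter}`;
}

// Step back one quarter. Returns null (never throws) on malformed input so a
// caller such as a cron cursor can reset instead of crashing.
function previousQuarterKey(key: string): string | null {
  const match = /^(\d{4})-Q([1-4])$/.exec(key);
  if (!match) return null;
  const year = Number(match[1]);
  const quarter = Number(match[2]);
  return quarter === 1 ? `${year - 1}-Q4` : `${year}-Q${quarter - 1}`;
}
```

Using the UTC getters means an event appended at 23:59:59 UTC on 31 December lands in Q4 regardless of the operator's local zone.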
- Roadmap
Cron refactor + /audit button update — deferred follow-on
D287 ships the load-bearing append + verify primitives in isolation. The audit-chain-verify-weekly cron walking one partition per run via cron_checkpoints (D282 primitive) + the on-demand /audit "Verify chain" button surfacing "last full verification: <partition>" status are the natural next shipment.
v0.7.76/search stricter relevance defaults — inline tip at 1M+ tenant doc count (D286)- Shipped
Non-blocking tip when an unfiltered query lands on a 1M+ tenant
Fires only when a query is set AND no sensitivity / MIME filter is active AND the tenant has crossed the 1M live-doc threshold. Inline (not modal), explicitly framed as a suggestion not a block. Operators sometimes legitimately scan unfiltered (auditor evidence packets, retention sweeps); the tip surfaces the partial-GIN-index speedup (D285) without changing query semantics.
- Shipped
Bounded LIMIT 1_000_001 probe for tenant size detection
Detect "tenant has > 1M live docs" via a LIMIT 1_000_001 index probe at page-load. At MVP scale near-free; at 100M-doc-tenant scale the LIMIT caps the scan at 1M+1 index entries — bounded cost regardless of true tenant volume. Probe only runs in the warning-eligible state (query set + no filters active) — at-rest /search visits skip it entirely.
- Roadmap
Per-tenant query budget — last §3 sub-item still open
Track P95 query latency per tenant; surface "your search index is over the comfort zone, consider tier upgrade" on /admin/queue-depth (D280) or /costs (D281). ~2 days. Ships when a customer signal makes it important.
v0.7.75FTS partial GIN indexes + sensitivity filter on search (D285)- Shipped
Two partial GIN indexes shrink the FTS hot path at 100M-doc-tenant scale
Migration 0093 adds doc_objects_fts_live_idx (partial WHERE status='live') — automatically picked by every search since every query already filters live; AND doc_objects_fts_high_sensitivity_idx (partial WHERE sensitivity_label IN ('confidential','restricted','regulated')) — ~1/10th the size, ~10× faster for compliance queries. The original doc_objects_fts_idx stays in place to cover tombstoned-inclusive + unfiltered searches.
- Shipped
sensitivityLabels filter on searchKeyword + hybridSearch
Optional input narrows the WHERE clause to the listed tiers, letting Postgres pick the new partial GIN index. Threaded through hybridSearch by passing into the keyword leg AND post-filtering the semantic leg using the metadata hybridSearch already loads. Compliance operators routinely narrow to high-sensitivity tiers; the agent's default search tool now honors that narrowing.
- Roadmap
Stricter relevance defaults + per-tenant query budget — deferred from §3
The other two §3 sub-items from the enterprise-volume plan stay open. Stricter relevance defaults (1d): inline warning when a tenant's doc count crosses 1M AND the query has no filter dimensions. Per-tenant query budget (2d): track P95 query latency per tenant + surface "your search index is over the comfort zone" on /admin/queue-depth (D280) or /costs (D281). Both ship when a customer signal makes them important.
v0.7.74§2 index audit — two targeted index additions for 100M-doc-tenant hot paths (D284)- Shipped
collection_members reverse-lookup index
New (document_id, collection_id) index closes the deny-wins gate's reverse-lookup join inside canReadDocument. The PRIMARY KEY (collection_id, document_id) covers forward queries; the new index covers the reverse direction. At 100M docs × 2-3 collections each = 200-300M rows, this is the difference between O(log n) and O(n) per permission-trim.
- Shipped
events latest-of-type index for /admin/cron-status + cedar-divergence
New (tenant_id, type, created_at) index makes "most recent event of type T for tenant X" an O(log n) lookup. The existing (tenant_id, type) index required a post-scan sort by created_at; this matters for /admin/cron-status (D278) which fires this query for every cron-emitted event type per visit + the cedar-divergence-cron (D250) per-type recent-window scans.
- Improved
Five tables explicitly audited as needing no change
permissions, document_versions, document_chunks, cost_events, document_objects all reviewed with their query patterns + existing indexes documented in stack_decisions D284. The permissions table existing (tenant_id, resource_pattern) covers both per-doc point lookups + the LIKE 'collection/%' prefix scan efficiently — no additional indexes needed.
v0.7.73Bulk MCP tools cursor-paginated for 100M-doc-tenant scale (D283)- Shipped
cursor + nextCursor on the four bulk MCP tools
Closes the §7 follow-on from docs/plans/enterprise-volume-100m-plan.md. The 2,000-cap-per-call safety floor stays unchanged; cursor pagination lets the UI / agent loop calls until nextCursor === null to drain a source set larger than the cap. Affected: bulkAddDocumentsToLegalHold, bulkAddDocumentsToCollection, bulkSetDocumentRetentionClass, bulkSetDocumentSensitivity.
- Shipped
Opaque base64url cursor with total decoder
Cursor format: base64url-encoded JSON {lastDocumentId: string} with a malformed-input-tolerant decoder (decodes to null instead of throwing — an agent that hands back a corrupted string restarts at row 0). Source queries gained explicit ORDER BY documentId ASC for stable total ordering. Per-doc audit semantics unchanged — the existing one-event-per-changed-doc pattern naturally batches across calls.
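A sketch of the cursor codec with a total (never-throwing) decoder. The `{lastDocumentId}` payload and base64url encoding come from the entry above; the function names are hypothetical, and the use of Node's Buffer assumes a Node runtime.

```typescript
type BulkCursor = { lastDocumentId: string };

function encodeCursor(cursor: BulkCursor): string {
  // Assumes a Node runtime, where Buffer supports the "base64url" encoding.
  return Buffer.from(JSON.stringify(cursor), "utf8").toString("base64url");
}

// Total decoder: any malformed input yields null, which callers treat as
// "start from row 0", instead of throwing.
function decodeCursor(raw: string | null | undefined): BulkCursor | null {
  if (!raw) return null;
  try {
    const parsed: unknown = JSON.parse(Buffer.from(raw, "base64url").toString("utf8"));
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { lastDocumentId?: unknown }).lastDocumentId === "string"
    ) {
      return { lastDocumentId: (parsed as BulkCursor).lastDocumentId };
    }
    return null; // valid JSON, wrong shape
  } catch {
    return null; // not valid JSON after decoding
  }
}
```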
- Improved
saved-search source naturally bounded — cursor not applicable
Cursor support applies to collection + uncollected sources. saved-search source ignores cursor input (runSavedSearch caps at 50 hits per call internally; bulk per-call limit drains it in one call). nextCursor is always null for saved-search so an agent loop terminates correctly after one call.
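The caller-side loop, calling until nextCursor === null, can be sketched against an in-memory stand-in. `makeFakeBulkTool` is purely hypothetical and uses the last document ID directly as the cursor; the real tools take an opaque cursor and are invoked over MCP. The sketch assumes a per-call limit of at least 1.

```typescript
type BulkPage = { changed: string[]; nextCursor: string | null };
type BulkTool = (args: { cursor: string | null; limit: number }) => BulkPage;

// Hypothetical in-memory tool: pages through IDs in ascending order and
// reports nextCursor only when more rows remain beyond this page.
function makeFakeBulkTool(documentIds: string[]): BulkTool {
  const sorted = [...documentIds].sort();
  return ({ cursor, limit }) => {
    const rest = cursor === null ? sorted : sorted.filter((id) => id > cursor);
    const page = rest.slice(0, limit);
    const nextCursor = rest.length > limit ? page[page.length - 1] : null;
    return { changed: page, nextCursor };
  };
}

// Drain a source set larger than the per-call cap by looping until nextCursor is null.
function drain(tool: BulkTool, limit: number): string[] {
  const all: string[] = [];
  let cursor: string | null = null;
  do {
    const page: BulkPage = tool({ cursor, limit });
    all.push(...page.changed);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return all;
}
```

A source that always returns nextCursor: null, as saved-search does, terminates this loop after exactly one call.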
v0.7.72Auto-delete sweep cursor-paginated for 100M-doc-tenant scale (D282)- Shipped
document-auto-delete-sweep cursor-paginated via cron_checkpoints
Closes the §8 follow-on from docs/plans/enterprise-volume-100m-plan.md. The auto-delete cron previously walked the eligible set capped at 200 rows per run with no resume state; at 100M-doc-tenant volumes a backlog larger than the daily cap would never drain. New (autoDeleteAt, id) ordered scan with a cursor resume clause means consecutive daily runs incrementally drain a backlog instead of always restarting at row 0. Per-run cap bumped 200 → 2000.
- Shipped
Generic cron_checkpoints table (migration 0091)
New cron_id text PRIMARY KEY + freeform cursor_data jsonb table — generic for any cursor-paged cron. First user is document-auto-delete-sweep; future users include the §5 audit-chain partition verifier + the retention disposal scan. Cursor shape is per-cron, no table-versioning headaches.
- Shipped
Pure-function cursor-state transition + 6 vitest fixtures
nextAutoDeleteCursorState decides save-vs-reset based on batch size: saves cursor when batch fills the cap (more work likely remains), resets when batch comes up short (drained this pass; next run starts fresh to catch any past-set autoDeleteAt). Cursor advances on every processed row including hold-blocked + failed, so a stuck row never wedges progress. Test count up from 79 → 85.
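The save-vs-reset decision can be sketched as a pure function. The function name comes from the entry above, but this signature and the row / cursor shapes are assumptions; null stands for "reset, start fresh next run".

```typescript
// Hypothetical shapes: the scan is ordered by (autoDeleteAt, id).
type ProcessedRow = { autoDeleteAt: string; id: string };
type AutoDeleteCursor = { autoDeleteAt: string; id: string };

// Returns the cursor to persist, or null to reset.
function nextAutoDeleteCursorState(
  batch: ProcessedRow[],
  cap: number,
): AutoDeleteCursor | null {
  // Short batch: the eligible set was drained this pass, so the next run
  // restarts at row 0 and can pick up any autoDeleteAt values set in the past.
  if (batch.length === 0 || batch.length < cap) return null;

  // Full batch: more work likely remains. Resume after the last processed row,
  // whether it was deleted, hold-blocked, or failed, so a stuck row cannot wedge progress.
  const last = batch[batch.length - 1];
  return { autoDeleteAt: last.autoDeleteAt, id: last.id };
}
```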
v0.7.71Cost dashboard scales to 100M-doc-tenants — daily rollup table + cron (D281)- Shipped
cost_events_daily_rollup table + cost-events-rollup-daily cron
Closes the §9 follow-on from docs/plans/enterprise-volume-100m-plan.md. At 100M-doc-tenant volumes the cost_events table accumulates millions of rows per day; /costs page-loads were scanning ever-growing slices on every aggregation. New rollup table keyed on (tenant_id, day, kind) is filled by a daily 06:00 UTC Inngest cron via INSERT ... ON CONFLICT DO UPDATE — idempotent on retry. Single-SQL-pass cross-tenant; the cron processes every tenant in one run.
- Improved
/costs page-load cost is bounded regardless of underlying volume
Aggregations (30-day totals, by-kind, 14-day trend) now read from the rollup for completed UTC days and from raw cost_events for today only. The merge happens in JS over O(K) cost-kinds (~9 kinds today). Top-cost detail rows continue to read raw events — the rollup intentionally loses per-event identity to keep storage cheap. Page-loads remain sub-second at any tenant size.
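The rollup-plus-today merge is a small fold over cost kinds. The row shapes, field names, and `mergeCostTotals` are hypothetical; only the split (rollup for completed UTC days, raw events for today) comes from the entry above.

```typescript
// Hypothetical row shapes for the daily rollup and for today's raw events.
type RollupRow = { day: string; kind: string; totalMicrocents: number };
type RawCostEvent = { kind: string; microcents: number };

// Merge completed-day rollup rows with today's raw events into per-kind totals.
// The result has one entry per cost kind, so its size is O(K) regardless of
// how many raw events the tenant generated.
function mergeCostTotals(
  rollupRows: RollupRow[],
  todayEvents: RawCostEvent[],
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rollupRows) {
    totals[row.kind] = (totals[row.kind] ?? 0) + row.totalMicrocents;
  }
  for (const event of todayEvents) {
    totals[event.kind] = (totals[event.kind] ?? 0) + event.microcents;
  }
  return totals;
}
```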
- Roadmap
90-day cost_events prune cron — deferred follow-on
D270's stripe-events-prune is the precedent for unbounded-table prune crons. Today's shipment makes /costs aggregation-cost bounded; raw-event growth is a separate concern that doesn't bite until cost_events itself becomes operationally problematic. Defer-trigger: cost_events row count or backup-size becomes problematic on its own.
v0.7.70/admin/queue-depth — per-tenant ingest-pipeline backpressure surface (D280)- Shipped
/admin/queue-depth page
Sister surface to D278's /admin/cron-status. Cron-status answers "did the cron run?"; queue-depth answers "is the work backing up?" Surfaces extraction + auto-classify pipelines per-tenant: pending / running / stuck / recently-failed counts plus last-hour throughput (documents created, extraction-requested, extraction-completed, classification-requested). Tone-coded health rollup at the top — worst-band-wins across pipelines.
- Shipped
Closes the §6 backpressure-surface follow-on from the enterprise-volume plan
docs/plans/enterprise-volume-100m-plan.md §6 had two halves: per-tenant concurrency keys (D279) and a backpressure surface. D280 closes the second half. Single SQL roundtrip per visit — six tenant-scoped count aggregates running in parallel. No external dependency on Inngest's run-state API; the data-state snapshot from document_content + last-hour event counts is operationally sufficient.
- Roadmap
Webhook / digest / alert-dispatcher queue depth — deferred
Those pipelines don't have document-table state to query and would require either an inngest_run_snapshot table written by a 1-min cron polling Inngest's API, or new completion-event types per pipeline. Deferred until customer signal — extraction + auto-classify cover the operational reality of "is ingest backing up?" today.
v0.7.69Per-tenant Inngest concurrency keys + cron single-flight retrofit (D279)- Shipped
Per-tenant concurrency keys on 17 event-driven Inngest functions
Every per-document / per-event Inngest function with a global concurrency cap now declares concurrency: { key: 'event.data.tenantId', limit: N } so the cap is scoped per-tenant rather than across the whole fleet. Affected: auto-classify, embed-document, extract-document, the 6 specialized AP/AEC extractors (ap-invoice, ap-receipt, change-order, inspection, rfi, submittal), webhook-deliver, webhook-fanout, digest-send, citation-alerts-dispatcher, saved-search-alerts-dispatcher, external-search-alerts-dispatcher, share-link-access-notifier, legal-hold-object-lock, automations-event-dispatcher. At 100M docs single-tenant, global caps meant one busy customer's bulk import would starve every other tenant's classifier / extractor / webhook delivery — per-tenant keys make the platform tenant-isolated under load.
- Shipped
Single-flight on five bare cron functions
Sister-fix to D270's stripe-events-prune cron-collision-class catch in v0.7.65. Five crons that previously declared no concurrency directive (automations-tick, api-key-expiration-sweep, document-auto-delete-sweep, annotation-stale-resolver, onboarding-drip-sweep) gained concurrency: { limit: 1 } to match the audit-chain-verify-weekly + cedar-divergence-cron + Object Lock verify (D267) + Stripe prune (D270) precedent. Every cron in the platform now declares concurrency explicitly — none silently rely on Inngest's no-cap default.
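The two concurrency shapes in this release can be sketched as plain config objects. This is an illustration only: `ConcurrencyOption` is a local type that mirrors the option shape quoted above rather than an import from the Inngest SDK, and the limit of 5 is an assumed value.

```typescript
// Local type mirroring the option shape quoted in the entries above.
// It is not imported from the Inngest SDK.
interface ConcurrencyOption {
  key?: string;
  limit: number;
}

// Per-tenant cap: the key expression groups runs by tenant, so the limit
// applies to each tenant separately instead of across the whole fleet.
function perTenant(limit: number): ConcurrencyOption {
  return { key: "event.data.tenantId", limit };
}

// Single-flight cron: no key and a limit of 1, so runs never overlap.
const singleFlight: ConcurrencyOption = { limit: 1 };

const extractDocumentConfig = {
  id: "extract-document",
  concurrency: perTenant(5), // 5 is an illustrative value
};

const automationsTickConfig = {
  id: "automations-tick",
  concurrency: singleFlight,
};
```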
- Shipped
Enterprise-volume 100M plan §6 workstream — concurrency half complete
docs/plans/enterprise-volume-100m-plan.md §6 (Inngest concurrency for 100M-doc scale) flips to ✓ for the per-tenant-keys half. Plan annotated against current code state + cross-referenced to D250 (Cedar divergence cron), D267 (Object Lock verify cron), D270 (Stripe prune cron), D278 (admin cron-status). The §6 backpressure-surface follow-on (per-tenant queue depth on /dashboard) remains pending. Concurrency-config-only retrofit — no migrations, no new event types — pnpm typecheck passes across all 11 packages.
v0.7.67 Knock-out pass — canvas Phase 5 + Cedar policy snapshot + tenant flip-flag + /case-studies + /admin/cron-status (D274-D278)
- Shipped
Canvas Phase 5 — per-node cascadeAdvanceTool opt-in (D274)
Per-human-approve-node opt-in flag (input.cascadeAdvanceTool: true). When approved, immediate tool-call children auto-advance via the existing canvas-runner.advanceCanvasNode. Preserves the operator-driven-default contract for nodes that don't opt in. Failures don't propagate.
- Shipped
Cedar divergence activePolicySnapshot (D275)
Divergence event payload now carries activePolicySnapshot: [{policyId, policyName, activatedAt}] capturing every active policy at observation time. Enables manual per-policy narrowing during /policies/observability investigation when N > 1 active policies exist. Cedar's authorizerInfo.determiningPolicies is unavailable on Deny results so programmatic attribution isn't possible.
- Shipped
Per-tenant cedar_authoritative + request-flip (D276)
New boolean column on tenants (default false). New requestCedarFlipAction emits tenant.cedar-flip-requested audit event but does NOT itself flip the column. KumoKodo security review is the load-bearing manual step. Migration 0089. New event type added.
- Shipped
/case-studies IA scaffolding (D277)
Empty-list pre-launch state with per-vertical placeholders + design-partner CTA. CASE_STUDIES registry shape ready; first add is a one-file-PR (slug + page). Honest pre-launch state cross-links to evidence-of-substrate at /security, /security/policies, /legal/*. /case-studies + /case-studies/* added to middleware PUBLIC_PATHS + PUBLIC_PREFIXES.
- Shipped
/admin/cron-status dashboard (D278)
Reads events table for most-recent-per-cron-event-type per tenant. 7-day count + last-payload preview. Owner / admin gated. Surfaces audit-chain-verify-weekly + cedar-divergence-observation + tenant-kms-rewrap. Crons that don't emit per-tenant events are not surfaced — future cron_runs table is the deferred follow-on.
- Fixed
Empty-state polish on /retention + /collections
Expanded one-line empty messages into 2-paragraph explanations with cross-links to /help. Builds on v0.7.66 polish for /canvas + /policies.
- Shipped
DNS subdomain runbook
New docs/runbooks/dns-and-subdomains.md documents the api.kodori.ai + status.kodori.ai deferral rationale + flip trigger conditions + cost analysis (Vercel-hosted vs Atlassian Statuspage vs Cloudflare Worker).
- Roadmap
Watch-folder sidecar CLI — explicit defer
Originally on the knock-out list but explicitly deferred to a future shipment — Node CLI for ongoing folder-watching ingest is more substantive than a follow-on can absorb (~1 day proper). Migration CLI (D256) covers the migration-time use case adequately.
v0.7.66 README cleanup + canvas/policies empty-state polish (D273)
- Fixed
README.md drift correction
"Phase 0 status" → "Status" with current shipment-log pointer (v0.7.65 / D272). Layout block expanded with packages/workflow + packages/migration + packages/sdk + packages/evals (previously omitted) + docs/security-policies + docs/distribution + docs/runbooks. "What's running today" enumeration replaces stale "tool handlers throw not implemented (Phase 0 stub)" prose. Open-items section calls out HIPAA + ISO customer-anchored + SOC 2 audit-pending status.
- Fixed
Empty-state polish on /canvas + /policies
Expanded the one-line "type a goal above" / "author your first one above" hints into 2-paragraph explanations. /canvas surfaces the LLM-cannot-cause-side-effects guarantee at the moment operators first land. /policies surfaces the shadow-mode posture + 30-day soak window + flip protocol with cross-link to /help/policies.
- Roadmap
Cedar-divergence prune cron — explicitly NOT shipped
Initial plan was to ship a prune cron parallel to D270 stripe-events-prune. Investigation showed events table is the immutable hash-chained audit log; deleting from it would break the chain. stripe_processed_events is a transient dedup table (D268) — categorically different lifecycle. Recorded as deferred-with-rationale.
v0.7.65 Stripe prune cron + migration CLI eval coverage + middleware bare-path gap + tool-count drift (D270-D272)
- Shipped
Stripe webhook prune cron (D270)
New stripeEventsPruneFunction Inngest cron Sundays 04:00 UTC (1 hour after Object Lock verify) deletes stripe_processed_events rows where processed_at < now() - 30 days. Companion to D268 webhook idempotency. Single-flight concurrency prevents overlapping prunes; count-before + count-after delta in the cron output for Inngest run-history visibility.
- Shipped
Migration CLI eval coverage — 24 new fixtures (D271)
packages/evals/src/migration-cli.test.ts locks in regression coverage for the CLI parser shipped at D256. Command dispatch (4 cases), flag parsing (7 cases), rejection cases (8 cases), edge cases (5 cases). Re-exported parseArgs + CliArgs from packages/migration/src/cli.ts via new ./cli exports entry. Pure-function — sub-millisecond, no DB / network / subprocess spawn. Total tests: 139 evals + 73 web = 212.
- Fixed
Middleware bare-path gap caught during Office surface review (D272)
Office add-in review caught that /office (bare path) was redirecting to /sign-in. middleware.ts had /office/ as a prefix but no bare-path entry. Next.js routing does not auto-canonicalize trailing slashes, so /office did not match. Same gap existed for /share, /invite, /legal-hold-ack, /legal. Added all 5 to PUBLIC_PATHS. Sister fix to D260 — same lesson, different shape.
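The gap is easy to reproduce with a small matcher. The sketch below assumes the middleware checks an exact-match list plus `startsWith` prefixes; `isPublic` and its parameters are hypothetical names, not the contents of middleware.ts.

```typescript
// Hypothetical matcher: exact paths first, then prefix matches.
function isPublic(
  pathname: string,
  paths: ReadonlySet<string>,
  prefixes: readonly string[],
): boolean {
  if (paths.has(pathname)) return true;
  return prefixes.some((prefix) => pathname.startsWith(prefix));
}

// Before the fix: only the "/office/" prefix is registered.
const before = isPublic("/office", new Set(["/"]), ["/office/"]);

// After the fix: the bare path is also listed as an exact match.
const after = isPublic("/office", new Set(["/", "/office"]), ["/office/"]);

console.log({ before, after });
```

`"/office".startsWith("/office/")` is false, so the bare path needs its own exact-match entry while nested routes keep matching through the prefix.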
- Fixed
Tool-count drift — "60+ typed tools" was understated by 100 (D272)
help-articles.ts smart-automation entry + /features smart-automation tile both said "60+ typed tools" when MCP_STATIC_TOOL_COUNT in packages/mcp/src/tools/index.ts is 159 (≈155+ rounded down for marketing). Drift since the count last grew. Updated both. The tool count was last formally tracked in feedback_scope_docs.md memory which auto-flagged the discrepancy on a per-session check.
v0.7.64 Object Lock verify cron + Stripe webhook idempotency + pilot runbook + help-articles drift correction (D266-D269)
- Shipped
Object Lock verify-and-extend weekly cron (D267)
Backstop for D254 event-driven Object Lock apply. Sundays 03:00 UTC (1 hour after audit-chain-verify). Walks every active-hold-bound blob, reads current Object Lock state via new getObjectLockStatus helper, re-applies via applyObjectLockRetention when retention drops below 90 years from now. New pure function shouldReapplyObjectLock exported for unit tests. Catches failure modes the happy path misses: Inngest fanout failed mid-hold, OBJECT_LOCK_ENABLED flipped on after holds existed, bucket-side change reset retention, holds approaching their retainUntilDate. Skips entirely when OBJECT_LOCK_ENABLED is false. Emits tenant-kms.rewrap-progress events with kind: object-lock-extend discriminator.
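As a rough sketch, the reapply decision could look like the pure function below. The signature is an assumption (the changelog names `shouldReapplyObjectLock` but does not show its parameters), and treating a missing lock as "reapply" is also assumed.

```typescript
const REAPPLY_THRESHOLD_YEARS = 90;

// Returns a copy of the date shifted forward by whole calendar years (UTC).
function addYears(date: Date, years: number): Date {
  const out = new Date(date.getTime());
  out.setUTCFullYear(out.getUTCFullYear() + years);
  return out;
}

// Hypothetical signature. retainUntil === null models a blob with no lock,
// which is assumed here to always need a reapply.
function shouldReapplyObjectLock(retainUntil: Date | null, now: Date): boolean {
  if (retainUntil === null) return true;
  return retainUntil.getTime() < addYears(now, REAPPLY_THRESHOLD_YEARS).getTime();
}
```

Keeping the decision free of I/O means it can be unit-tested with fixed dates, with the S3 read and write living in the caller.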
- Shipped
Stripe webhook idempotency hardening (D268)
New stripe_processed_events table with event_id text PRIMARY KEY + tenant_id + event_type + processed_at. The webhook route claims an event via INSERT ... ON CONFLICT DO NOTHING BEFORE running the handler — first delivery wins, retries observe the conflict and return 200 immediately. Stripe retries up to 3 days; without dedup the same event.id re-emits audit events + re-applies subscription state + clobbers newer updates. Migration 0088_stripe_processed_events.sql. Atomic claim via the unique-constraint-IS-the-dedup-mechanism — no in-memory cache, no audit-event lookup.
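The claim-before-handle flow can be sketched with an in-memory stand-in for the table's primary key. Everything here is illustrative: `InMemoryClaims` and `handleDelivery` are hypothetical names, the real claim is a database insert (and would be asynchronous), and the Set only imitates what the unique constraint does.

```typescript
interface StripeEvent {
  id: string;
  type: string;
  tenantId: string;
}

// Stand-in for the primary key on event_id. In the real route the claim is
//   INSERT INTO stripe_processed_events (...) VALUES (...) ON CONFLICT DO NOTHING
// and "claimed" means the insert affected one row.
class InMemoryClaims {
  private seen = new Set<string>();

  claim(event: StripeEvent): boolean {
    if (this.seen.has(event.id)) return false; // conflict: already processed
    this.seen.add(event.id);
    return true; // first delivery wins
  }
}

function handleDelivery(
  claims: InMemoryClaims,
  event: StripeEvent,
  handler: (event: StripeEvent) => void,
): { status: number; handled: boolean } {
  // Claim BEFORE running the handler so a retry never re-applies state.
  if (!claims.claim(event)) return { status: 200, handled: false };
  handler(event);
  return { status: 200, handled: true };
}
```

Both the first delivery and every retry return 200, so Stripe stops retrying, but the handler body runs once per event id.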
- Shipped
First-customer-pilot operator runbook (D266)
New docs/runbooks/design-partner-pilot.md documents the 8-week pilot motion split into per-week operator-side + customer-side tasks. Discovery agenda + post-call deliverables (Day 0). Tenant provisioning + Object Lock / BYO-KMS provisioning (Week 1). Connector probe + discovery preview + dry-run via the migration CLI (Weeks 2-3). Dual-write + auto-classification tuning + sensitivity confirmation (Weeks 4-6). Go/no-go decision with explicit success criteria + outcome A (expand) + outcome B (park) workflows (Weeks 6-8). Failure-mode protocols inline (Sev 1/2/3 incident-response). Caught + fixed a real bug during authoring: packages/migration/package.json was missing the bin entry + cli script — a prior Edit had silently failed. Added explicit Write of corrected file + tsx devDep. pnpm --filter @kumokodo/migration cli list now runs end-to-end.
- Fixed
Help articles drift correction — soc2-controls-mapping + canvas (D269)
soc2-controls-mapping article replaced "Live / Phase 1 / Phase 3 (audit)" status labels with "Live today / Roadmap / On audit engagement" matching the D260 cert-status revamp. Cross-references updated to point at the cert table + /security/policies. Canvas article added an "Auto-plan from goal" section (D264) and a "How cascade works" subsection (D263); expanded four node kinds to describe D261 advance + D263 cascade + D265 reduce activation with three reducer kinds; expanded audit-event list from 8 to 11 event types. Future shipments add D-decision triggers in stack_decisions that mention "update help article slug X" so drift correction becomes part of the ship checklist.
v0.7.63 Canvas Phase 4 reduce node + 29 eval fixtures + /demo 5th moment (D265)
- Shipped
Canvas Phase 4 reduce node — fan-in multi-parent activation (D265)
Reduce nodes declare their parents in input.parentNodeIds (1-20 named); the new maybeActivateReduces walker fires after every direct-children cascade pass. When EVERY named parent reaches done or skipped, reducer applies + reduce node marks done + recursive cascade fires for its own children. Three reducer kinds: concat (collect outputs in order, skipped parents contribute null), count (count of done parents + doneIds + skippedIds), first-truthy (first done parent with truthy output, using operator-mental-model truthy where empty arrays / objects / strings are FALSE). applyReducer is a pure function exported for unit tests. Missing-parent-id marks reduce failed with inline error. Skipped parents are valid terminal status — branch-deactivated paths don't block reduce completion.
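A sketch of the three reducer kinds under the semantics described above. The `ParentResult` shape and the return shapes are assumptions; only the reducer names, the null contribution from skipped parents, and the truthiness rule come from the entry.

```typescript
type ParentResult = { id: string; status: "done" | "skipped"; output: unknown };
type ReducerKind = "concat" | "count" | "first-truthy";

// Operator-mental-model truthiness: empty arrays, objects, and strings are false.
function isTruthy(value: unknown): boolean {
  if (Array.isArray(value)) return value.length > 0;
  if (typeof value === "string") return value.length > 0;
  if (value !== null && typeof value === "object") return Object.keys(value).length > 0;
  return Boolean(value);
}

function applyReducer(kind: ReducerKind, parents: ParentResult[]): unknown {
  switch (kind) {
    case "concat":
      // Outputs in declared parent order; skipped parents contribute null.
      return parents.map((p) => (p.status === "done" ? p.output : null));
    case "count": {
      const doneIds = parents.filter((p) => p.status === "done").map((p) => p.id);
      const skippedIds = parents.filter((p) => p.status === "skipped").map((p) => p.id);
      return { count: doneIds.length, doneIds, skippedIds };
    }
    case "first-truthy": {
      const hit = parents.find((p) => p.status === "done" && isTruthy(p.output));
      return hit ? hit.output : null;
    }
  }
}
```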
- Shipped
29 new vitest fixtures for canvas-runner
apps/web/test/canvas-runner.test.ts covers branch predicate evaluator (truthy / eq / neq / nested paths / "output" special case / path-unresolved errors), BranchPredicateSchema regression (5 reject cases), reducer dispatch (concat / count / first-truthy with skipped-parent + empty-result variations), ReduceInputSchema regression. Caught a real bug: evaluateBranchPredicate was using JavaScript Boolean(), which treats [] and {} as truthy — operators authoring "did we find any documents?" predicates expected falsy. Unified isTruthy helper between branch evaluator + reducer for consistent operator-mental-model semantics. Web test count 44 → 73.
- Shipped
/demo gains a 5th moment: "the agent plans, not just executes"
New section describing the canvas auto-plan flow end-to-end: type a goal → click Auto-plan → Claude proposes a tree of pending nodes (search + branch + human-approve + tool-calls per branch path) → every node is pending until Advance → branch evaluator + reduce activator + audit-stream-per-run. Specifically calls out that the LLM cannot cause a side effect, only suggest one. Closes the four-moments narrative's "agent operates the system" claim with a workflow surface to actually demonstrate.
v0.7.62 Canvas Phase 3 — cascade-after-advance + auto-plan from goal via Claude (D263-D264)
- Shipped
Canvas auto-plan from natural-language goal (D264)
New "Auto-plan from goal (Claude)" button on /canvas/[id] when the run is planning with zero nodes. Claude Opus 4.6 receives the goal + the MCP TOOLS name+description catalog and returns a structured plan (3-10 nodes typical, 20 max) covering tool-call + human-approve + branch nodes. Zod-validated before DB writes; cross-reference validation ensures every parentRefId / trueChildRefId / falseChildRefId resolves to a refId in the same plan. Plan inserts as ALL-pending — the LLM cannot cause a side effect, only suggest one. Operator reviews + clicks Advance (Phase 2) or Approve to execute. canvas-run.auto-planned event lands on the per-run audit-chain stream with nodeCount + planSummary. Three failure paths surface inline: llm-failed, validation-error, no-active-tenant.
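The cross-reference check can be sketched as a pure function that runs after schema validation. The `PlanNode` shape and `validatePlanRefs` are hypothetical; the field names and the 20-node cap come from the entry, and the duplicate-refId check is an added assumption.

```typescript
interface PlanNode {
  refId: string;
  kind: "tool-call" | "human-approve" | "branch";
  parentRefId?: string;
  trueChildRefId?: string;
  falseChildRefId?: string;
}

const MAX_NODES = 20;

// Returns a list of problems; an empty list means every reference resolves.
function validatePlanRefs(nodes: PlanNode[]): string[] {
  const errors: string[] = [];
  if (nodes.length > MAX_NODES) {
    errors.push(`plan has ${nodes.length} nodes; max is ${MAX_NODES}`);
  }

  const ids = new Set<string>();
  for (const node of nodes) {
    // Assumed rule: refIds must be unique within a plan.
    if (ids.has(node.refId)) errors.push(`duplicate refId: ${node.refId}`);
    ids.add(node.refId);
  }

  for (const node of nodes) {
    for (const field of ["parentRefId", "trueChildRefId", "falseChildRefId"] as const) {
      const ref = node[field];
      if (ref !== undefined && !ids.has(ref)) {
        errors.push(`${node.refId}.${field} points at unknown refId ${ref}`);
      }
    }
  }
  return errors;
}
```

Rejecting the whole plan on any dangling reference keeps partially linked trees out of the database.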
- Shipped
Canvas Phase 3 cascade-after-advance (D263)
After a tool-call node completes, cascadeAfterNodeCompleted finds direct children with pending status and: (a) flips human-approve children to waiting-on-human; (b) evaluates branch children against the parent output via the new evaluateBranchPredicate (3 operator kinds: truthy / eq / neq + dotted JSON path), activates matching subtree, recursively skips the other; (c) leaves tool-call children pending (Phase 2 operator-driven advance preserved); (d) leaves reduce children pending (Phase 4). Cascade also runs after decideCanvasNodeAction approves a human-approve gate. Branch failure marks the branch node failed with an inline error message. 3 new event types in @kumokodo/events: canvas-run.node-paused / .advanced / .auto-planned. Recursive skip walk handles deep subtrees. Idempotent — only pending children flip.
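A sketch of a branch predicate evaluator with the three operator kinds and dotted-path lookup. The `Predicate` shape, strict equality for eq / neq, and the error message are assumptions; the "output" special case and the empty-collection truthiness rule follow the v0.7.63 test-coverage notes.

```typescript
type Predicate =
  | { op: "truthy"; path: string }
  | { op: "eq" | "neq"; path: string; value: unknown };

// Empty arrays, objects, and strings count as false.
function isTruthy(value: unknown): boolean {
  if (Array.isArray(value)) return value.length > 0;
  if (typeof value === "string") return value.length > 0;
  if (value !== null && typeof value === "object") return Object.keys(value).length > 0;
  return Boolean(value);
}

// Resolve a dotted path against the parent output.
// The literal path "output" refers to the whole value.
function resolvePath(output: unknown, path: string): unknown {
  if (path === "output") return output;
  let current: unknown = output;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || !(key in current)) {
      throw new Error(`path unresolved: ${path}`);
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function evaluateBranchPredicate(predicate: Predicate, output: unknown): boolean {
  const value = resolvePath(output, predicate.path);
  switch (predicate.op) {
    case "truthy":
      return isTruthy(value);
    case "eq":
      return value === predicate.value; // strict equality is an assumption
    case "neq":
      return value !== predicate.value;
  }
}
```

An unresolved path throws instead of returning false, which matches the entry's behavior of marking the branch node failed with an inline error.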
v0.7.61 Conversational canvas runner Phase 2 + Cedar simulator pulls real audit samples (D261-D262)
- Shipped
Canvas runner Phase 2 — single-tool-call advance (D261)
New apps/web/lib/canvas-runner.ts exports advanceCanvasNode — loads a pending tool-call node, resolves the tool by name from the MCP TOOLS catalog, parses the persisted input through the tool's Zod schema, invokes the handler with a ToolContext (actorKind: agent, actorId: runCreatorId), atomically updates the node row to done or failed, and emits canvas-run.node-completed / -failed events on the per-run audit-chain stream. New advanceCanvasNodeAction server action wraps the runner with auth + path revalidation. New "Advance — run {toolName}" button on /canvas/[id] for pending tool-call nodes. 7 explicit failure modes; failures mark the node failed with an error message inline so operators see what broke and can edit + retry. Phase 3 (recursive tree walk + branch / reduce evaluation) deferred — Phase 2 unlocks the demoable customer-call surface without committing to the harder runtime semantics until the schema is field-tested.
- Fixed
Cedar /policies/[id] Simulate now uses real recent audit samples (D262)
Pre-D262, clicking Simulate on /policies/[id] passed an empty sample array to the real Cedar engine — operators got back granted: [] denied: [] unchanged: []. The simulation was a shape-check, not an actual simulation. Now: pulls the last 50 write-side audit events from the last 30 days (same 9 event types the divergence cron tracks), maps each to its Cedar action, reconstructs (principal, action, resource, context) tuples, and feeds them to simulatePolicyAgainstSamples. Operators see "this policy would have ALLOWED N decisions and DENIED M decisions over the last month of real workspace activity" instead of an empty result. Graceful no-op for new / low-activity tenants. 50-event cap keeps Cedar evaluation latency subsecond.
v0.7.60 P0 middleware fix (whole marketing surface was sign-in-gated) + connector test harness + distribution package (D258-D260)
- Fixed
P0: every marketing surface was redirecting to sign-in (D260)
Caught in the live-app sanity sweep. apps/web/middleware.ts PUBLIC_PATHS only enumerated 4 paths (/, /sign-in, /pricing, /about); every marketing surface added since the (marketing) route group was created had been silently redirecting cold traffic to /sign-in. /security, /demo, /design-partner, /compare/*, /for-*, /legal/*, /security/*, /status, /changelog, /help, /policies/observability — all inaccessible to cold traffic until this fix. Rewrote the middleware with an explicit PUBLIC_PATHS list (16 paths) + PUBLIC_PREFIXES helper (10 prefixes). Comment header now mandates audit-after-shipment for any new marketing route. Eight commits of buyer-facing work suddenly visible.
- Shipped
Migration connector mock-API test harness (D259)
27 new vitest fixtures at packages/evals/src/migration-connector*.ts: 16 credentials-schema regression cases (protect every connector's secret-storage shape against accidental drift), 4 planned-connector contract tests (NetDocuments + FileHold throw not-implemented + isAvailable returns the right shape), 7 iManage end-to-end mock-API tests (probe + discover + download against simulated OAuth + paginated-list + content-endpoint responses). Global-fetch override pattern — no msw dependency. Deterministic synthetic bytes seeded by document id — no fixture binary content in repo. Test count up from 88 to 115; suite still runs sub-9-second with no DB / network setup.
- Shipped
Distribution package at docs/distribution/ (D258)
Content artifacts ready for the first design-partner contract trigger. Three-platform launch post (HackerNews / LinkedIn / Twitter) with platform-specific tone. Per-vertical cold-outreach templates for legal / AEC / accounting / QMS with Variant A (decision-maker) + Variant B (IT / operations) + follow-up cadence + personalization checklist + per-vertical anti-patterns. Prospect-list CSV scaffolding with 9 sample-row pattern. Social-share strategy with per-channel cadence + launch-day timeline + image specs. Written under non-launch pressure for higher-quality copy.
- Roadmap
Conversational canvas runner — deferred
Canvas Phase-1 (schema + manual node creation + decision flow + audit events) was already shipped at D216. The remaining piece is the agent-runner integration: auto-execute pending tool-call nodes by dispatching to the MCP TOOLS catalog. Sized at ~1-2 days of careful work (ToolContext construction inside the canvas server-action surface, atomic node-update transactions, error capture with revert-eligibility, eval coverage). Deferred to a focused session — better to ship it well than rush.
v0.7.59 CLAUDE.md cleanup + migration CLI + /status + /security/bug-bounty + Object Lock for legal-held blobs (D254-D257)
- Shipped
Object Lock / WORM for legally-held blobs (D254)
New applyObjectLockRetention helper at apps/web/lib/blob-object-lock.ts calls S3 PutObjectRetention with COMPLIANCE mode + 100-year horizon — non-removable even by AWS account root until retention elapses. New Inngest function legal-hold-object-lock listens on the existing event/appended fanout, filters to legal-hold.applied, looks up the document currentVersionHash, applies the lock asynchronously. Feature-flagged via OBJECT_LOCK_ENABLED=true; default off so existing buckets keep working (Object Lock requires bucket-creation-time enablement). Layered on the application-layer hold-deny-wins gate — defense-in-depth so even a side-channel S3 credential compromise can't rewrite or delete the underlying bytes for the hold duration.
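For illustration, the retention request could be assembled as below. `RetentionInput` is a local type that mirrors the Bucket / Key / Retention fields of the S3 PutObjectRetention request; `buildRetentionInput` is a hypothetical helper, and using the version hash as the object key is an assumption.

```typescript
// Local mirror of the PutObjectRetention request fields used here.
interface RetentionInput {
  Bucket: string;
  Key: string;
  Retention: { Mode: "COMPLIANCE"; RetainUntilDate: Date };
}

const RETENTION_YEARS = 100;

// Hypothetical helper: computes the 100-year horizon from "now" in UTC.
function buildRetentionInput(bucket: string, key: string, now: Date): RetentionInput {
  const retainUntil = new Date(now.getTime());
  retainUntil.setUTCFullYear(retainUntil.getUTCFullYear() + RETENTION_YEARS);
  return {
    Bucket: bucket,
    Key: key, // assumed to be the document's currentVersionHash
    Retention: { Mode: "COMPLIANCE", RetainUntilDate: retainUntil },
  };
}
```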
- Shipped
/status operational status page (D255)
Public buyer-credibility surface: sub-processor status feeds for 12 vendors (split between customer-data-path and operational/connector dependencies), Kodori service-row health for app + REST + MCP + Inngest + audit-chain weekly verifier + Cedar divergence cron + connector sync + email ingest, incident history (empty until first real incident; framework documented per /security/policies). Subscribe-to-incidents email link as the substitute for a paid Statuspage subscription. Replaces the "status.kodori.ai (when live)" placeholders in the incident-response policy.
- Shipped
/security/bug-bounty pre-launch surface (D255)
Documents today's responsible-disclosure posture + payout-tier preview ranges (Sev 1: $1.5K-$5K; Sev 2: $500-$1.5K; Sev 3: $100-$500; Sev 4: $50-$100) anchored against comparable mid-market programs (Vercel / Linear / Anthropic). Pre-launch decisions catalog with 5 items (platform selection rubric open; invite-only-at-launch / platform-managed-escrow / scope-match / payout-benchmarks decided). Internal day-of-launch runbook at docs/security-policies/bug-bounty-runbook.md so the formal program activation alongside SOC 2 Type I is a 1-day operation.
- Shipped
Migration CLI at packages/migration/src/cli.ts (D256)
New `kodori-migrate` CLI exposing 4 commands: list (registered connectors with status + availability), probe (validate credentials + open a session, summarize workspace count + warnings), discover (walk metadata up to --max N, --json for piping), dry-run (discover + download for the first N docs, hash locally, report failures). Read-only — does NOT write to Kodori. Credentials via JSON file matching each connector's credentialsSchema. Use cases: pre-pilot probes before tenant provisioning, operator troubleshooting when the in-app /migrate flow fails, auditor evidence the connector code matches claimed behavior. Validates the iManage / NetDocuments / FileHold / s3-bucket connector contracts before the first design-partner pilot.
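A pure argument parser for the four commands might look like the sketch below. This is not the shipped parser: the `CliArgs` shape is assumed, only the `--max` and `--json` flags named above are handled, and the credentials flag is left out because the entry does not name it.

```typescript
type Command = "list" | "probe" | "discover" | "dry-run";

// Assumed shape; the exported CliArgs type may differ.
interface CliArgs {
  command: Command;
  max?: number;
  json: boolean;
}

const COMMANDS: readonly string[] = ["list", "probe", "discover", "dry-run"];

function parseArgs(argv: string[]): CliArgs {
  const [command, ...rest] = argv;
  if (command === undefined || !COMMANDS.includes(command)) {
    throw new Error(`unknown command: ${command ?? "(none)"}`);
  }

  const args: CliArgs = { command: command as Command, json: false };
  for (let i = 0; i < rest.length; i++) {
    const flag = rest[i];
    if (flag === "--json") {
      args.json = true;
    } else if (flag === "--max") {
      // Consume the next token as the value; a missing value becomes NaN.
      const n = Number(rest[++i]);
      if (!Number.isInteger(n) || n <= 0) throw new Error("--max expects a positive integer");
      args.max = n;
    } else {
      throw new Error(`unknown flag: ${flag}`);
    }
  }
  return args;
}
```

Because the parser touches no files, network, or subprocesses, each case in a test suite is a plain function call.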
- Fixed
CLAUDE.md "Known holes" section reconciled with reality (D257)
The 6 Phase-0 gaps listed in CLAUDE.md (no migrations / no JIT user sync / no real ToolContext / no Inngest workflows / no tests / no MCP endpoint) are all resolved — migrations 0000-NNNN ship via vercel-build, upsertUserOnSignIn ships in apps/web/auth.ts, ToolContext constructed at every action entry, packages/workflow has ~20 functions, 88 evals, MCP endpoint mounted, Stripe checkout + portal + webhook live. New "Phase 0 holes — closed" section preserves the resolved-items list with ✓ marks (audit trail) + a separate "Open items" subsection lists actual remaining work. Five-surface-doc-sweep rule + triage discipline now prominent in the "When you make changes" section.
v0.7.58 SOC 2 Type I prep document set — 10 internal policies + /security/policies + /security/responsible-disclosure (D253)
- Shipped
10 internal security policies shipped as public Markdown
Information Security, Acceptable Use, Access Control, Change Management, Data Classification, Encryption, Incident Response, Vendor Management, Backup & Disaster Recovery, Risk Assessment. 1-2 pages each. Signed by Founder today; ownership-transition table on the README sets the path for when the team grows past one person. Markdown source at docs/security-policies/ in the public repo — git history IS the change log; auditors get a per-policy git log on request.
- Shipped
/security/policies public index surface
Public summary surface listing each of the 10 policies with a one-paragraph summary + cross-link to the GitHub source. For SOC 2 / HIPAA / 21 CFR Part 11 / ISO 27001 / FedRAMP security review — request the auditor-grade PDF set with version history via security@kumokodo.ai (under NDA + CAIQ-LITE questionnaire pre-filled).
- Shipped
/security/responsible-disclosure intake page
Explicit in-scope catalog (12 items: marketing surface, app surface, REST + MCP + SDK, auth flows, authz gates, audit chain, connector data flows, Cedar engine, webhook signatures, API key store, BYO-KMS flow). Explicit out-of-scope catalog (8 items: sub-processor infrastructure, best-practice violations without security impact, DDoS, social engineering, etc.). 4-step process (email security@kumokodo.ai with [disclosure] subject → 1-business-day acknowledge → 5-day triage → 30-day fix or remediation timeline → fix-then-acknowledgment cycle). Safe-harbor terms with 4 explicit conditions. No-monetary-payout pre-SOC-2-Type-I; formal bug-bounty program activates alongside the audit engagement.
- Shipped
/security pillar added linking both new pages
New "Internal security policy set + responsible-disclosure intake" pillar describes the 10-policy structure + the public-Markdown-in-repo posture + the responsible-disclosure intake. Header navigation row also expanded to surface the two new pages alongside Sub-processors.
- Shipped
Risk register at policy 10 with 13 scored risks
13 risks scored on 1-25 likelihood × impact with treatment plan. Top tier (10+): customer-data breach via deny-wins-gate bypass (mitigated via permission-trimmed-at-the-index pattern + eval suite + audit chain forensics); sub-processor outage > 4 hours (multi-region replication + atomic rollback). Mid tier (6-9): GDPR / HIPAA / 21 CFR Part 11 audit findings (mitigated via published mappings + DPO escalation); Cedar policy authoring divergence after authoritative flip (mitigated via 30-day soak window + manual flip behind security@kumokodo.ai). Low tier (1-5) accepted with annual review. Solo-founder bus factor scored explicitly at 20 with treatment plan covering code-in-public-ish-repo + customer-DPA-data-export-on-demand + every-architectural-decision-recorded.
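The scoring and tiering above reduce to a few lines. This sketch assumes likelihood and impact are each rated 1 to 5 (which yields the 1-25 range); the tier cut-offs are the ones stated in the entry, and the function names are hypothetical.

```typescript
type Tier = "top" | "mid" | "low";

// Assumes each factor is an integer from 1 to 5, giving a 1-25 product.
function riskScore(likelihood: number, impact: number): number {
  for (const factor of [likelihood, impact]) {
    if (!Number.isInteger(factor) || factor < 1 || factor > 5) {
      throw new Error("likelihood and impact must be integers from 1 to 5");
    }
  }
  return likelihood * impact;
}

// Cut-offs from the risk register: 10+ top, 6-9 mid, 1-5 low.
function tierFor(score: number): Tier {
  if (score >= 10) return "top";
  if (score >= 6) return "mid";
  return "low";
}
```

Under these cut-offs the solo-founder bus-factor risk, scored at 20, lands in the top tier.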
v0.7.57 First-customer-pilot acquisition surfaces — /pricing ROI + /demo + /design-partner (D252)
- Shipped
/pricing ROI scenarios — three concrete migration math examples
New "ROI math" section between the tier grid and the cost-honesty section. Three scenarios with seat-count anchors and year-1 + 3-year cost comparisons: 25-attorney firm replacing iManage (~$51k saved over 3 years), 50-person engineering firm replacing NetDocuments (~$67.8k saved), 100-person QMS manufacturer replacing legacy DMS (material savings, custom incumbent quotes). Incumbent reference pricing from G2 / ITQlick / public RFPs. Explicit "what's NOT in these numbers" callout carves out soft costs (IT setup time, training, partner-time saved) that we don't want to fake-quantify but are the larger total-cost-of-ownership lever in practice.
- Shipped
/demo public showcase page
Auth-free, zero-JS-state, SEO-rankable cold-traffic landing. Four moments narrative: hybrid search across native + connector content; agent-as-substrate (not bolted on); tamper-evident audit chain demonstrable in one click; governance enforced at the database edge with Cedar shadow-mode layered on top. Four head-to-heads with incumbents (vs iManage Insight + ndMAX, vs every-legacy-DMS, vs incumbents-with-only-metadata-connectors, vs quote-based pricing). Sidebar cross-links to /security + /legal/* compliance docs for buyers running parallel security-review evaluation.
- Shipped
/design-partner program page
8-week pilot timeline (week 0 discovery → weeks 4-6 dual-write → week 8 go/no-go decision). Terms: 50% off Business tier annual ($32.50/seat/mo), 24-month price lock, named project sponsor + named technical lead, ~6-8 hours sponsor time + 20-30 hours technical-lead time across the pilot. Explicit "no logo requirement" — design partners DO NOT have to be public references; the discount + founder access is the trade for time and feedback, not marketing. Four areas where design-partner input changes the roadmap: vertical extraction parsers (RFI/CAPA/K-1), migration connector tuning (iManage / NetDocuments / FileHold), compliance evidence packets, Cedar policy starter sets.
- Shipped
Layout footer "Pilot" column added
New 5th footer column linking /demo and /design-partner. Two-link column keeps the footer scannable; future "case studies" or "customer references" links go here when real customer logos exist.
- Roadmap
iManage migration connector audited — production code, status: beta
packages/migration/src/connectors/imanage.ts holds real production-shape code: OAuth2 client-credentials auth, offset-paginated discover against /api/v2 with workspace + folder scope filters, content-endpoint streaming download with mime-type sniffing. Marked `status: 'beta'` because it has not yet been field-tested against a real customer iManage tenant — the design-partner program is the path to flipping that to `status: 'stable'` after the first migration runs clean.
v0.7.56 Cedar shadow-mode observability — inline panel on /policies + dedicated /policies/observability dashboard (D251)
- Shipped
Inline observation panel on /policies
Active-policy tenants land on /policies and immediately see a Shadow-mode observation panel: 24h / 7d / 30d divergence counts (tone-coded — green at zero, amber at low, red at high), soak-status badge ("Soak in progress · 18 days remaining" / "Soak ready · zero divergences ≥ 30 days" / "Soak counter reset"), days-since-last-divergence, and the most recent 3 divergences with cedar action + decision-vs-TS-gate inline. Link to the detail dashboard for the full 20-event list.
- Shipped
Dedicated /policies/observability dashboard
Full Cedar shadow-mode observability surface. Soak-status card with 4 distinct states (no active policies / soak ready / soak counter reset / soak in progress with N days remaining). Big number counts tone-coded by severity. Per-policy snapshot table showing each active policy + activation date + 30d divergence count + days-since-last. Recent-divergences list with full payload (cedar action + decision + principal type/id + resource type/id + policy-version + errors); each row cross-links to /audit?eventId={id} for chain investigation.
- Shipped
Manual flip behind security@kumokodo.ai email — auto-flip rejected
When the soak-ready badge fires (active policies > 0 + zero divergences in last 30 days + first active policy older than 30 days), the dashboard surfaces a "request the flip" call-to-action that opens an email to security@kumokodo.ai. We do NOT auto-flip — flipping Cedar to authoritative changes which authorization layer is source-of-truth, and operator confirmation is the load-bearing signal. Auto-flip on a single threshold would be brittle; a single mis-attributed divergence in week 31 would invalidate the soak unfairly.
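The soak-ready rule can be sketched as a pure function over activation dates and the latest divergence. The function name, the input shape, and the way "counter reset" is told apart from "in progress" (any divergence inside the 30-day window counts as a reset) are assumptions; the three ready conditions are the ones listed above.

```typescript
type SoakStatus =
  | { state: "no-active-policies" }
  | { state: "soak-counter-reset" }
  | { state: "soak-in-progress"; daysRemaining: number }
  | { state: "soak-ready" };

const SOAK_DAYS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

function soakStatus(
  activatedAt: Date[], // activation time of each active policy
  lastDivergenceAt: Date | null,
  now: Date,
): SoakStatus {
  if (activatedAt.length === 0) return { state: "no-active-policies" };

  const windowStart = now.getTime() - SOAK_DAYS * DAY_MS;

  // Assumed: a divergence inside the window resets the counter.
  if (lastDivergenceAt !== null && lastDivergenceAt.getTime() >= windowStart) {
    return { state: "soak-counter-reset" };
  }

  // The first (oldest) active policy must itself be older than the window.
  const firstActivated = Math.min(...activatedAt.map((d) => d.getTime()));
  if (firstActivated > windowStart) {
    const daysRemaining = Math.ceil((firstActivated - windowStart) / DAY_MS);
    return { state: "soak-in-progress", daysRemaining };
  }

  return { state: "soak-ready" };
}
```

Even when this returns soak-ready, the result only drives the call-to-action; the flip itself stays a manual step.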
- Fixed
/policies disclaimer reflects D250 shipment
Pre-D250, /policies carried an amber callout saying "live cedar-wasm runtime lands when a customer anchors ABAC requirements." That's now stale — D250 wired the real SDK + cron. Disclaimer rewritten to describe the actual posture (cedar-divergence-observation hourly cron + 30-day soak window for the eventual authoritative flip).
v0.7.55 Cedar added to /security cert table + stale Phase-N compliance language swept across marketing
- Fixed
Cedar policy engine now appears in /security cert table
D250 added Cedar as a /security pillar but missed adding it to the cert table that lists every "live today" technical control. Buyers reading the cert table got an incomplete picture. Cedar policy engine row now appears as `live today` alongside hash-chained audit, permission-trimmed retrieval, BYO-KMS, and other foundational controls. The pillar's help link also fixed (`/help/policies`, not the broken `/help/policy-engine`).
- Fixed
/security FAQ section had 6 stale Phase-N compliance claims
Sam called out: "common questions on the security page are still referencing phases." Confirmed: 6 FAQ answers (SOC 2 cert posture, BAA timing, encryption, deletion, security-issue reporting) plus one pillar (content-addressable storage) referenced "Phase 3" / "Phase 1" / "Phase 5" timing that was inconsistent with the v0.7.53 cert-status revamp. Each rewritten with the new vocabulary — `audit-pending`, `customer-anchored`, `sequenced`, or "live today" with cross-references to /legal/* docs and /encryption.
- Fixed
Compliance posture rows on /compare/* now match /security
/compare/imanage, /compare/netdocuments, /compare/filehold all had a "Compliance posture" row with stale "SOC 2 Type I in Phase 3, Type II in Phase 5" language that contradicted the v0.7.53 cert-status revamp. Buyers reading /compare side by side with /security got contradictory information. Each row rewritten with the cert-status vocabulary. Drop-in-replacement FAQ answers on /compare/imanage + /compare/netdocuments also updated to drop the Phase-N references.
- Fixed
/security/controls AICPA mapping page status labels updated
Status labels on the 36-control AICPA Trust Services Criteria mapping page replaced: `phase-1` → "Roadmap", `phase-3` → "On audit engagement". Status blurbs rewritten to match. The summary footer ("X live today / Y landing in Phase 1 / Z with the audit (Phase 3)") rewritten as "X live today / Y on the roadmap / Z activate on audit engagement". Two control-row notes (governance external-advisor, bug-bounty program) also updated to drop "Phase 3 / SOC 2 Type I" coupling — both now correctly tied to the SOC 2 Type I auditor engagement which is `audit-pending`.
- Fixed
/features cert tile language reconciled
The "SOC 2 controls mapping at /security/controls" tile said "each tagged Live / Phase 1 / Phase 3 (audit)" — drifted after the controls-page status-label rewrite. Now reads "Live today / Roadmap / On audit engagement" matching the controls page.
v0.7.54 Cedar policy engine: divergence-observation increment shipped (D250)
- Shipped
Real Cedar SDK wiring replaces the permissive stub
apps/web/lib/policy/cedar-runtime.ts now invokes @cedar-policy/cedar-authorization's CedarInlineAuthorizationEngine.isAuthorized(request, entities) instead of returning a hardcoded Allow. Lazy server-only dynamic import keeps the ~2-3 MB wasm engine out of the client bundle. Per-tenant engine cache keyed by ${tenantId}::${policyVersion} (where policyVersion = max activated_at across active policies) means engine construction is amortized — first call after deploy pays the import + parse + compile, subsequent calls hit the cache.
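The caching behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of cedar-runtime.ts: `PolicyEngine` and `EngineFactory` are hypothetical stand-ins for the real SDK engine and its construction step, and only the `${tenantId}::${policyVersion}` key format comes from the release note.

```typescript
// Hypothetical stand-in for the constructed Cedar engine.
interface PolicyEngine {
  readonly policyVersion: string;
}

// Hypothetical factory: in practice this would do the lazy import + parse + compile.
type EngineFactory = (tenantId: string, policyVersion: string) => Promise<PolicyEngine>;

const engineCache = new Map<string, Promise<PolicyEngine>>();

function getEngine(
  tenantId: string,
  policyVersion: string,
  build: EngineFactory,
): Promise<PolicyEngine> {
  const key = `${tenantId}::${policyVersion}`;
  let cached = engineCache.get(key);
  if (cached === undefined) {
    // Cache the promise itself so concurrent first calls share one build.
    cached = build(tenantId, policyVersion).catch((err) => {
      engineCache.delete(key); // do not keep a failed build in the cache
      throw err;
    });
    engineCache.set(key, cached);
  }
  return cached;
}
```

Because the key includes the policy version, activating a new policy changes the key and the next call builds a fresh engine; nothing needs to be invalidated explicitly. Evicting stale versions is left out of this sketch.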
- Shipped
Default Kodori v1 Cedar schema bundled
apps/web/lib/policy/cedar-schema.json defines the Kodori namespace — User / Agent / System principals + Document / Collection resources + 9 standard actions (Read / Write / Delete / Share / ChangePermission / ChangeSensitivity / ChangeRetention / AddToHold / RemoveFromHold). Forward-compatible — future attribute additions are additive without breaking authored policies. Schema lives next to runtime code so schema + runtime changes land as one PR.
- Shipped
cedar-divergence-observation Inngest cron — hourly shadow-mode replay
New packages/workflow/src/functions/cedar-divergence-cron.ts function. Hourly cron, per-tenant-with-active-policies, replays the last hour of write-side audit events (document.tombstoned, document.classified, document.metadata-changed, document.legal-hold-applied, document.permission-changed, etc.) through Cedar. Capped at 500 events per tenant per run. When Cedar disagrees with the TS gate the action already passed, emits policy-engine.divergence with observedEventId + cedarAction + cedarDecision + tsDecision + principalKind + resourceId + policyVersion. The TS gates REMAIN authoritative — Cedar runs in observation mode building a 30+ day rolling divergence dataset for the eventual customer ABAC ask.
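A minimal sketch of the comparison step, assuming a hypothetical `ObservedEvent` shape and an injected `evaluate` callback standing in for the Cedar call. The payload field names and the 500-event cap come from the entry above; everything else is illustrative.

```typescript
type Decision = "Allow" | "Deny";

// Hypothetical, reduced view of an audit event being replayed.
interface ObservedEvent {
  id: string;
  cedarAction: string;
  principalKind: "User" | "Agent" | "System";
  resourceId: string;
}

interface Divergence {
  observedEventId: string;
  cedarAction: string;
  cedarDecision: Decision;
  tsDecision: Decision;
  principalKind: string;
  resourceId: string;
  policyVersion: string;
}

const MAX_EVENTS_PER_RUN = 500;

function findDivergences(
  events: ObservedEvent[],
  policyVersion: string,
  evaluate: (event: ObservedEvent) => Decision,
): Divergence[] {
  const divergences: Divergence[] = [];
  for (const event of events.slice(0, MAX_EVENTS_PER_RUN)) {
    const cedarDecision = evaluate(event);
    // The event is on the audit log, so the TS gate already allowed it.
    const tsDecision: Decision = "Allow";
    if (cedarDecision !== tsDecision) {
      divergences.push({
        observedEventId: event.id,
        cedarAction: event.cedarAction,
        cedarDecision,
        tsDecision,
        principalKind: event.principalKind,
        resourceId: event.resourceId,
        policyVersion,
      });
    }
  }
  return divergences;
}
```

Since only actions that already passed the TS gate are replayed, this shape can surface "Cedar would have denied" cases but not the reverse.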
- Shipped
Real Cedar simulation at /policies/[id]
simulatePolicyAgainstSamples now runs real Cedar evaluation against supplied samples instead of returning placeholders. Engine-construction failures (invalid policy text, schema mismatch) surface immediately at simulation time — operators see "policy invalid" before activating instead of after a divergence event lands.
- Shipped
/security pillar added: "Cedar policy engine — shadow-mode with divergence observation"
Describes the posture: TS gates authoritative today, Cedar evaluates in parallel via the hourly cron, divergences land on the hash-chained audit log. After 30+ days of zero divergences across a tenant's active policies, the per-tenant cedar-authoritative flag flips Cedar to source-of-truth. A customer-side ABAC contract is now a 1-day flip rather than a 2-week integration.
- Roadmap
Why now: scale-cost asymmetry
Wiring Cedar pre-customer is materially cheaper than post-customer because (a) zero authored policies = free schema design freedom; (b) no entity-hydration calcification — the read-path code paths haven't locked in around assumptions Cedar would later need to undo; (c) bundle-size budget is unallocated; (d) observation telemetry is only useful BEFORE customers depend on the answer — every day Cedar runs in observation now is a free divergence data point. Sam authorized the increment after the D218a web-research reassessment confirmed the SDK uncertainty was lower than the original D218 deferral framed it.
v0.7.53 Four compliance documents flipped from "phase-N" to "live today": GDPR, 21 CFR Part 11, EU AI Act, SEC 17a-4 (D249)
- Shipped
/legal/gdpr — Article-by-Article rights mapping
GDPR Articles 15-22 + UK-GDPR + CCPA / CPRA + PIPEDA / LGPD / POPIA notes. Each right maps to a concrete Kodori endpoint (Art. 15 access → /api/v1/users/me/data-export; Art. 17 erasure → tombstoneDocument + connector purge per D241; Art. 20 portability → JSON / blob exports). Lawful basis disclosed (contract per Art. 6(1)(b) + legitimate interest per Art. 6(1)(f) for sign-in). DPA template + DPO contact (dpo@kumokodo.ai) + EU SCCs Module 2 + UK Addendum framework documented.
- Shipped
/legal/21-cfr-part-11 — section-by-section conformance claim
Maps Kodori controls to FDA 21 CFR Part 11 Subpart B (§§11.10 a-k, 11.50, 11.70) + Subpart C (§§11.100, 11.200, 11.300). Hash-chained audit chain + verifyAuditChain MCP tool + weekly cron satisfy §11.10(a) (validation) + §11.10(e) (secure time-stamped audit trails). SSO + IdP-anchored MFA satisfies §11.200 (two distinct authentication components). Reversibility model + consequential-action confirmation gate satisfy §11.10(f) (operational sequencing). Validation evidence (audit-chain regression fixtures, 88 tests on every CI run) referenced.
- Shipped
/legal/ai-disclosure — EU AI Act Articles 11 / 12 / 14 / 50
Voluntary high-risk-style disclosure (Kodori is not classified as high-risk under Annex III, but regulated-segment buyers want this transparency in security review). Article 11 technical documentation: Anthropic Claude Opus 4.6 + Haiku 4.5 with zero-data-retention; 124-tool MCP catalog; OpenAI text-embedding-3-small for the semantic-search component. Article 12 logging: every agent action emits actorKind="agent" event on the hash-chained log. Article 14 human oversight: consequential-action confirmation gate + reversibility window + tool-call ceiling. Article 50 transparency: Agent / AI labeling + AI-summary callouts.
- Shipped
/legal/sec-17a-4 — audit-trail-alternative posture
Conformance claim against the November 2022 amendment to SEC Rule 17a-4(f) — broker-dealer recordkeeping. Maps hash-chained audit substrate to §§17a-4(f)(2)(i)-(iii) + 17a-4(f)(3)(i)(A)-(E). Content-addressable storage (SHA-256 IS the storage key) satisfies (f)(3)(i)(D) serialization requirement. Weekly verification cron satisfies (f)(3)(i)(C) automatic-completeness verification. Customer-firm representation letter + technical-architecture document available for FINRA filings via compliance@kumokodo.ai.
- Shipped
/security cert table — 4 new status enums replace misleading "phase-N"
"phase-N" labels conflated three distinct kinds of "not done": (a) substrate-ready-needs-customer-contract (HIPAA, ISO 27001) → new "customer-anchored" enum; (b) substrate-ready-needs-auditor-engagement (SOC 2 Type I) → new "audit-pending" enum; (c) sequenced-after-prior-cert (SOC 2 Type II) → new "sequenced" enum. Each status renders with a distinct color chip on the cert table. Eliminates the misread where buyers see "phase-3" and infer "they're halfway through a 6-phase plan" when actually we just need a customer to sign a contract that anchors the BAA / DPA / ISMS audit.
- Shipped
/security/subprocessors — Slack added + Google / Microsoft connector scope disclosed
D219+ added Slack Technologies as a new sub-processor — disclosed today with the explicit "operator-controlled scope expansion" framing (Slack uses bot tokens scoped to channels the operator invites the bot to). Google + Microsoft entries expanded to disclose connector data-flow scope (Gmail / Drive / Outlook / SharePoint / OneDrive operator-opt-in OAuth scopes) — pre-D219 the framing was "OAuth only" which was honest at the time but stale post-D219. Last-updated date bumped to 2026-05-02 — triggers the 30-day notice clock under the DPA for any signed-contract customer.
- Fixed
/for-law-firms + /for-accounting + /for-manufacturing-qms — stale Phase-N language replaced
Compliance FAQ language across three vertical pages had stale "lands in Phase 3" / "lands in Phase 5" claims for items that are now live today (GDPR, 21 CFR Part 11, EU AI Act, SEC 17a-4) or have shifted to customer-anchored (HIPAA, ISO). Each page now points buyers at the corresponding /legal/* doc for section-by-section evidence.
v0.7.52 Five-surface marketing audit reconciled to the connector loop + Cedar SDK web-research reassessment (D248)
- Fixed
Marketing surfaces now reflect the 6-vendor connector loop
Sam called out: "if /security was out of date, every other page probably is too." Confirmed. Pre-D248, nine marketing surfaces had zero mention of the D219-D245 connector loop: /, /pricing, /for-law-firms, /for-construction, /for-accounting, /for-manufacturing-qms, /compare/imanage, /compare/netdocuments, /compare/filehold. All nine now seed connector messaging — pillar tile on /, comparison row on /pricing, new feature card on each /for-* segment page (framed for the relevant audience: matter context for legal, RFI threads for AEC, client correspondence for accounting, supplier correspondence for QMS), and a new dimension row on every /compare/* page where Kodori shows ● and the legacy DMS shows —.
- Shipped
/legal/privacy reflects the connector OAuth flows + sub-processor expansion
Data-collection paragraph now enumerates the OAuth refresh-token-encryption posture (AES-256-GCM at rest) and the message / file metadata + extracted text scope. Sub-processor list expanded from "Google (OAuth only)" to the full enumeration: Google LLC for OAuth + Gmail + Drive, Microsoft Corporation for OAuth + Outlook + SharePoint + OneDrive, Slack Technologies LLC for the Slack connector. Explicit framing: "activated only by your explicit OAuth grant" + "revocation triggers a typed-confirmation purge" supports the GDPR Article 17 claim on /security.
- Shipped
Cedar SDK reassessment recorded as D218a
Sam asked whether open-web docs answer the questions the prior D218 deferral framed as unknown. Web research confirmed: @cedar-policy/cedar-authorization ships CedarInlineAuthorizationEngine with a clean Promise-based isAuthorized(request, entities) API — concrete AuthorizationRequest + Entity + AuthorizationResult shapes are documented. Reassessment: SDK swap is mechanical 1-2 day work; the genuine engineering remains entity hydration + schema authoring + bundle-size handling. The deferral conclusion HOLDS (no customer ABAC contract today) but the justification shifts from "SDK uncertainty" to "engineering scope absent a triggering customer ask." Recorded under D218 as sub-decision D218a.
v0.7.51 /security page reconciled with the connector loop: 4 new pillars + 3 new cert rows + 1 new FAQ (D247)
- Fixed
/security page no longer omits the entire connector security model
Pre-D247 the public /security page hadn't been touched since before D219 shipped. Buyers reviewing the page got an honest picture of the original substrate (hash-chained audit, deny-wins ACL, SSO-only, encryption at rest, tenant isolation) but ZERO mention of the OAuth-token-encryption posture, the connector-tenant-scoping invariant, the BYO-KMS-extension path, or the GDPR right-to-be-forgotten purge flow. That's a credibility hit with any security-review buyer who notices the gap.
- Shipped
4 new security pillars
External connector security (OAuth tokens encrypted at rest, tenant-scoped retrieval, no tokens / scopes / config returned by /api/v1/connectors). GDPR Article 17 connector content purge (typed-confirmation gate, audit-chain compliance row preserves WHO / WHAT / WHEN without preserving the deleted content). Connector text extraction posture (vendor stays source-of-truth, byte-streams NEVER mirrored as Kodori docs). Conversation export as compliance evidence (per-user permission gate, dual UI + curl-able REST paths, audit-event lands on both).
- Shipped
3 new cert rows + updated BYO-KMS row
External connectors row (Slack / M365 / Google Workspace, six vendors live). GDPR Article 17 right-to-be-forgotten row (live today via the connector purge flow). BYO-KMS row updated from "phase-3" to "live today" — orchestration + audit lifecycle ship today; per-vendor KMS SDK wiring drops in when a customer engagement anchors the cloud choice.
- Shipped
New "How are external connectors secured?" FAQ
Surfaces the technical posture buyers care about: AES-256-GCM application-layer token encryption, scrypt-derived key from AUTH_SECRET, envelope-extension to BYO-KMS, NEVER-returned-tokens posture on the API, per-vendor refresh-token handling differences (Microsoft rotates, Google forces consent on re-OAuth, Slack scope-controlled-via-channel-invitation), vendor-as-source-of-truth invariant, and the typed-confirmation purge flow.
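For readers unfamiliar with the pattern, here is a minimal sketch of AES-256-GCM token encryption with a scrypt-derived key, using Node's built-in `crypto` module. The salt value, the 12-byte IV, and the dot-separated output layout are assumptions made for this example; the FAQ does not specify them, and this is not Kodori's actual implementation.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Assumption: a fixed application-level salt. The source does not say how the salt is chosen.
const EXAMPLE_SALT = "connector-token-salt";

function deriveKey(authSecret: string): Buffer {
  return scryptSync(authSecret, EXAMPLE_SALT, 32); // 32 bytes = AES-256 key
}

// Hypothetical layout: base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh 96-bit nonce per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

function decryptToken(encoded: string, key: Buffer): string {
  const [iv, tag, ciphertext] = encoded.split(".").map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // final() throws if the tag does not verify
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM is authenticated, so decrypting with the wrong key or a modified ciphertext fails loudly instead of returning garbage.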
v0.7.50 Eval coverage for the connector loop: 29 new deterministic fixtures + RRF fusion math locked in (D246)
- Shipped
Connector tool schema fixtures (19 cases)
New `connector-tool-schema-fixtures.ts` exercises the input Zod schemas of `searchExternalContent`, `unifiedSearch`, and `renameCollection`. Catches regressions like accidentally accepting `googleDrive` (camelCase) instead of the canonical `google-drive` (hyphenated wire format), bumping the limit ceiling without updating the SDK type, or loosening the UUID gate on collection ID. Pure-function — sub-millisecond, no DB, no network.
- Shipped
Reciprocal Rank Fusion math locked in with deterministic fixtures (10 cases)
New `rrf-fusion-fixtures.ts` extracts the RRF formula (k=60) into a pure `rrfFuse(lists)` helper + proves the canonical invariants: presence-in-multiple-lists beats single-rank-zero, single-list passthrough preserves order, empty input yields empty output, sourcesPresent count reflects contribution. The math underpins `hybridSearch` (D6), `searchExternalContent` (D226), and `unifiedSearch` (D238) — all three call sites now have a single canonical implementation that future tools can opt into.
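The fusion math can be sketched as below. This is an illustration under stated assumptions, not the shipped helper: the input shape (lists of ids, best first), the 0-based rank with a `+ 1` offset, and the `FusedHit` type are choices made for this example. Only the function name, k=60, and the listed invariants come from the entry above.

```typescript
interface FusedHit {
  id: string;
  score: number;
  sourcesPresent: number; // how many input lists contributed to this id
}

const RRF_K = 60;

// Each inner array is one ranked list of ids, best first.
function rrfFuse(lists: string[][], k: number = RRF_K): FusedHit[] {
  const byId = new Map<string, FusedHit>();
  for (const list of lists) {
    const seen = new Set<string>();
    list.forEach((id, rank) => {
      if (seen.has(id)) return; // count each id once per list, at its best rank
      seen.add(id);
      const hit = byId.get(id) ?? { id, score: 0, sourcesPresent: 0 };
      hit.score += 1 / (k + rank + 1); // rank is 0-based in this sketch
      hit.sourcesPresent += 1;
      byId.set(id, hit);
    });
  }
  // Array.prototype.sort is stable, so ties keep first-seen order.
  return Array.from(byId.values()).sort((a, b) => b.score - a.score);
}
```

With k=60, an id at the second position of two lists scores 2/62, which beats an id at the top of a single list (1/61). That is the "presence-in-multiple-lists beats single-rank-zero" invariant.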
- Improved
Test count: 88 passed in 4.7s (up from 59 pre-D246)
Connector loop now has formal regression coverage. Eval suite still runs sub-5-second with no DB / network setup — keeps the dev-loop tight while paying down the tech debt that accumulated across 21 releases of D222-D245.
v0.7.49 Saved-search alerts now fire on connector messages (Slack / Gmail / Outlook) (D245)
- Shipped
Existing saved-search alerts ALSO fire on Slack / Gmail / Outlook arrivals
No new table, no UI changes, no separate scope-onboarding. Operators who set up "email me when X arrives" alerts (D154) now get those emails for matches in newly-synced connector messages too. Sync function fires a new `external-message/indexed` Inngest event with the batched messageIds; the new `external-search-alerts-dispatcher` runs each tenant's active alerts against the new messages via the existing `external_messages_fts_idx` GIN expression index.
- Shipped
Email CTA links to the vendor permalink (Slack thread / Outlook web view)
When a saved-search alert fires for a connector match, the "Open in vendor →" button on the email lands the operator directly on the Slack thread URL or Gmail web view or Outlook permalink — not /doc/<id> like Kodori-doc alerts. New optional `documentUrl` parameter on `sendSavedSearchAlertEmail` makes the override; existing Kodori-doc alerts continue to render "Open the document →" pointing at /doc/<id> with no behavior change.
- Shipped
`saved-search-alert.fired` audit event payload extended for connector matches
Carries `externalMessageId` + `connectorId` + `connectorKind` instead of `documentId` when the match was a connector message. /audit?eventType=saved-search-alert.fired now answers "what alerts fired this quarter" across both surfaces in one query — compliance-friendly without separate event types.
- Improved
Stack-decisions D245 + scope §15 entry land alongside the code
D245 documents the reuse-existing-table-vs-surface-column-vs-separate-table decision (operator mental model is "alert me when X arrives" regardless of source), the batched-event-vs-per-message decision (Inngest queue economy), the v1-skips-document-extraction-events deferral with documented revisit trigger, the tenant-scope vs per-user-ACL posture (parallels D226), and the unified-email-template-with-optional-vendor-URL pattern (avoids doc-rot from two near-identical templates).
v0.7.48 Connector extraction cost telemetry: surprise-bill prevention for big-Drive backfills (D244)
- Shipped
External-document extraction now lands in cost_events
Pre-D244, connector text extraction was the one Anthropic-billable code path that didn't hit the cost-tracker. A tenant connecting a 50K-doc Drive could rack up surprise vision-API bills with no /costs visibility. The extract-external-document Inngest function now calls trackPdfExtraction post-extract via a new optional `trackExtractionCost` dependency. Approximate token count derived from extracted-text length (1 token ≈ 4 chars). Fire-and-forget — never blocks the host extract on a tracking failure.
- Shipped
Per-extractor ref tag — telemetry distinguishes vendor + extractor
Each cost row tags as `external:<vendorKind>:<extractor>` (e.g. `external:sharepoint:claude-pdf`, `external:google-drive:google-drive-export`). Lets the /costs dashboard answer "which connector + extractor combination is the most expensive this month?" without per-tenant query plumbing. Non-LLM extractors (Office adapters, builtin-text, google-drive-export) record zero microcents but still get rows for usage telemetry.
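The telemetry pieces in this release (token approximation, ref tag, fire-and-forget tracking) can be sketched together. The helper names and the `CostRow` shape are hypothetical, and rounding the token estimate up is an assumption; the 4-chars-per-token ratio and the `external:<vendorKind>:<extractor>` format come from the entries above.

```typescript
// Hypothetical row shape for illustration.
interface CostRow {
  ref: string;
  approxTokens: number;
}

// 1 token ≈ 4 chars of extracted text. Rounding up is an assumption of this sketch.
function approxTokenCount(extractedText: string): number {
  return Math.ceil(extractedText.length / 4);
}

function extractionRef(vendorKind: string, extractor: string): string {
  return `external:${vendorKind}:${extractor}`;
}

// Fire-and-forget: start tracking, route any failure to onError, never await or rethrow.
function trackWithoutBlocking(
  row: CostRow,
  track?: (row: CostRow) => Promise<void>,
  onError: (err: unknown) => void = () => {},
): void {
  if (!track) return; // the tracking dependency is optional
  void Promise.resolve()
    .then(() => track(row))
    .catch(onError);
}
```

Wrapping the call in `Promise.resolve().then(...)` means a tracker that throws synchronously is handled the same way as one that rejects, so the extract path is never interrupted.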
- Improved
Stack-decisions D244 + scope §15 entry land alongside the code
D244 documents the dependency-injection vs direct-import choice (workflow package stays free of apps/web cost-tracker imports), the text-length → token-count approximation rationale (extracted text is the right denominator, not byte size, because byte size includes binary headers), the fire-and-forget pattern (parallels D175 cost-tracker posture), and the per-extractor-ref tag pattern (enables /costs cohort analysis without per-call instrumentation).
v0.7.47 Compliance dashboard connector roll-up + bulk ops on /integrations (D242 + D243)
- Shipped
Connector content panel on /compliance
New 4-stat panel (Active connectors / Synced messages / Synced documents / Text-extracted) renders alongside the existing live-records / legal-holds / retention-review block when any connector content exists. Amber accent on text-extracted when partial (extraction-in-progress is expected, not broken). Link to /integrations for per-connector detail. Closes the "compliance officer can't see the full data footprint at a glance" gap.
- Shipped
Bulk connector ops — Sync all / Pause all / Resume all on /integrations
Three top-of-page buttons with per-status counts inline ("Sync all connected (4)", "Pause all (4)", "Resume all paused (2)"). Sync-all caps at 50 per click (parallels D234 retry-button cap=500 — protects Inngest dispatch budget on tenants with many connectors); click again for more if cap hit. Pause-all gates with confirm dialog; Sync-all + Resume-all are additive so no gate needed. Each affected connector emits its own `external-connector.paused` / `.resumed` event with `bulk: true` payload flag — preserves the "one event per state change" audit invariant while letting compliance queries discriminate bulk vs single-row clicks. Use cases: post-outage recovery + planned maintenance windows.
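A minimal sketch of the cap and the per-connector audit events, assuming hypothetical `Connector` and `AuditEvent` shapes. The 50-per-click cap, the event name, and the `bulk: true` flag come from the entry above; the function names are illustrative.

```typescript
type ConnectorStatus = "connected" | "paused" | "revoked";

interface Connector {
  id: string;
  status: ConnectorStatus;
}

interface AuditEvent {
  type: "external-connector.paused" | "external-connector.resumed";
  connectorId: string;
  payload: { bulk: boolean };
}

const SYNC_ALL_CAP = 50;

// Sync-all: dispatch at most 50 connected connectors per click.
function selectForSyncAll(connectors: Connector[]): { selected: Connector[]; capHit: boolean } {
  const connected = connectors.filter((c) => c.status === "connected");
  return {
    selected: connected.slice(0, SYNC_ALL_CAP),
    capHit: connected.length > SYNC_ALL_CAP,
  };
}

// Pause-all: one audit event per connector that changes state, each flagged as bulk.
function pauseAllEvents(connectors: Connector[]): AuditEvent[] {
  return connectors
    .filter((c) => c.status === "connected")
    .map((c): AuditEvent => ({
      type: "external-connector.paused",
      connectorId: c.id,
      payload: { bulk: true },
    }));
}
```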
- Improved
Stack-decisions D242 + D243 + scope §15 entry land alongside the code
D242 documents the same-page-vs-sub-page decision + aggregate-counts-vs-per-vendor-breakdown choice + amber-on-partial-extraction tone rationale. D243 documents the sweep-buttons-vs-multi-select decision (covers 80% of bulk-action use cases without selection-state machinery), the 50-row cap with click-again posture, the confirm-on-pause-only UX choice, and the per-connector-event vs single-bulk-event audit rationale.
v0.7.46 Connector content purge: GDPR right-to-be-forgotten path with typed-confirmation gate (D241)
- Shipped
New "Permanently delete synced content" admin action on /integrations/[id]
Visible only on revoked-status connectors. Two-stage gate: revoke first (acknowledges disconnect intent), then type the connector display name EXACTLY (case-sensitive + trim) to confirm deletion intent. Same typed-name pattern as GitHub repo deletion + Postgres dump-restore confirmations — forces the operator to look at WHICH connector they're about to purge.
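The two-stage gate can be sketched as a single check. The names here are hypothetical, and the sketch assumes "case-sensitive + trim" means only the typed input is trimmed before an exact comparison against the stored display name.

```typescript
interface PurgeTarget {
  displayName: string;
  status: "connected" | "paused" | "revoked";
}

type PurgeCheck =
  | { ok: true }
  | { ok: false; reason: "not-revoked" | "name-mismatch" };

function checkPurgeConfirmation(target: PurgeTarget, typedName: string): PurgeCheck {
  // Stage 1: the connector must already be revoked.
  if (target.status !== "revoked") return { ok: false, reason: "not-revoked" };
  // Stage 2: trim surrounding whitespace, then compare case-sensitively.
  if (typedName.trim() !== target.displayName) return { ok: false, reason: "name-mismatch" };
  return { ok: true };
}
```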
- Shipped
New external-connector.content-purged audit event
Lands on the tenant stream with payload { connectorId, kind, displayName, externalAccountId, messagesPurged, documentsPurged }. Compliance evidence preserves WHO purged WHAT WHEN without preserving the content itself (which would defeat the deletion intent). The DELETE returning { id } gives us the row counts naturally — no extra round-trip.
- Shipped
Use case: GDPR Article 17 right-to-be-forgotten requests in 2 clicks + 1 type
Pre-D241 a customer asking "delete everything Kodori synced from my Slack" required hand-running SQL DELETE statements + manually documenting it on the audit chain. Now: revoke the connector → scroll to the destructive-action panel → type the connector display name → click "Permanently delete." Audit row lands with the operator id + counts. The whole flow takes < 60 seconds + leaves a clean compliance trail.
- Improved
Stack-decisions D241 + scope §15 entry land alongside the code
D241 documents the typed-confirmation-vs-checkbox decision (look-at-it-before-you-delete UX), the revoked-only precondition (separates disconnect intent from deletion intent), the aggregate-counts-only audit posture (per-row IDs would themselves be PII), and the no-soft-delete decision (defeating the deletion intent vs preserving compliance flexibility — picked deletion).
v0.7.45 /api/v1/search/unified REST endpoint + SDK 0.2.1: public surface for cross-source RRF (D240)
- Shipped
New /api/v1/search/unified — promotes unifiedSearch MCP tool to public REST
D238 shipped unifiedSearch as an MCP tool with the deferral note "no public REST until SDK consumers ask." Revisited that on cost-benefit: the route is a thin wrapper, dashboards building cross-vendor search would re-implement RRF client-side otherwise, and the agent's mental model already has a single canonical "find everything" path — public REST should match. Body: `{ query, limit? }`; same `search:read` scope as the other two search endpoints. Returns the same RRF-fused cross-source ranked list the agent gets via MCP.
- Shipped
@kumokodo/kodori-sdk 0.2.1 — kodori.unifiedSearch.run({ query, limit })
New UnifiedSearchNamespace + 2 new exported types (UnifiedHit, UnifiedSearchResponse). Each hit carries kind ("document" | "external-message" | "external-document"), vendor ("kodori" | one of 6 connector kinds), score, source ("keyword" | "semantic" | "both"), snippet, and url. Patch bump (not minor) — this surface was pre-announced in D238's deferral note, so 0.2.1 tracks the implementation rather than the API design.
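Read as TypeScript, the hit described above might look roughly like the following. This is a sketch reconstructed from the field list, not the SDK's published type: the primitive types of `score`, `snippet`, and `url` are assumptions, the six connector kinds are taken from the `/api/v1/search/external` kind filter, and `groupByVendor` is a hypothetical consumer-side helper.

```typescript
type ConnectorKind = "slack" | "gmail" | "outlook" | "sharepoint" | "onedrive" | "google-drive";
type Vendor = "kodori" | ConnectorKind;

// Sketch of the hit shape; field types beyond the unions are assumed.
interface UnifiedHit {
  kind: "document" | "external-message" | "external-document";
  vendor: Vendor;
  score: number;
  source: "keyword" | "semantic" | "both";
  snippet: string;
  url: string;
}

// Hypothetical helper: group hits by vendor, keeping rank order inside each group.
function groupByVendor(hits: UnifiedHit[]): Map<Vendor, UnifiedHit[]> {
  const groups = new Map<Vendor, UnifiedHit[]>();
  for (const hit of hits) {
    const group = groups.get(hit.vendor) ?? [];
    group.push(hit);
    groups.set(hit.vendor, group);
  }
  return groups;
}
```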
- Shipped
OpenAPI 3.1 manifest + /me discovery list both updated
Both surfaces now advertise three search entry points: /search (Kodori docs only), /search/external (connectors only), /search/unified (cross-source). Permissive items schema on the unified response keeps the manifest forward-compatible while ranking improvements shape the response details.
- Improved
Stack-decisions D240 + scope §15 entry land alongside the code
D240 documents the deferral revisit (cost-benefit flip vs continued waiting), the patch-vs-minor bump rationale (this surface was pre-announced so version tracks impl), the permissive OpenAPI schema posture (parallels D237), and the per-source-filter / weight-knob revisit triggers.
v0.7.44 Agent conversation export → compliance evidence path with audit event + curl-able REST route (D239)
- Shipped
New /api/agent/conversations/[id]/export?format=md|txt route
Curl-able programmatic export path for compliance archives. Returns the full conversation transcript including turn-by-turn role / content + tool-call summaries. Per-user permission gate (only the conversation OWNER exports — tenant admins read agent activity via the existing per-tool audit events). Dual format: markdown default, plain text override.
- Fixed
exportAgentConversationAction now fires agent-conversation.exported audit event
The existing server action (wired to the "Download" button in the agent drawer) was missing audit-chain coverage — operators downloading conversations left no chain entry, so "who exported what when" was invisible to compliance queries. Now both the UI download button AND the new REST route emit the audit event. Best-effort insert (try/catch) so a transient DB issue on the audit side never blocks the download.
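The best-effort insert pattern can be sketched generically. `withBestEffortAudit` and its callbacks are hypothetical names for this example; the point is only that a failure on the audit side is caught while a failure in the export itself still propagates.

```typescript
// Runs the main action, then attempts the audit write without letting it fail the caller.
async function withBestEffortAudit<T>(
  action: () => Promise<T>,
  emitAudit: (result: T) => Promise<void>,
  onAuditError: (err: unknown) => void = () => {},
): Promise<T> {
  const result = await action(); // errors here are real failures and propagate
  try {
    await emitAudit(result);
  } catch (err) {
    onAuditError(err); // e.g. log it; the download still succeeds
  }
  return result;
}
```

The trade-off is that a transient audit outage produces a download with no chain entry, so routing `onAuditError` to a log or alert is worth considering.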
- Shipped
New `agent-conversation.exported` event type + /audit chip catalog entry
Lands on the user's `user/<id>/agent-conversations` stream with payload `{ conversationId, format, messageCount }`. /audit chip catalog adds the new type under the existing connectors-and-digests group so admins can filter "every agent-conversation export this quarter" with one click.
- Improved
Stack-decisions D239 + scope §15 entry land alongside the code
D239 documents the dual-path posture (server action for UI + REST route for compliance scripts), the per-user permission gate (parallels D139), the markdown-default + plain-text-override choice, and the best-effort audit-insert pattern (parallels D175 cost-tracker posture).
v0.7.43 unifiedSearch: single MCP tool fuses Kodori-doc + connector hits via Reciprocal Rank Fusion (D238)
- Shipped
New unifiedSearch MCP tool — cross-source ranked retrieval in one call
Fires hybridSearch (Kodori-native docs) + searchExternalContent (connector content) in parallel. Fuses the three resulting ranked lists (Kodori-document + external-message + external-document) via RRF k=60 — same constant as hybridSearch internally, so cross-source ranking stays calibrated with single-source results. Per-source limit of 30 → 60 total candidates fused → top-K final output (default 20, max 100). Each hit carries a `kind` discriminator and `vendor` tag so the agent can group results by source when the user wants source-awareness, but the ranking-by-best-overall is the more useful default.
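The fan-out and final cut can be sketched as below. The two search callbacks are hypothetical stand-ins for hybridSearch and searchExternalContent, the `{ messages, documents }` result shape is borrowed from the external-search REST response, and the sketch assumes ids never repeat across the three lists, so each hit's score is just its own reciprocal-rank term. The lower bound of 1 in `clampTopK` is also an assumption.

```typescript
type HitKind = "document" | "external-message" | "external-document";

interface Ranked { id: string; }
interface KindedHit { id: string; kind: HitKind; score: number; }

const PER_SOURCE_LIMIT = 30;
const FUSION_K = 60;
const DEFAULT_TOP_K = 20;
const MAX_TOP_K = 100;

function clampTopK(limit?: number): number {
  if (limit === undefined) return DEFAULT_TOP_K;
  return Math.max(1, Math.min(MAX_TOP_K, Math.floor(limit)));
}

async function unifiedSearchSketch(
  query: string,
  hybrid: (q: string, limit: number) => Promise<Ranked[]>,
  external: (q: string, limit: number) => Promise<{ messages: Ranked[]; documents: Ranked[] }>,
  limit?: number,
): Promise<KindedHit[]> {
  // Both legs run in parallel.
  const [docs, ext] = await Promise.all([
    hybrid(query, PER_SOURCE_LIMIT),
    external(query, PER_SOURCE_LIMIT),
  ]);
  const lists: Array<[HitKind, Ranked[]]> = [
    ["document", docs],
    ["external-message", ext.messages],
    ["external-document", ext.documents],
  ];
  const fused: KindedHit[] = [];
  for (const [kind, list] of lists) {
    list.forEach((hit, rank) => {
      fused.push({ id: hit.id, kind, score: 1 / (FUSION_K + rank + 1) });
    });
  }
  return fused.sort((a, b) => b.score - a.score).slice(0, clampTopK(limit));
}
```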
- Improved
Agent prompt now reaches for unifiedSearch on broad cross-source queries
Pre-D238 the agent had to make two MCP calls (hybridSearch + searchExternalContent) and reason about merging in its answer. Now: a "find every relevant thing about the Brennan matter" prompt resolves to one unifiedSearch call. Prompt explicitly distinguishes when to use unifiedSearch (source-agnostic) vs hybridSearch (Kodori-only) vs searchExternalContent (connectors-only) so source-explicit queries still hit the cheaper single-leg path.
- Fixed
Cedar SDK wiring (D218) — DEFERRED with documented rationale
Verified `@cedar-policy/cedar-wasm` v3.2.0 exists on npm (via Context7 docs lookup), but the JS-side `isAuthorized` signature isn't fully documented in the package README + Next.js 15 wasm bundling needs `webpack.experiments.asyncWebAssembly: true` config that could regress the existing build. Wiring without runtime testing on a real /policies surface is a real risk. The current "Allow always" stub keeps TS gates authoritative with zero functional regression, so deferral is safe. Will revisit when (a) a customer asks for ABAC enforcement or (b) we have a sandboxed runtime to test the JS bindings against.
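For reference, the webpack flag mentioned above is usually enabled through the `webpack` hook in the Next.js config. The following is a sketch of that config fragment, not Kodori's actual next.config; `WebpackConfigLike` is a local stand-in type so the example stays self-contained.

```typescript
// Minimal local stand-in for the webpack config object that Next.js passes to the hook.
interface WebpackConfigLike {
  experiments?: Record<string, unknown>;
}

const nextConfig = {
  webpack(config: WebpackConfigLike): WebpackConfigLike {
    // Preserve any experiments already set, then opt in to async wasm modules.
    config.experiments = { ...config.experiments, asyncWebAssembly: true };
    return config;
  },
};

// In a real next.config.ts, `nextConfig` would be the file's default export.
```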
- Improved
Stack-decisions D238 + scope §15 entry land alongside the code
D238 documents the new-tool-vs-kind-filter decision, the per-source-30 / RRF-k=60 calibration rationale, the across-three-rankings-as-one-union vs per-kind-then-merge fusion choice, and the no-REST-endpoint-yet posture (SDK consumers merge client-side until proven worth a third public search surface).
v0.7.42 Public REST + TypeScript SDK 0.2.0 expose connector content: programmatic access without /api/mcp (D237)
- Shipped
POST /api/v1/search/external — public REST surface for searchExternalContent
Mirrors `/api/v1/search` shape: query + optional kind filter (slack | gmail | outlook | sharepoint | onedrive | google-drive) + limit; returns `{ messages, documents }` with snippets + vendor URLs. Tenant-scoped (only authorized connectors contribute). Required scope: `search:read` (baseline — no new scope to onboard). Same FTS + pgvector + Reciprocal Rank Fusion retrieval as the internal `searchExternalContentTool`.
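A minimal sketch of building the request without the SDK. The path and body fields come from the entry above; the `Authorization: Bearer` header, the `baseUrl` and `apiKey` parameters, and the helper name are assumptions for illustration.

```typescript
type ConnectorKind = "slack" | "gmail" | "outlook" | "sharepoint" | "onedrive" | "google-drive";

interface ExternalSearchBody {
  query: string;
  kind?: ConnectorKind;
  limit?: number;
}

// Hypothetical helper: returns the URL and init object to pass to fetch().
function buildExternalSearchRequest(baseUrl: string, apiKey: string, body: ExternalSearchBody) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/search/external`,
    init: {
      method: "POST" as const,
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`; per the entry above, the parsed response would carry `messages` and `documents` arrays.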
- Shipped
GET /api/v1/connectors — read-only listing with content + extraction counts inline
Lists every external connector configured for the tenant with status / kind / displayName / lastSync timestamps + aggregate content counts (messageCount / documentCount / extractedCount / extractionFailedCount) computed via correlated subqueries (sub-millisecond at typical tenant scale, single round-trip). Returns NO tokens, NO scope strings, NO config payloads — admin UI at /integrations is the path for those. Required scope: `search:read`.
- Shipped
@kumokodo/kodori-sdk 0.2.0 — two new namespaces, 11 new types
New `kodori.externalSearch.run({ query, kind?, limit? })` and `kodori.connectors.list()` methods. New types include ConnectorKind, ConnectorStatus, ConnectorSummary, ConnectorContentSummary, ConnectorListResponse, ExternalMessageHit, ExternalDocumentHit, and ExternalSearchResponse. Zero breaking changes from 0.1.1; existing consumers don't need to change anything. Minor version bump (semver) reflects net-new public surface without API contract changes.
- Shipped
OpenAPI 3.1 manifest at /api/openapi.json includes both new endpoints
Drop into Postman, Insomnia, or Stoplight Studio for typed request building. Permissive `items: { type: object }` schema definitions on the new responses keep the manifest forward-compatible while ranking improvements (semantic-rerank, better RRF tuning) shape the response details. /api/v1/me identity-probe endpoint discovery list extended to advertise both new endpoints.
- Improved
Stack-decisions D237 + scope §15 entry land alongside the code
D237 documents the search:read-baseline-scope vs new-scope-axis decision (no new onboarding friction) + correlated-subqueries vs separate-stats-endpoint posture + two-namespaces vs one-method-with-kind decision + 0.2.0 minor bump rationale + permissive OpenAPI schema posture.
v0.7.41 Agent prompt teaches renameCollection: closes the Roy-flagged tool that shipped without prompt guidance
- Fixed
Agent now reaches for renameCollection on "rename matter / collection" prompts
v0.7.31.1 shipped renameCollection (the Roy-flagged gap fix) but the agent's system prompt only documented renameDocument. Result: a "rename the Smith matter to Smith v Acme NDA" prompt would either match a single document or get fumbled. New prompt section now explicitly distinguishes renameCollection (collection-targeted, preserves membership + rules + ACL) from renameDocument (document-targeted, preserves bytes + hash), with the "rename the matter / collection" natural-language mapping. Owner / admin or creator only — same blast-radius posture documented.
v0.7.40 Agent system prompt teaches searchExternalContent: closes the connector-aware-retrieval gap (D236)
- Fixed
Agent prompt now describes connector content + when to use searchExternalContent
Pre-D236 the agent had searchExternalContent in its tool catalog (since v0.7.32) but the system prompt never taught it WHEN to reach for it. Result: a "search Slack for the Brennan matter" prompt would default to hybridSearch over Kodori-native docs and miss every Slack message. The new Connector content section in `packages/agent/src/prompt.ts` documents the tool, gives natural-language → tool-call mappings ("search Slack...", "find that contract Bob emailed", "what's in SharePoint about..."), and explicitly frames searchExternalContent as COMPLEMENTARY to hybridSearch (not a replacement) so broad questions fire both in parallel.
- Improved
Citation guidance: cite the vendor permalink URL alongside the source
Connector hits return a `url` field linking back to the original (Slack thread URL, Gmail web view, Drive webViewLink, etc.). The prompt now requires the agent to cite those URLs in answers — operators want one click back to the original vendor view, not just the snippet quote.
- Improved
Tenant-scoped behavior documented — no false-positive empty results
Prompt now teaches: "If the user asks about a vendor with no connector authorized, say so explicitly rather than returning empty results." Pre-D236 a query like "search our Drive for the Q3 deck" against a tenant with no Drive connector would silently return an empty list — confusing operators who didn't realize the connector wasn't set up.
v0.7.39 Re-authorize button on /integrations/[id]: clean operator-paced migration path for additive scope changes (D235)
- Shipped
Re-authorize button — picks up new scopes without revoke + reconnect
New Re-authorize affordance on /integrations/[id] for connected and paused connectors. Routes to the per-kind OAuth start URL (`/api/oauth/connect/slack` for Slack, `/api/oauth/connect/microsoft?kind=<kind>` for the MS trio, `/api/oauth/connect/google?kind=<kind>` for Gmail/Drive). Re-OAuth refreshes tokens AND grants new scopes via the existing `createConnectorAction.onConflictDoUpdate` path — sync state, audit chain, indexed content all stay intact. The clean migration path for additive scope changes (e.g. Slack `files:read` from v0.7.36 lands without forcing every customer through revoke + reconnect).
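A minimal sketch of the per-kind routing described above. The URL shapes come from this entry; the helper names `reauthorizeUrl` and `showReauthorize` and the two union types are hypothetical, not the shipped code.

```typescript
// Connector kinds named in this changelog.
type Kind =
  | "slack"
  | "gmail"
  | "google-drive"
  | "outlook"
  | "sharepoint"
  | "onedrive";

// Status values taken from the entry text (connected / paused / revoked).
type Status = "connected" | "paused" | "revoked";

// Hypothetical helper: map a connector kind to its OAuth start URL.
function reauthorizeUrl(kind: Kind): string {
  switch (kind) {
    case "slack":
      return "/api/oauth/connect/slack";
    case "outlook":
    case "sharepoint":
    case "onedrive":
      return `/api/oauth/connect/microsoft?kind=${kind}`;
    case "gmail":
    case "google-drive":
      return `/api/oauth/connect/google?kind=${kind}`;
  }
}

// Hypothetical visibility rule: shown for connected and paused connectors only.
function showReauthorize(status: Status): boolean {
  return status === "connected" || status === "paused";
}
```

Keeping the mapping in one exhaustive switch means a new connector kind fails type-checking until it gets a route.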
- Shipped
Scope-outdated hint — auto-detects when a re-auth is worth the click
When a connector's stored scopes don't include the latest expected scope for its kind, the page renders an amber callout above the status panel: "Scope upgrade available. Click Re-authorize to grant the additional scope — your existing sync state, audit chain, and indexed content stay intact." First triggered for Slack connectors authorized pre-D232 (no `files:read`). Forward-compatible: adding new scopes per kind in the future just updates the per-kind expectedLatestScope mapping.
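The scope check behind the callout can be sketched as follows. Only the Slack `files:read` entry is grounded in this changelog; the mapping's exact shape and the `isScopeOutdated` helper are assumptions for illustration.

```typescript
// Assumed shape of the per-kind mapping; only the Slack entry comes from the changelog.
const expectedLatestScope: Record<string, string | undefined> = {
  slack: "files:read",
};

// Hypothetical helper: true when the stored scopes lack the latest expected scope.
function isScopeOutdated(kind: string, storedScopes: string[]): boolean {
  const expected = expectedLatestScope[kind];
  if (expected === undefined) return false; // no expectation recorded for this kind
  return !storedScopes.includes(expected);
}
```

Under this sketch, adding a scope for another kind is a one-line change to the mapping.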
- Improved
Hidden on revoked connectors — re-activation flow stays clean via /integrations Connect
Revoked connectors don't show Re-authorize. Operators reactivate via the regular Connect button on /integrations, which preserves the explicit-decision UX ("I revoked this; I want to start fresh") rather than a silent un-revoke from the detail page.
v0.7.38 Manual "Retry failed" button on /integrations/[id]: operators triage extraction failures without waiting on the 6h cron
- Shipped
New `retryFailedExtractionsAction` server action
Owner / admin only. Selects every external_documents row for the connector with `extraction_error IS NOT NULL` and fans out `external-document/extract.requested` Inngest events. Capped at 500 rows per click; if more remain, the button reports `Retried 500 (cap hit; click again for more)` so operators know to keep going.
- Shipped
Retry button inline on the extraction status panel
Appears next to the failed-count pill on /integrations/[id] when `extractionStats.failed > 0`. Disappears automatically when there are no failures. Works for any document-bearing connector (SharePoint, OneDrive, Drive, Slack files, Outlook attachments, Gmail attachments). Refreshes the page after dispatch so the new pending count reflects on next visit.
- Improved
Use case: post-outage recovery in 1 click
Anthropic 503 storm leaves 200 documents stuck with `extraction_error`. Pre-D234 you wait up to 6h for the retry-cron. Now: click Retry failed → 200 events fire → most extract within 30s. The 6h cron remains the long-tail safety net for outages that happen overnight or for failures the manual click missed.
v0.7.37 Outlook + Gmail attachments → external_documents: message-vendor attachment story complete (D233)
- Shipped
Outlook attachments — Graph fan-out per message-with-attachments
outlookSyncWorker now detects hasAttachments=true messages and fans out per-message `/messages/{id}/attachments?$select=id,name,contentType,size,isInline` calls (50-message cap per run). Filters to fileAttachment kind only: skips inline images (rendered in body), itemAttachment (forwarded-email recursion), and referenceAttachment (those land via the OneDrive/SharePoint connector instead, which avoids duplicate rows). Compound externalId `<messageId>:<attachmentId>`.
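A sketch of the filter and ID scheme described above, assuming each attachment carries Graph's `@odata.type` discriminator. The interface and helper names are hypothetical.

```typescript
// Subset of a Microsoft Graph attachment, limited to the fields the entry selects.
interface GraphAttachment {
  "@odata.type": string;
  id: string;
  name: string;
  contentType: string;
  size: number;
  isInline: boolean;
}

// Keep only real file attachments: not inline images, not item or reference attachments.
function shouldIngest(a: GraphAttachment): boolean {
  return a["@odata.type"] === "#microsoft.graph.fileAttachment" && !a.isInline;
}

// Compound externalId in the `<messageId>:<attachmentId>` form.
function compoundExternalId(messageId: string, attachmentId: string): string {
  return `${messageId}:${attachmentId}`;
}
```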
- Shipped
Gmail attachments — walk payload.parts for body.attachmentId
gmailSyncWorker extended to traverse payload.parts looking for parts with `body.attachmentId` set (the marker for "this is a binary attachment, fetch separately"). No extra API calls — `users.messages.get?format=full` already returns the attachment IDs in the payload tree. Filename extraction prefers `payload.filename`, falls back to Content-Disposition header parsing.
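The payload walk can be sketched as a recursive traversal. The part shape follows the Gmail API message payload; `collectAttachments` is a hypothetical name, and the Content-Disposition fallback mentioned above is left out for brevity.

```typescript
// Subset of a Gmail message part (the payload itself is the root part).
interface GmailPart {
  filename?: string;
  mimeType?: string;
  body?: { attachmentId?: string; size?: number };
  parts?: GmailPart[];
}

interface AttachmentRef {
  attachmentId: string;
  filename: string;
  mimeType: string;
}

// Depth-first walk: any part whose body carries an attachmentId is a binary attachment.
function collectAttachments(part: GmailPart, out: AttachmentRef[] = []): AttachmentRef[] {
  const id = part.body?.attachmentId;
  if (id) {
    out.push({
      attachmentId: id,
      filename: part.filename ?? "",
      mimeType: part.mimeType ?? "application/octet-stream",
    });
  }
  for (const child of part.parts ?? []) {
    collectAttachments(child, out);
  }
  return out;
}
```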
- Shipped
Outlook + Gmail byte-fetchers
`fetchOutlookAttachmentBytes` uses Graph `/me/messages/{messageId}/attachments/{attachmentId}/$value` — raw-bytes endpoint, no decode step. `fetchGmailAttachmentBytes` uses `users.messages.attachments.get` (JSON envelope) + local base64url decode (Gmail has no raw-bytes endpoint). Both honor the 50MB byte cap matching Kodori upload limits. Both wired into the extract dispatch alongside SP/OD/Drive/Slack.
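For the Gmail side, the local decode step might look like the sketch below. It assumes Node's `Buffer` with the `base64url` encoding; the function name and the choice to throw on oversize input are illustrative, not the shipped behavior.

```typescript
// 50MB cap, matching the upload limit named in the entry above.
const MAX_ATTACHMENT_BYTES = 50 * 1024 * 1024;

// Decode the base64url `data` field from the Gmail JSON envelope and enforce the cap.
function decodeGmailAttachment(data: string, maxBytes: number = MAX_ATTACHMENT_BYTES): Buffer {
  const bytes = Buffer.from(data, "base64url");
  if (bytes.length > maxBytes) {
    throw new Error(`attachment exceeds ${maxBytes} bytes`);
  }
  return bytes;
}
```

The Outlook path needs no equivalent step because `$value` already returns raw bytes.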
- Shipped
All 6 connectors now emit document rows for file content
After D232 (Slack files) + D233 (Outlook + Gmail attachments), every connector kind that can carry attached files surfaces them in external_documents. The extraction pipeline (D229) + retry sweep (D231) + status panel (D230) all work uniformly across vendors. Cross-vendor agent search via searchExternalContent answers "find every contract attached anywhere this quarter" with one MCP call.
- Improved
Stack-decisions D233 + scope §15 entry land alongside the code
D233 documents the per-message fan-out vs $expand decision + fileAttachment filtering rationale (skip itemAttachment / referenceAttachment) + isInline skip + compound externalId pattern + per-vendor byte-fetcher endpoint differences (Outlook $value vs Gmail JSON envelope).
v0.7.36 Slack file attachments: uploads in Slack channels now flow into external_documents and become extractable + searchable (D232)
- Shipped
Slack files → external_documents — extends slackSyncWorker
Migration 0087 relaxes the external_documents vendor_kind CHECK constraint to allow message-vendor attachments. slackSyncWorker now walks `files.list?ts_from=<cursor>` per sync run with a 200-file-per-run cap; new cursor `slack-files-ts-from` is the Unix timestamp of the highest-seen file create time. Files persist with `vendorKind=slack` + raw jsonb carrying the `urlPrivateDownload` URL.
- Shipped
Additive `files:read` scope on Slack OAuth — existing connectors keep working
Adding `files:read` to SLACK_SCOPES is a graceful change: existing connectors without the scope hit `missing_scope` on `files.list`, the worker catches the error and skips file sync (messages still work). Customers who want files click "Re-authorize" on /integrations to re-OAuth with the additional scope.
- Shipped
Slack byte-fetcher + extract dispatch wiring
New `fetchSlackFileBytes` helper at apps/web/lib/connectors/extract/slack.ts: GET `urlPrivateDownload` with Bearer bot-token, 50MB byte cap matching Kodori upload limits. Wired into the extract function dispatch alongside SP/OD/Drive. The existing extractor cascade (Azure DocIntel → Office adapters → Whisper → DocAI → Claude PDF → builtin-text) handles whatever bytes land — a Slack-shared PDF, .docx, image, audio note, or anything else.
- Shipped
Extract gate loosened — fire on any newly-inserted document row
Sync orchestrator no longer checks a per-kind allowlist before firing extract events. Now fires whenever `newlyInsertedDocs.length > 0`. Message-pure connectors emit zero document rows from their workers so the gate is naturally empty for them; future kinds slot in without per-kind allowlist edits.
- Improved
/integrations/[id] handles dual-shape connectors (messages + documents)
Doc-section gating updated so Slack shows BOTH the messages list AND a documents list when files are present. Extraction status panel ungated from kind so Slack file extraction stats appear there too. Same UI for any future dual-shape connector.
- Improved
Stack-decisions D232 + scope §15 entry land alongside the code
D232 documents the extend-the-worker-vs-new-kind decision + additive-scope migration path + per-file vs per-channel cursor + url_private_download from raw vs separate files.info. Five-surface doc sweep posture preserved.
v0.7.35 Connector production hardening: per-connector cadence override, extraction retry cron, extraction status panel (D230 + D231)
- Shipped
Per-connector sync cadence override — 5 min to 24 hours, per row
Migration 0086 adds `sync_interval_minutes` to external_connectors with CHECK 5..1440 (or NULL for the global 30-min default). The recurring sync cron honors the per-row override via cutoff = cadence-5min so a 5-min cadence connector syncs every tick while a 24h archive syncs once a day. UI on /integrations/[id] presents 7 presets (Default / 5 min / 15 min / 30 min / 1h / 4h / 24h). New `external-connector.cadence-set` audit event lands on the hash-chained log when admins change a value.
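The cutoff rule can be sketched as below, assuming the check runs once per cron tick. The function name, argument shape, and the "never synced means due" branch are assumptions.

```typescript
const DEFAULT_INTERVAL_MINUTES = 30; // global default when the column is NULL
const CUTOFF_SLACK_MINUTES = 5;      // cutoff = cadence - 5 min

// True when the connector's last completed sync is older than its cutoff.
function isDueForSync(
  lastSyncCompletedAt: Date | null,
  syncIntervalMinutes: number | null,
  now: Date,
): boolean {
  if (lastSyncCompletedAt === null) return true; // assumption: never synced counts as due
  const cadence = syncIntervalMinutes ?? DEFAULT_INTERVAL_MINUTES;
  const cutoffMs = (cadence - CUTOFF_SLACK_MINUTES) * 60 * 1000;
  return lastSyncCompletedAt.getTime() < now.getTime() - cutoffMs;
}
```

With the default cadence this reduces to the 25-minute cutoff used by the recurring sync cron, and a 5-minute cadence yields a zero cutoff, so the connector is picked up on every tick.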
- Shipped
Extraction retry sweep — recovers from transient extractor outages
Separate Inngest cron at `0 */6 * * *` selects `external_documents` WHERE `extracted_at IS NULL AND (extraction_error IS NOT NULL OR created_at < now-1h)` with parent connector still connected, capped at 200 rows per tick. Filters to sharepoint/onedrive/google-drive (the document-bearing kinds; messages don't need extraction). Re-fires `external-document/extract.requested` events. Permanent failures (oversize, unsupported mime, 404) still stick to the row but get one cheap retry every 6h before stopping; transient outages (Anthropic 503, Azure rate-limit storm, Inngest dispatch loss) recover within 1-4 retry shots.
- Shipped
Extraction status panel on /integrations/[id]
For document-kind connectors (SharePoint, OneDrive, Google Drive), the detail page now shows extracted/pending/failed/total counts in color-coded pills (emerald/neutral/amber). Backed by `count(*) filter (where ...)::int` aggregations — single query, sub-millisecond at scale. Hidden when no documents have synced yet. Pending counter footnotes the 6h retry-cron cadence so admins know when to expect recovery.
- Improved
Stack-decisions D230 + D231 + scope §15 + new audit chip land alongside the code
D230 documents the JS-side filter vs in-SQL CASE choice + 5/1440 clamp + cadence-set event. D231 documents the 6h cadence + select-stale-rows + 200-row cap. /audit page chip catalog adds `external-connector.cadence-set` to the connector group.
v0.7.34 Connector text extraction (D229): SharePoint/OneDrive/Drive files become first-class in retrieval
- Shipped
External-document text extraction — sync fires extract events for new files
Closes the biggest remaining quality gap in connector retrieval. Pre-D229, file-as-document sync persisted SharePoint/OneDrive/Drive rows with `text: null` — searchExternalContent FTS over `(name, text)` only matched filenames, missing body content. Now: sync function captures newly-inserted document IDs via `onConflictDoNothing().returning(...)` and fires per-document `external-document/extract.requested` events. New `extract-external-document` Inngest function (concurrency.key=documentId) consumes the events, downloads bytes via vendor-specific helpers, runs the existing extractor cascade (Azure Doc Intel → Office → illustrator-ai → Whisper → Google DocAI → Claude PDF → builtin-text), and UPDATEs `external_documents.text` + `extracted_at`.
- Shipped
Per-vendor byte fetchers — SharePoint, OneDrive, Google Drive (binary + native)
SharePoint via Graph `/drives/{driveId}/items/{itemId}/content` (the externalId we stored is `<driveId>/<itemId>` per D223 to disambiguate cross-site files). OneDrive via Graph `/me/drive/items/{id}/content`. Google Drive splits paths: native types (Docs/Sheets/Slides) export as text via `/files/{id}/export?mimeType=text/plain`; everything else fetches raw bytes via `/files/{id}?alt=media`. 50MB byte cap matches the Kodori upload limit; 2MB stored-text cap (covers ~500 pages dense prose) keeps storage bounded.
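A sketch of the Google Drive path split, using the two endpoint shapes quoted above. The helper name is hypothetical, and the set of native MIME types is an assumption based on the Docs/Sheets/Slides list in the entry.

```typescript
// Google-native types named in the entry (Docs / Sheets / Slides).
const NATIVE_MIME_TYPES = new Set<string>([
  "application/vnd.google-apps.document",
  "application/vnd.google-apps.spreadsheet",
  "application/vnd.google-apps.presentation",
]);

// Hypothetical helper: native files go through export, everything else is fetched raw.
function driveDownloadPath(fileId: string, mimeType: string): string {
  const id = encodeURIComponent(fileId);
  if (NATIVE_MIME_TYPES.has(mimeType)) {
    return `/files/${id}/export?mimeType=text/plain`;
  }
  return `/files/${id}?alt=media`;
}
```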
- Shipped
Migration 0085 — `extracted_at` + `extraction_error` columns on external_documents
`extracted_at IS NULL AND extraction_error IS NULL` is the "haven't tried" sentinel; `extraction_error IS NOT NULL` is the "failed, retry candidate" state. Partial index `WHERE extracted_at IS NULL` makes the retry-cron query cheap even on large tenants. Failures persist for triage; transient failures get retried by Inngest's step-retry; permanent failures (oversize, unsupported mime, 404) stick on the row.
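The two-column state encoding reads as a small classifier. The row shape and state names below are hypothetical, and treating any row with `extracted_at` set as extracted is an assumption.

```typescript
type ExtractionState = "extracted" | "untried" | "failed";

// In-memory stand-in for the two columns added by migration 0085.
interface ExtractionColumns {
  extractedAt: Date | null;
  extractionError: string | null;
}

function extractionState(row: ExtractionColumns): ExtractionState {
  if (row.extractedAt !== null) return "extracted";
  // extracted_at IS NULL from here on: the error column separates the two cases.
  return row.extractionError === null ? "untried" : "failed";
}
```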
- Shipped
Slack / Gmail / Outlook excluded by design — message kinds already carry text
The orchestrator dispatches extract events ONLY for sharepoint / onedrive / google-drive. Message kinds (slack, gmail, outlook) already carry plain-text body in the messages worker output (Gmail HTML stripped, Slack plain by default, Outlook HTML stripped). Firing extract events for messages would be a no-op that wastes Inngest dispatches.
- Improved
Stack-decisions D229 + scope §15 entry land alongside the code
D229 documents the sync-fires-extract event chain reasoning + onConflictDoNothing-returning capture pattern + per-vendor byte fetcher rationale + storage caps. Five-surface doc sweep posture preserved.
v0.7.33 Connector loop completion: recurring sync cron + FTS+semantic search upgrade + Gmail + Google Drive (6/6 vendors live) + tool-count accuracy fix (D225-D228)
- Shipped
Recurring sync cron — every 30 min, every connected connector stays fresh
Single Inngest cron at `*/30 * * * *` selects connectors with `status=connected` AND `lastSyncCompletedAt < now-25min` and fans out one sync event per row (capped at 200 per tick). The downstream `external-connector-sync` function is concurrency-keyed on connectorId so a manual "Sync now" mid-tick + the cron tick serialize cleanly per connector but parallelize across connectors. No operator action needed — your Slack/Outlook/SharePoint/OneDrive/Gmail/Drive content stays current automatically.
- Shipped
searchExternalContent — FTS + pgvector + RRF, mirrors hybridSearch quality
Connector content becomes first-class in retrieval. Migration 0084 adds GIN expression indexes on `setweight(subject,A) || setweight(body,B)` for messages and `setweight(name,A) || setweight(text,B)` for documents. The tool now combines Postgres FTS (`websearch_to_tsquery` + `ts_rank_cd` + `ts_headline` snippets) with pgvector cosine similarity over the existing HNSW indexes via Reciprocal Rank Fusion (k=60) — same shape as hybridSearch over Kodori documents. Output adds `score` + `source` ('keyword'|'semantic'|'both') so the agent can reason about hit strength. Query embedding via OpenAI text-embedding-3-small (graceful fallback to FTS-only when `OPENAI_API_KEY` unset).
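Reciprocal Rank Fusion itself is compact. The sketch below fuses two ranked ID lists with k=60 and tags each hit with its source, mirroring the `score` + `source` output described above. Function and type names are illustrative, and the real tool works on database rows, not bare IDs.

```typescript
type HitSource = "keyword" | "semantic" | "both";

interface FusedHit {
  id: string;
  score: number;
  source: HitSource;
}

// Each list contributes 1 / (k + rank) per hit, with rank starting at 1.
function reciprocalRankFusion(keywordIds: string[], semanticIds: string[], k = 60): FusedHit[] {
  const hits = new Map<string, FusedHit>();

  const add = (ids: string[], source: "keyword" | "semantic"): void => {
    ids.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      const existing = hits.get(id);
      if (existing) {
        existing.score += contribution;
        if (existing.source !== source) existing.source = "both";
      } else {
        hits.set(id, { id, score: contribution, source });
      }
    });
  };

  add(keywordIds, "keyword");
  add(semanticIds, "semantic");

  // Highest fused score first.
  return Array.from(hits.values()).sort((a, b) => b.score - a.score);
}
```

A hit that appears in both lists gets two contributions, so it outranks a hit that tops only one list. When no query embedding is available, passing an empty semantic list degrades this to keyword order.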
- Shipped
Gmail + Google Drive connectors — 6/6 vendor coverage
Single shared OAuth flow at `/api/oauth/{connect,callback}/google?kind=gmail|google-drive`. State value `<nonce>.google:<kind>` carries the vendor through the round-trip — same pattern as the Microsoft Graph trio. Env fallback to `AUTH_GOOGLE_ID/SECRET` when `GOOGLE_OAUTH_*` unset (cuts activation friction in half for sign-in-with-Google deployments). `prompt=consent` forces refresh-token re-issuance so re-Connect after a revoke doesn't silently break sync. `gmailSyncWorker` walks `users.history.list` (delta) + `users.messages.get` per-message with HTML body stripping. `googleDriveSyncWorker` walks `changes.list` (delta) + bootstraps via `changes.getStartPageToken` + `files.list orderBy modifiedTime desc`. All 6 connector kinds (Slack, Gmail, Outlook, SharePoint, OneDrive, Google Drive) now ship E2E.
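The `<nonce>.google:<kind>` state value can be built and parsed as sketched below. Helper names are hypothetical, and the sketch assumes the nonce never contains the `.google:` marker.

```typescript
type GoogleKind = "gmail" | "google-drive";

const STATE_MARKER = ".google:";

function buildState(nonce: string, kind: GoogleKind): string {
  return `${nonce}${STATE_MARKER}${kind}`;
}

// Returns null for anything that is not a well-formed Google state value.
function parseState(state: string): { nonce: string; kind: GoogleKind } | null {
  const at = state.lastIndexOf(STATE_MARKER);
  if (at <= 0) return null; // marker missing or nonce empty
  const nonce = state.slice(0, at);
  const kind = state.slice(at + STATE_MARKER.length);
  if (kind !== "gmail" && kind !== "google-drive") return null;
  return { nonce, kind };
}
```

Rejecting unknown kinds at parse time keeps a tampered state value from selecting a worker that was never requested.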
- Fixed
MCP tool count accuracy — `120+ tools` → `75+ tools` across all surfaces
Marketing copy across /, /compare/{imanage,netdocuments,filehold}, /features, and 2 help articles claimed `120+ tools` — inflated by ~50%. Actual count is 76 static + 2 factory-built = 78 total. New `MCP_TOOL_COUNT_LABEL` exported from `packages/mcp/src/tools/index.ts` with a snap-down formula (`${Math.floor(total/5)*5}+`) so the user-visible label stays stable across small additions and only triggers a manual copy update when the count crosses a 5-tool threshold. Single source of truth in code; downstream marketing copy updates when the threshold actually moves. Closes the credibility-hit-on-buyer-counting risk.
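The snap-down formula quoted above, as a standalone function. `toolCountLabel` is a hypothetical name; the exported constant in the entry is `MCP_TOOL_COUNT_LABEL`.

```typescript
// Round the total down to the nearest multiple of 5 and append "+".
function toolCountLabel(total: number): string {
  return `${Math.floor(total / 5) * 5}+`;
}

// 76 static + 2 factory-built tools, per the entry above.
const label = toolCountLabel(76 + 2); // "75+"
```

The label only changes when the count crosses a multiple of 5: 79 still reads `75+`, while 80 reads `80+`.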
- Improved
Stack-decisions D225 / D226 / D227 / D228 + scope §15 + 4 new help-article entries
D225 documents the 30-min cadence + 25-min cutoff + fan-out cap reasoning. D226 documents the FTS+pgvector+RRF mirror of hybridSearch + graceful-fallback when no OpenAI key. D227 documents the shared Google OAuth + Auth.js env fallback + prompt=consent posture + Gmail historyId vs Drive page-token bootstrap shapes. D228 documents the snap-down marketing-count constant. Five-surface doc sweep ships in lockstep with the code.
v0.7.32 Slack + Microsoft Graph trio (Outlook / SharePoint / OneDrive) end-to-end with searchable synced content, plus Vercel turbo remote-cache wiring (D222-D224)
- Shipped
Slack connector E2E — OAuth + per-channel sync + searchable in /integrations/[id]
Bot-token OAuth v2 (`oauth.v2.access`) at `/api/oauth/connect/slack` and `/api/oauth/callback/slack`; CSRF state cookie via timing-safe-equal; `slackSyncWorker` walks `conversations.list` + `conversations.history` with per-channel `oldest`-ts cursor; messages persist to new `external_messages` table with HNSW pgvector index for the next-batch semantic upgrade. Deterministic Slack permalink format avoids the 1-RPS `chat.getPermalink` rate limit. Set `SLACK_CLIENT_ID` + `SLACK_CLIENT_SECRET` to enable; the /connect route returns a clear error if env is unset. /help/integrations + /help/canvas-search-external-content cover the rollout.
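A sketch of the timing-safe state comparison, using Node's `crypto.timingSafeEqual`. The function name and the cookie/query argument split are assumptions about how the callback is wired.

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare the state stored in the cookie with the one echoed back on the callback.
function stateMatches(cookieState: string, queryState: string): boolean {
  const a = Buffer.from(cookieState, "utf8");
  const b = Buffer.from(queryState, "utf8");
  // timingSafeEqual throws on unequal lengths, so reject those up front.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```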
- Shipped
Microsoft Graph trio — Outlook + SharePoint + OneDrive in one OAuth dance
Single shared OAuth at `/api/oauth/connect/microsoft?kind=<outlook|sharepoint|onedrive>` and `/api/oauth/callback/microsoft`; state shape `<nonce>.microsoft:<kind>` carries vendor through the round-trip. Tenant defaults to `/common` (override via `MICROSOFT_OAUTH_TENANT`). All three workers use the Graph delta-cursor pattern (`@odata.deltaLink`): outlookSyncWorker → `external_messages` (HTML stripped to plain text), onedriveSyncWorker → `external_documents` (skip folders + soft-deletes), sharepointSyncWorker → /me/followedSites → /drives → /root/delta per drive. Refresh-token rotation handled. MAX_PAGES_PER_RUN=10 + MAX_ITEMS_PER_RUN=500 bound first-time backlog imports. Set `MICROSOFT_OAUTH_CLIENT_ID` + `MICROSOFT_OAUTH_CLIENT_SECRET` to enable.
- Shipped
New `external-connector-sync` Inngest function with concurrency.key=connectorId
Discriminated worker map per kind; persists messages + documents + cursor advances atomically per run inside a single transaction. Emits `external-connector.sync-completed` (payload includes counts + cursorAdvances) on success or `external-connector.sync-failed` on error — both land on the hash-chained audit log. /integrations/[id] adds a Sync now button (refuses on paused / revoked); cron-driven recurring sync is the next batch.
- Shipped
/integrations/[id] connector detail page — kind-aware recent-content view
New page renders status badge, scopes, active cursor count, last sync timestamp + error (if any), and the 50 most recent synced messages OR documents depending on the connector kind. Each row links back to the vendor permalink + shows author / channel / sent-at metadata. Sync now button triggers an Inngest event from the page; revalidation refreshes the recent-content list on next visit.
- Shipped
searchExternalContent MCP tool — agent searches across all synced connectors
New typed MCP tool: query string + optional kind filter (slack | gmail | outlook | sharepoint | onedrive | google-drive) + limit; returns ranked messages + documents with snippets and vendor URLs. Tenant-scoped (only authorized connectors). v1 uses ILIKE substring match; FTS + pgvector semantic upgrade is one batch away (embeddings already persist on both tables). The agent can now answer "search across our Slack and SharePoint for the Brennan matter" via a single MCP tool call.
- Improved
Vercel build cost — kodori-web routed through turbo for remote cache hits
Diagnosis: `vercel-build` was calling `next build` directly, bypassing turbo entirely — Vercel's auto-injected `TURBO_TOKEN` + `TURBO_TEAM` (Pro team auto-provisioning) had no effect because turbo never ran on the build machine. Fix: `pnpm turbo run build --filter=@kumokodo/web` is now the build path. Local validation: `pnpm typecheck` 56s → 825ms post-link ("FULL TURBO"). First Vercel-side post-change deploy was a cache miss (Linux-OS hash differs from Windows-local hash) but populated the Linux cache; subsequent deploys with no apps/web/** churn should hit. Goal: cut the $63.77/cycle kodori-web spend (~46% of total Vercel portfolio) by replacing cold rebuilds with cache hits. `db:migrate` stays outside turbo intentionally — runtime side effects shouldn't be cached.
- Improved
Stack-decisions D222 / D223 / D224 + scope §15 entries land alongside the code
D222 documents the Slack bot-token + per-channel-cursor + deterministic-permalink calls. D223 documents the shared Microsoft OAuth + Graph delta + file-as-text-not-mirror posture. D224 documents the Vercel build path + db:migrate-stays-outside-turbo + first-deploy-misses-Linux-hash decisions. Five-surface doc sweep posture preserved.
v0.7.31.1 Rename a collection: `renameCollection` MCP tool + inline rename form on /collections/[id] (Roy-flagged gap)
- Shipped
New `renameCollection` MCP tool — change a collection's display name without touching membership or rules
Mirrors the existing renameDocument shape: input is collectionId + name (max 512 chars), output is `{ changed, previous }`. Permission gate is the same blast-radius posture as renameDocument — creator OR tenant owner / admin. Refuses on a soft-deleted collection. Idempotent on no-op rename (returns changed: false without an audit event). Available to the agent through the MCP catalog and to the UI through a new `renameCollectionAction` server action.
- Shipped
Inline rename UI on /collections/[id]
Native `<details>` + `<summary>` toggle next to the H1 — admin / owner / creator sees a "Rename" link that opens an inline form pre-filled with the current name. Submit reloads the page with the new name. No client-side JS, no modal, no extra round-trip; the rename is a single server action that revalidates `/collections` + `/collections/[id]` so navigation reflects immediately.
- Shipped
New `collection.renamed` event type on the hash-chained audit log
Lands on the collection's stream alongside `collection.created`, `.member-added`, `.member-removed`, `.rule-updated`. Payload carries `{ collectionId, previous, next }` so a `/audit?eventType=collection.renamed&actor=<user>` query answers "who renamed which matter when" without a JOIN. /audit page chip catalog adds it under the existing Collections group.
- Fixed
Closes the gap Roy flagged — collections previously had no rename path
createCollection wrote a name; setCollectionRule could change rules; addDocumentToCollection / removeDocumentFromCollection moved members; but the name itself was effectively immutable post-creation, forcing operators to recreate the collection (which loses pinned membership + ACL grants). Fixed.
v0.7.31 Six surfaces shipped from the deferral list: conversational canvas, tenant-key re-wrap orchestration, OPA/Cedar policy engine (shadow), six external connector kinds, browser perspective correction, offline mobile capture buffer (D216-D221)
- Shipped
Conversational canvas — branchable agent runs with human-approve gates (/canvas)
Multi-step workflows render as a tree of nodes (tool-call / human-approve / branch / reduce) instead of a flat chat transcript. Owner / admin / contributor can spin up a canvas run, the agent advances tool-by-tool, and any node can be flagged human-approve so an admin must Approve / Reject before the run continues. Reject collapses the descendant subtree without rolling back upstream side-effects (existing reversibility primitives still apply at the tool level). Each node + status flip lands on the hash-chained audit log via 8 new event types (canvas-run.* started / advanced / paused / resumed / approved / rejected / completed / cancelled). Schema: canvas_runs + canvas_nodes with parent_node_id for tree shape; CHECK constraints on the 4 node kinds + 6 statuses. UI: /canvas list + /canvas/[id] detail with per-node Approve/Reject buttons and live status badges. /help/canvas walks the workflow.
- Shipped
Tenant-key re-wrap orchestration — Inngest function with concurrency.key=tenantId
When a BYO-key admin rotates their KMS key, Kodori needs to re-encrypt every per-document DEK against the new wrapping key. The orchestration ships today: an Inngest function with `concurrency.key=tenantId` (serial-per-tenant + parallel-cross-tenant), 5 lifecycle event types (tenant-kms.rewrap-requested / -progress / -completed / -orphaned / -acknowledged), refusal of acknowledge-orphaned on an active key. The actual DEK walk is naturally a no-op until the blob_dek schema lands; the orchestration shape is locked so the SDK integration drops in cleanly. /settings/security gets the Re-wrap button + status panel.
- Shipped
OPA/Cedar policy engine — shadow-mode rollout against TS gates (/policies)
Tenant admins can author Cedar DSL policies, simulate them against a sample of recent decisions, and activate / archive them. Status flow: draft → active → archived. Both hand-rolled TypeScript permission gates AND the Cedar evaluator run in shadow today; mismatches land on `policy.divergence-detected` audit events for review. TS remains authoritative until divergence telemetry shows zero mismatches over a soak window — only then does Cedar flip authoritative. 5 new event types (tenant.policy-created / -activated / -archived / -simulated / divergence-detected). UI: /policies admin list with sample-policy seed button + activate / archive row actions; /policies/[id] for the editor (Cedar runtime stub returns Allow as a permissive baseline; full @cedar-policy/cedar-wasm wiring documented at the drop-in point). /help/policies has the rollout playbook.
- Shipped
External connector foundation — six vendor kinds with encrypted token storage (/integrations)
Read-only connectors for Slack, Google Drive, Gmail, Microsoft SharePoint, OneDrive, and Outlook. external_connectors table stores OAuth tokens encrypted at rest via AES-256-GCM with a scrypt-derived key from AUTH_SECRET (drop-in extends to BYO-KMS once D217 SDK integration lands). external_connector_cursors tracks per-connector incremental-sync state. Lifecycle event types: external-connector.created / -refreshed / -paused / -resumed / -revoked / -synced. UI: /integrations admin grid with per-vendor Connect button + Pause / Resume / Revoke per connector. Per-vendor OAuth flows are stub-routed to the connector kind's authorize URL — the SDKs themselves drop in per the documented integration points. /help/integrations covers the lifecycle.
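Token encryption along these lines can be sketched with Node's built-in `crypto` module. The per-token salt, the 12-byte IV, and the dot-joined base64 envelope are assumptions for illustration; the entry only states AES-256-GCM with a scrypt-derived key from AUTH_SECRET.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Derive a 32-byte AES-256 key from the secret with scrypt.
function deriveKey(secret: string, salt: Buffer): Buffer {
  return scryptSync(secret, salt, 32);
}

// Assumed envelope: base64(salt).base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(plaintext: string, secret: string): string {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [salt, iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

function decryptToken(envelope: string, secret: string): string {
  const parts = envelope.split(".").map((part) => Buffer.from(part, "base64"));
  if (parts.length !== 4) throw new Error("malformed envelope");
  const [salt, iv, tag, ciphertext] = parts;
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret, salt), iv);
  decipher.setAuthTag(tag);
  // final() throws if the auth tag does not verify (wrong key or tampered data).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM's authentication tag means a token encrypted under one secret fails loudly under another, so a wrong key cannot produce garbage plaintext.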
- Shipped
Browser-side perspective correction for mobile capture — opencv.js with graceful fallback
When a user captures a document via /capture from a phone, the browser detects document corners and applies a 4-point perspective warp before upload — the result lands on the server already de-skewed. Lazy-load avoids the ~9MB WASM cost on every pageload (only loads when the user enters /capture). Falls back to passthrough on detection-failure or unsupported browser without blocking the upload. Library helpers (`detectDocumentCorners`, `applyPerspectiveWarp`, `correctPerspective`) ship with full opencv.js code documented inline at the drop-in points; the API shape is locked so swapping in the real OpenCV calls is a one-line edit per helper. See /help/mobile-capture-polish.
- Shipped
Offline mobile capture buffer — IndexedDB queue + Background Sync service worker
Phone-on-jobsite captures lose connectivity all the time. The /capture page now writes failed uploads to an IndexedDB queue and registers a Background Sync `capture-drain` tag — the browser fires the sync event when connectivity returns, the service worker reads the queue, and POSTs every row to /api/v1/documents with the original metadata (mime type, display name, sensitivity, collection, captured-at timestamp). 201 deletes the row; non-201 leaves it for the next sync attempt (Background Sync's exponential backoff handles cadence). Service worker registered from /capture; foreground drain helper (`drainQueueForeground`) covers iOS Safari which doesn't support Background Sync yet. See /help/mobile-capture-polish.
- Improved
24 new event types added to EventTypeSchema across 5 lifecycle batches
8 canvas-run.* + 5 tenant-kms.rewrap-* + 5 tenant.policy-* + 6 external-connector.* event types land on the hash-chained audit log alongside the existing 200+. /audit page chip catalog updated with a new "API keys + digests" group covering all 24. Every consequential mutation in the new surfaces appends an event — no out-of-band state changes.
- Improved
Marketing-site count reconciliation — "30 controls" → "36 controls" across all surfaces
Updated /features pillar count + /compliance/reports/[slug] cover page + /help/soc2-controls-mapping body to reflect the six new D216-D221 controls. Single source-of-truth pattern means the count badge on the public /security/controls page derives from the same constant. Closes the doc-rot vector where a feature ships in code but the marketing site still claims yesterday's footprint.
v0.7.30 New `kind: "uncollected"` source variant on every bulk-doc tool: agent now handles "pin every uncollected doc to RoyzTestDocs" with one tool call (Kodori Dev #152 follow-up)
- Shipped
New built-in `kind: "uncollected"` source on bulkAddDocumentsToCollection / bulkSetDocumentRetentionClass / bulkSetDocumentSensitivity / bulkAddDocumentsToLegalHold
Carries no other fields — the kind itself is the whole query. Resolves to "every live tenant doc with no row in collection_members" via Postgres NOT EXISTS. Pinned-only definition of "collected" — rule-derived collection membership is recomputed at read time and doesn't qualify a doc as collected (pinning is the explicit signal). Permission-trimmed downstream by the existing loadAllowedDocs path. Same SourceSchema shared across the three bulk-doc-operations tools so all three get the variant for free; bulkAddDocumentsToLegalHold has its own duplicate Source, extended for parity.
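An in-memory sketch of the pinned-only definition, standing in for the Postgres NOT EXISTS query. The row shapes and the `deletedAt` soft-delete marker are assumptions; rule-derived membership is deliberately ignored, as described above.

```typescript
interface DocRow {
  id: string;
  deletedAt: Date | null; // assumed soft-delete marker for "live" docs
}

// One row per pinned membership, mirroring collection_members.
interface CollectionMemberRow {
  collectionId: string;
  documentId: string;
}

// A doc is uncollected when it is live and has no pinned membership row.
function resolveUncollected(docs: DocRow[], members: CollectionMemberRow[]): DocRow[] {
  const pinned = new Set(members.map((m) => m.documentId));
  return docs.filter((d) => d.deletedAt === null && !pinned.has(d.id));
}
```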
- Improved
System prompt extended with the natural-language → tool-call mapping
Operator phrases like "uncollected docs", "docs not in any collection", "the inbox", "loose docs" all map to source: { kind: "uncollected" } directly. Agent is told to use the variant rather than guess a saved-search name (which is what triggered Roy's original failed call on #152).
- Improved
3 new regression fixtures in bulk-source-schema-fixtures.ts
Cover the new variant on bulkAddDocumentsToCollection, bulkSetDocumentRetentionClass (no extra fields needed), and bulkSetDocumentSensitivity (still requires reason). Total evals: 52 passing.
v0.7.29 Bulk-tool saved-search source: accept name OR UUID (closes Kodori Dev #152, customer-reported by Roy)
- Fixed
bulkAddDocumentsToCollection / bulkSetDocumentRetentionClass / bulkSetDocumentSensitivity / bulkAddDocumentsToLegalHold now accept saved-search by name
Two stacked bugs: (1) the four bulk tools declared `source.savedSearchId: z.string().uuid()` while sibling runSavedSearch advertised "id OR name" via `idOrName: z.string().min(1)` — agents that learned the looser shape from one tool description carried it into the bulk tools and got Zod-rejected on slug-style identifiers; (2) even when a valid UUID was passed, the bulk handler called `runSavedSearchTool.input.parse({ savedSearchId: ... })` against runSavedSearch's actual `idOrName` field — would have failed at parse time. Fix: rename the field to `savedSearchIdOrName: z.string().min(1)` across all four bulk tools, delegate resolution to runSavedSearchTool which already accepts both shapes; remove the now-redundant pre-check.
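The id-or-name resolution that the bulk tools now delegate can be pictured roughly as below. This is a minimal sketch, not the shipped handler: `SavedSearch`, `resolveSavedSearch`, and the in-memory list are hypothetical, and the "try UUID first, then exact name" order is an assumption.

```typescript
// Hypothetical record shape; the real saved-search row is not shown in this changelog.
interface SavedSearch {
  id: string; // UUID
  name: string;
}

const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Resolve a caller-supplied identifier that may be a UUID or a name.
// Assumption: an id match wins; otherwise fall back to an exact name match.
function resolveSavedSearch(
  idOrName: string,
  searches: SavedSearch[],
): SavedSearch | undefined {
  const key = idOrName.trim();
  if (key.length === 0) return undefined; // empty input resolves to nothing
  if (UUID_RE.test(key)) {
    const byId = searches.find((s) => s.id.toLowerCase() === key.toLowerCase());
    if (byId) return byId;
  }
  return searches.find((s) => s.name === key);
}
```

Keeping resolution in one place means a slug-style name and a UUID travel through the same field and the same lookup.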
- Improved
System prompt: "no inventing tenant-scoped identifiers" expanded beyond the original three
The original rule listed document IDs, collection IDs, user IDs. Now extended to saved-search IDs, retention class IDs, legal-hold IDs, API key IDs, webhook subscription IDs — with explicit guidance to call the listing tool first (listSavedSearches, listCollection, listRetentionClasses) when the user references a thing by name. Prompt-side guard backs up the schema-side fix.
- Improved
New regression test suite at packages/evals/src/bulk-source-schema-fixtures.ts
9 Zod-schema fixtures covering UUID accept, name accept, empty reject, legacy `savedSearchId` field-name reject (sentinel against re-introducing the inconsistency), and per-tool variants for retention + sensitivity. Total evals: 49 passing.
v0.7.28 Phase 1/2/3 closeouts: JIT onboarding verified, evals harness expanded, Azure DocAI live, sync-companion CLI, BYO-key lifecycle audit
- Shipped
Hash-chain integrity evals + web-side helper coverage (Phase 1)
`packages/evals` now ships per-MCP-tool harness + 4 fixture suites: predicate DSL (existing), DLP rules (existing), routing (existing), and audit hash-chain integrity (new — 9 fixtures covering every break kind: empty / single-genesis / multi-row valid / no-genesis / multiple-genesis / fork / orphan / tampered-payload). Refactored verifyAuditChain to expose `verifyChainRows(rows[])` + `chainHashOf(row)` as pure-function exports so synthetic chains can be built in memory without a test Postgres. New web-side coverage for D190 share-link domain allowlist (16 cases) + D196 audit diff extractor (12 cases). 39 evals + 45 web tests = 84 deterministic assertions.
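To illustrate why pure-function exports make in-memory fixtures possible, here is a minimal sketch of a chain verifier. Only the names `verifyChainRows` and `chainHashOf` come from this entry; the row shape, the SHA-256 recipe, the result type, and the `makeRow` fixture helper are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape; the real audit table columns are not shown here.
interface ChainRow {
  id: string;
  prevHash: string | null; // null marks the genesis row
  payload: string; // serialized event payload
  hash: string; // stored hash for this row
}

type BreakKind = "no-genesis" | "multiple-genesis" | "fork" | "orphan" | "tampered-payload";
type ChainResult = { ok: true } | { ok: false; kind: BreakKind; rowId?: string };

// Assumed recipe: SHA-256 over the previous hash, a separator, and the payload.
function chainHashOf(row: Pick<ChainRow, "prevHash" | "payload">): string {
  return createHash("sha256")
    .update(row.prevHash ?? "")
    .update("\n")
    .update(row.payload)
    .digest("hex");
}

function verifyChainRows(rows: ChainRow[]): ChainResult {
  if (rows.length === 0) return { ok: true }; // an empty chain is treated as valid here
  const genesis = rows.filter((r) => r.prevHash === null);
  if (genesis.length === 0) return { ok: false, kind: "no-genesis" };
  if (genesis.length > 1) return { ok: false, kind: "multiple-genesis" };

  const known = new Set(rows.map((r) => r.hash));
  const claimedParents = new Set<string>();
  for (const row of rows) {
    // Recompute the hash; a mismatch means the payload or link was altered.
    if (chainHashOf(row) !== row.hash) return { ok: false, kind: "tampered-payload", rowId: row.id };
    if (row.prevHash === null) continue;
    // A parent hash that no row carries is an orphan link.
    if (!known.has(row.prevHash)) return { ok: false, kind: "orphan", rowId: row.id };
    // Two rows pointing at the same parent is a fork.
    if (claimedParents.has(row.prevHash)) return { ok: false, kind: "fork", rowId: row.id };
    claimedParents.add(row.prevHash);
  }
  return { ok: true };
}

// Build a synthetic row in memory, the way a fixture might.
function makeRow(id: string, prev: ChainRow | null, payload: string): ChainRow {
  const prevHash = prev ? prev.hash : null;
  return { id, prevHash, payload, hash: chainHashOf({ prevHash, payload }) };
}
```

Because nothing here touches a database, a fixture can build a three-row chain with `makeRow`, mutate one payload, and assert on the reported break kind.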
- Shipped
Saved-search "new hits" section in the activity digest (§15.2)
D150 daily / weekly digest now includes a "Saved searches with new hits" section — for each user's saved search, runs a permission-trimmed FTS count constrained to the digest window with the saved search's sensitivity / mimeFamily filters applied; surfaces top 5 by count with deep links back to /search?savedSearch=<id>. Reuses the existing Resend send + reply-to + unsubscribe plumbing — no new email infra. The in-app "new since last viewed" badge on /search chips remains the read-time signal; the digest is the proactive push.
- Shipped
Azure Document Intelligence live wiring (§15.2)
Adapter at packages/workflow/src/extractors/azure-doc-intel.ts now does the real prebuilt-layout call: POST raw bytes to `/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30` with Ocp-Apim-Subscription-Key, poll the Operation-Location URL every 2s up to 270s, reduce paragraphs → text + page count + first-word locale → ExtractResult. Self-reports supports=false until both AZURE_DOC_INTEL_ENDPOINT and AZURE_DOC_INTEL_KEY are set, so the registry walks past it cleanly when unconfigured. Ships behind the existing Azure-primary-Google-fallback-Claude-PDF-last cascade.
- Shipped
Sync-companion CLI for watch-folder ingest (§15.2)
New apps/sync-companion package shipping `kodori-sync` Node CLI. Watches configured paths via chokidar with awaitWriteFinish so half-written files don't get uploaded; POSTs new + modified files to /api/v1/documents using a documents:write-scoped API key. Three commands: `init` (writes starter config to ~/.kodori/sync-companion.json), default (foreground watcher), `--once` (one-shot backfill). Forwards X-Kodori-Sensitivity / X-Kodori-Collection-Id from config and adds X-Kodori-Metadata: { syncCompanion: { sourcePath, hostname, pushedAt } } so operators trace which sidecar pushed a doc on /audit + /doc/[id]. 50 MB sync-upload cap matches the API; oversize files emit a clear skip message instead of wasting the round-trip. Electron tray wrapper deferred to customer demand.
- Shipped
BYO-key lifecycle audit events (Phase 3)
Three new event types — tenant-kms.registered / tenant-kms.rotated / tenant-kms.disabled — emitted on every transition from registerKmsKeyAction + disableKmsKeyAction. Rotation events carry from/to keyIdSuffix (last 12 chars only — full keyId stays in tenant_kms_keys to avoid leaking ARNs in audit payloads) + provider so admins can read "rotated from aws-kms:…abc to gcp-kms:…xyz" inline via the D196 diff badge. /audit chip catalog updated under "API keys + digests". Closes the SOC 2 / 21 CFR Part 11 gap where key rotation silently flipped status without an audit trail. Re-wrap pipeline (live SDK calls + walk every wrapped DEK to re-encrypt against the new KEK) plan documented at docs/plans/tenant-key-rewrap-plan.md.
v0.7.27 /share-links sortable columns: click a header to sort by Accesses / Last access / Expires / Created
- Shipped
New URL-state sort on /share-links
?sort=<key>&dir=asc|desc URL params; clicking the same column header flips direction, clicking a different column resets to desc. Sort labels render an arrow indicator (↑ / ↓) on the active column. Timestamp columns sort by ISO-8601 lexical comparison, which is chronologically correct as long as every value uses the same format and UTC offset. Combines with the existing q + status filters: sort happens AFTER filter, so "highest-accessed link in matter X" is one click.
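The header-click rule can be sketched as a small pure function. The sort key names, `nextSort`, `sortHref`, and `compareIso` below are hypothetical; only the `sort` / `dir` / `q` parameter names and the flip-or-reset-to-desc behavior come from this entry.

```typescript
// Assumed key names; the changelog only says the columns are
// Accesses / Last access / Expires / Created.
type SortKey = "accesses" | "lastAccess" | "expires" | "created";
type SortDir = "asc" | "desc";
interface SortState {
  key: SortKey;
  dir: SortDir;
}

// Clicking the active column flips direction; a different column resets to desc.
function nextSort(current: SortState | null, clicked: SortKey): SortState {
  if (current && current.key === clicked) {
    return { key: clicked, dir: current.dir === "asc" ? "desc" : "asc" };
  }
  return { key: clicked, dir: "desc" };
}

// Build the href a column header would link to, preserving the text filter.
function sortHref(state: SortState, q?: string): string {
  const params = new URLSearchParams();
  if (q) params.set("q", q);
  params.set("sort", state.key);
  params.set("dir", state.dir);
  return `/share-links?${params.toString()}`;
}

// ISO-8601 UTC strings in one fixed format order correctly as plain strings.
function compareIso(a: string, b: string, dir: SortDir): number {
  const cmp = a < b ? -1 : a > b ? 1 : 0;
  return dir === "asc" ? cmp : -cmp;
}
```

Computing the hrefs on the server keeps each header a plain link, which is what makes the sort bookmarkable and middle-clickable.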
- Shipped
Server-computed hrefs, client-side rendering preserved
Sort state lives in URL params (bookmarkable + middle-clickable for new tab); the existing ShareLinksClient component receives sortHrefs and sortLabels as props. No behavior changes to bulk select / revoke / verification roster — sort is purely additive.
v0.7.26 /productions search + matter filter: find every production for matter Smith without scrolling
- Shipped
New search form on /productions
Text input filters across name + recipient + matterRef + bates prefix + bates range (substring, case-insensitive). Matter dropdown auto-populated from distinct matterRefs in the result set — admins click "Smith — Matter ID" to scope quickly. URL-state captured (?q=...&matter=...) so filter combinations are bookmarkable.
- Shipped
"Showing N of M" counter when filters active
Same pattern as D194 /share-links search — counter shows "Showing 3 of 47" when filtered, "47 productions" otherwise. Empty-result state distinguishes "no productions yet" (helpful copy pointing at /bates-stamp) from "no productions match these filters" (try clearing).
v0.7.25 /audit "since my last visit" filter chip: admin returning from PTO sees only events appended while they were away
- Shipped
New "Since my last visit" chip on /audit
Visible only when the user has a previous /audit visit recorded (NULL = first-ever visit, chip hidden). Click toggles ?since=last-visit in URL state; chip label shows when the previous visit happened ("Since my last visit (2026-05-22 09:14)"). Plays well with the existing type / date / actor / stream filters — they all AND together.
- Shipped
Per-user, captured before page update
New nullable last_audit_visit_at column on users. Loaded BEFORE updating so the page renders against the PRIOR visit timestamp; updated best-effort after read so the next visit sees the just-completed visit's timestamp. Per-user — each admin's "since" is independent. The per-user-ness parallels saved /audit filters (D188).
- Shipped
Excluded from CSV / JSONL exports
Exports are typically downloaded for a downstream audience (auditors, SIEM) — the per-user "since" semantics don't translate. Admins exporting want explicit from / to date control. Keeping `since` UI-only avoids export-route complexity.
v0.7.24 Per-tenant default share-link access cap: completes the share-link tenant-defaults quartet (allowlist + expiry + notify + max-access)
- Shipped
New "Default share-link access cap" form on /settings/tenant
Owner / admin sets a workspace-default cap (1-1000). HIPAA shops typically want default 1 (every link expires after first download); ediscovery rolling productions typically want no cap. Empty = no cap by default (current behavior). Operators still override per-link.
- Shipped
Server-side fallback in createShareLinkAction
Reads the tenant default in the same SELECT that already loads D199 expiry + D201 notify defaults — single round-trip. New resolveMaxAccessCount helper centralizes the per-link-vs-tenant-fallback logic. Operator passing 0 / omitting the cap reads as "use tenant default" (or "no cap" when tenant has no default).
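The per-link-versus-tenant fallback reads roughly like this. The helper name `resolveMaxAccessCount` is from this entry, but the signature and the use of `null` for "no cap" are assumptions.

```typescript
// Returns the cap to store on the link, or null for "no cap".
// Assumed signature; the real helper's parameters are not shown in this changelog.
function resolveMaxAccessCount(
  perLink: number | null | undefined,
  tenantDefault: number | null,
): number | null {
  // An explicit positive per-link value always wins.
  if (perLink != null && perLink > 0) return perLink;
  // 0 or omitted means "use the tenant default", which may itself be unset.
  return tenantDefault ?? null;
}
```

For example, `resolveMaxAccessCount(0, 1)` falls back to the tenant default of 1, while `resolveMaxAccessCount(undefined, null)` yields no cap.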
- Shipped
Audit-logged via tenant.settings-updated
Saved fields land on tenant.settings-updated with from/to in the delta — D196 inline diff badge surfaces "defaultShareLinkMaxAccessCount: null → 1" as a chip on /audit.
v0.7.23 Regenerate AI summary on /doc/[id]: refresh stale summaries when context shifts or the original missed a key detail
- Shipped
New "Regenerate" link beside the AI-summary callout
On click, Kodori re-fires the auto-classify Inngest function, which recomputes the 3-sentence summary alongside sensitivity / collection / keywords / docType in the same Haiku call. The refresh is async: the link shows "Refreshing…" and the page reloads after 8s to pick up the new summary text. Open to anyone with read access (the gate is LLM cost, not data sensitivity).
- Shipped
Audit-logged via document.auto-classify-requested
New event type captures who asked for the regenerate + a `reason: manual-regenerate` payload field, distinguishing manual user-driven refreshes from system-driven re-classifies (post-version-commit, post-extraction). Lands on the doc's audit stream alongside every other classification activity.
- Shipped
Auto-classify's global concurrency cap acts as the rate limiter
Auto-classify is concurrency-capped at 10 globally — a button-click flood queues rather than overrunning the LLM provider. No new rate-limiting layer needed; the existing Inngest concurrency knob does the job.
v0.7.22 /anomalies CSV export: quarterly compliance "every anomaly + decision + reason" instead of hand-building from /audit filters
- Shipped
New /api/anomalies/export route
Owner / admin only (matches the page-level gate). 50,000-row cap with comment-row truncation marker matching the audit CSV pattern. Optional ?from=YYYY-MM-DD&to=YYYY-MM-DD&status=open|acknowledged|dismissed|auto-paused query filters. Date filter scopes to last_seen_at so a dismissed-last-quarter signal still surfaces.
- Shipped
15-column shape including decision_note and full evidence JSON
Columns: id, kind, severity, status, actor_id/kind/email, occurrence_count, first_seen_at, last_seen_at, decided_by_id/email, decided_at, decision_note, evidence_json. The decision_note column captures D202 dismiss-with-reason text + the optional acknowledge note — compliance gets "47 dismissed last quarter, here's exactly why each one." Evidence kept as JSON since the shape varies per anomaly kind.
- Shipped
New "Export CSV ↓" link on /anomalies
Inline alongside the existing /audit + /agent-activity links at the bottom of the page. Tooltip explains the available filter query params for scripted export.
v0.7.21 Dismiss-with-reason on /anomalies: required note + four quick-pick reason chips so "why dismissed?" is defensible in audit
- Shipped
New dismiss expand-form mirrors the acknowledge flow
Click "Dismiss as false positive" on any anomaly row and an inline form expands with four quick-pick reason chips ("False positive — expected business activity", "Investigated, legitimate (action documented elsewhere)", "Duplicate of an earlier anomaly already triaged", "System / agent change planned and announced") above a free-text input. Confirm-dismiss button is disabled until a reason exists.
- Shipped
Reason required (NOT optional like acknowledge)
Acknowledge's note is "what action did you take" — auditable elsewhere. Dismiss's note IS the entire audit trail of "why we decided this was nothing", so it's required. Closes the compliance gap where "47 anomalies dismissed last quarter" carried no defensible rationale.
- Shipped
Chip text becomes the note verbatim, editable before confirm
Picking a chip pre-fills the note field; operators add specifics ("Q4 audit prep — high regulated reads expected") on top of the prefill before confirming. Note still lands as free-text in the existing anomaly.dismissed event payload — no schema change required.
v0.7.20 Per-tenant default share-link "Email me when accessed": completes the share-link tenant-defaults trilogy (allowlist + expiry + notify)
- Shipped
New tri-state radio group on /settings/tenant
"Use global (on)" / "On" / "Off". HIPAA shops want default on (every access notifies); ediscovery platforms running high-volume delivery want default off (admins' inboxes don't fill up). NULL in the DB = use the global default (true). Operators still pick per-link; this only changes the form's pre-fill and the server-side fallback.
- Shipped
Server-side fallback in createShareLinkAction
Loads the tenant default in the same SELECT that already reads the D199 expiry default — no extra round-trip. Falls back to global true when the tenant didn't set one. Even API-driven share-link creation inherits the workspace default consistently.
- Shipped
Audit-logged via tenant.settings-updated
Saving the workspace default lands on tenant.settings-updated with from/to in the delta. The D196 inline diff badge surfaces "defaultShareLinkNotifyOnAccess: null → false" as a chip on /audit so admins can review who toggled the workspace-default and when.
v0.7.19 /retention disposal countdown: admins see "Next eligible disposal: 2027-04-15 (in 11 months)" per class without spreadsheet math
- Shipped
New "Next eligible disposal" line per retention class
Computed inline as min(documents.createdAt) + retainForYears years over LIVE (non-tombstoned) docs in the class. Tone-coded: red within 30 days, amber within 180 days, gray further out. Renders both an ISO date (for compliance write-ups) and a relative time ("in 11mo", "in 2y") for at-a-glance daily triage. Tail note distinguishes review-disposition ("lands on /retention/review for human confirmation") from auto-tombstone-disposition ("auto-tombstones via the daily cron").
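The date arithmetic and tone thresholds can be sketched as follows. Function names are hypothetical, the production version runs as a SQL subselect rather than in application code, and treating an already-overdue date as red is an assumption.

```typescript
type Tone = "red" | "amber" | "gray";

const DAY_MS = 24 * 60 * 60 * 1000;

// Earliest createdAt among live docs plus the retention period, computed in UTC.
function nextEligibleDisposal(liveCreatedAt: Date[], retainForYears: number): Date | null {
  if (liveCreatedAt.length === 0) return null; // empty class: nothing to dispose
  const earliest = liveCreatedAt.reduce(
    (min, d) => Math.min(min, d.getTime()),
    Number.POSITIVE_INFINITY,
  );
  const due = new Date(earliest);
  due.setUTCFullYear(due.getUTCFullYear() + retainForYears);
  return due;
}

// Red within 30 days (including overdue), amber within 180 days, gray further out.
function disposalTone(due: Date, now: Date): Tone {
  const days = (due.getTime() - now.getTime()) / DAY_MS;
  if (days <= 30) return "red";
  if (days <= 180) return "amber";
  return "gray";
}
```

A class whose oldest live doc was created on 2020-04-15 with a 7-year retention period comes out at 2027-04-15.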
- Shipped
Suppressed for empty classes
A retention class with zero live docs renders no countdown — there's nothing to dispose. Same posture as the /webhooks "no deliveries → neutral gray" pattern (D192).
- Shipped
Inline subselect on the existing /retention query, no denormalization
One extra `min(createdAt) FILTER (WHERE status = 'live')` subselect per class on the existing /retention page-load query. This keeps the data fresh (no cron lag) and avoids the invalidation surface a denormalized column would require.
v0.7.18 Per-tenant default share-link expiry days: admins set the workspace default once instead of accepting the global 14-day fallback
- Shipped
New "Default share-link expiry" form on /settings/tenant
Owner / admin sets a workspace-default expiry (1-90 days). Different firm postures use different defaults — 7 days for HIPAA delivery, 30 for ediscovery rolling productions, 14 for general matter packages. Empty = use the global default (14). Per-link entry still wins.
- Shipped
Server-side fallback in createShareLinkAction
When the operator doesn't supply an explicit expiry, the action reads the tenant default; falls further back to the global 14-day default when the tenant didn't set one. Server-side means even API-driven share-link creation inherits the workspace default consistently — no client-side pre-fill plumbing required.
- Shipped
Audit-logged via tenant.settings-updated
Saving the workspace default lands on the audit chain via the existing tenant.settings-updated event with from/to in the delta. The D196 inline diff badge surfaces "defaultShareLinkExpiryDays: 14 → 7" as a chip on /audit.
v0.7.17 Custodian acknowledgment progress bar on /legal-holds/[id]: "12 of 18 acknowledged (67%)" at-a-glance instead of counting rows
- Shipped
New stacked tri-segment progress bar
Emerald = acknowledged, amber = notified-pending, gray = unsent. Header shows "X of Y acknowledged" with a color-coded percentage badge (emerald 100%, amber ≥50%, neutral otherwise). Per-segment hover tooltip surfaces the count. Lives above the existing Nudge / Re-send buttons so admins read status → action in one downward scan.
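The header text and badge color follow from three counts. The sketch below uses hypothetical names (`CustodianCounts`, `ackSummary`) and assumes the percentage is rounded to the nearest whole number, which matches the "12 of 18 acknowledged (67%)" example.

```typescript
interface CustodianCounts {
  acknowledged: number;
  notifiedPending: number;
  unsent: number;
}

interface AckSummary {
  label: string;
  percent: number;
  badge: "emerald" | "amber" | "neutral";
}

// Returns null when there are no custodians, in which case the bar is hidden.
function ackSummary(c: CustodianCounts): AckSummary | null {
  const total = c.acknowledged + c.notifiedPending + c.unsent;
  if (total === 0) return null;
  const percent = Math.round((c.acknowledged / total) * 100);
  // Compare counts rather than the rounded percent so 199 of 200 is not shown as complete.
  const badge =
    c.acknowledged === total ? "emerald" : percent >= 50 ? "amber" : "neutral";
  return {
    label: `${c.acknowledged} of ${total} acknowledged (${percent}%)`,
    percent,
    badge,
  };
}
```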
- Shipped
Hidden when no custodians exist
A fresh hold with no custodians yet doesn't render the bar — the empty-state copy below already prompts the operator to add custodians. Avoids the "0 of 0 acknowledged" clutter.
v0.7.16 Tenant-wide default share-link recipient domain allowlist: admins set @firm.com once on /settings/tenant instead of re-typing on every share-link
- Shipped
New "Default share-link recipient domain allowlist" form on /settings/tenant
Owner / admin sets a comma- or newline-separated list of allowed recipient domains. New share-links with recipient verification on inherit the default when the operator leaves the per-link "Restrict to email domains" field blank. Per-link entry still wins — operators override or explicitly clear per link by typing into the field.
- Shipped
Server-side fallback at create time, NOT client-side pre-fill
createShareLinkAction reads the tenant default when the input is empty AND verification is required. Server-side fallback means even API-driven share-link creation (future scriptable usage) inherits the default without the caller knowing about it — consistent across UI + API.
- Shipped
Audit-logged via tenant.settings-updated
Saving the workspace default lands on the audit chain via the existing tenant.settings-updated event with from/to in the delta, so admins can review who changed the default and when via /audit (and the new D196 inline diff badge surfaces it as a chip).
v0.7.15 Inline diff badges on /audit: read mutation changes ("confidential → restricted") at-a-glance instead of expanding the row JSON
- Shipped
New per-row diff chip on mutation events
tenant.settings-updated, api-key.scopes-changed, api-key.expiration-set, api-key.rate-limit-set, webhook.retry-policy-set, document.sensitivity-changed, and document.retention-class-changed all surface their from/to as compact `<from> → <to>` chips in the row summary. Multi-field events (tenant.settings-updated) render up to 3 chips with the field prefixed (`scopes: x → y`); single-value diffs are unprefixed.
- Shipped
30-char truncation + hover tooltip
Long values truncate inline with a hover tooltip showing the full from → to. Keeps the row layout stable when multiple chips render side-by-side. Title attr surfaces in the accessibility tree so screen-reader users get the same signal.
- Shipped
Non-mutation events stay unchanged
Most event types (document.created, share-link.accessed, etc.) record a single value rather than a from/to pair, so those rows render exactly as before with no badge. The extractor returns null when no recognized diff shape exists, so visual clutter is bounded to the events that actually changed something.
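A diff extractor along these lines would produce the chips described in this release. The two payload shapes (`{ from, to }` and `{ delta: { field: { from, to } } }`), the function names, and truncating each value (rather than the whole chip) to 30 characters are all assumptions; the real payloads are not shown in this changelog.

```typescript
const MAX_CHIPS = 3;
const MAX_VALUE_CHARS = 30;

// Render a value as text, shortening anything longer than 30 characters.
function truncateValue(value: unknown): string {
  const text = value === null || value === undefined ? "null" : String(value);
  return text.length > MAX_VALUE_CHARS
    ? text.slice(0, MAX_VALUE_CHARS - 1) + "…"
    : text;
}

function isFromTo(v: unknown): v is { from: unknown; to: unknown } {
  return v !== null && typeof v === "object" && "from" in (v as object) && "to" in (v as object);
}

// Returns chip strings, or null when the payload has no recognized from/to shape.
function extractDiffChips(payload: Record<string, unknown>): string[] | null {
  // Single-value diff: unprefixed chip.
  if (isFromTo(payload)) {
    return [`${truncateValue(payload.from)} → ${truncateValue(payload.to)}`];
  }
  // Multi-field diff: up to three chips, each prefixed with its field name.
  const delta = payload.delta;
  if (delta !== null && typeof delta === "object") {
    const chips = Object.entries(delta as Record<string, unknown>)
      .filter(([, v]) => isFromTo(v))
      .slice(0, MAX_CHIPS)
      .map(([field, v]) => {
        const d = v as { from: unknown; to: unknown };
        return `${field}: ${truncateValue(d.from)} → ${truncateValue(d.to)}`;
      });
    return chips.length > 0 ? chips : null;
  }
  return null;
}
```

Returning null for unrecognized shapes is what keeps rows such as document.created rendering exactly as before.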
v0.7.14 Bulk-redeliver failed webhook deliveries: after a receiver outage, replay every failure in one click instead of 50
- Shipped
New "Redeliver all failed" form on /webhooks/[id]/deliveries
Visible only when at least one delivery has failed (no clutter when the sub is healthy). Window dropdown — last 24h / 7d / 14d / 30d. Click; Kodori re-fires every DISTINCT failed event in the window through the existing Inngest deliver function. selectDistinct on eventId means multiple failed retries of the same event collapse into one redelivery.
- Shipped
Hard cap of 500 unique events per call
Bounds the Inngest enqueue burst + the SQL distinct query result. Operators with > 500 failed in a window almost certainly have a misconfigured receiver, not a transient outage — narrow the window or fix-and-redeliver. The cap doesn't error; the action completes with the 500 it got and the operator re-runs for the rest.
- Shipped
Refuses on paused / revoked subs
Same posture as single-row redeliver and webhook-test-fire (D189) — explicit Resume action required. Operator intent isn't ambiguous; auto-resume would violate the explicit-action-required UX. Admin / owner only.
v0.7.13 /share-links search + status filter: find every link to @firm.com or every revoked link from last quarter without scrolling
- Shipped
New search form on /share-links
Text input filters across label + recipient hint + token prefix + target name (substring, case-insensitive). Status select filters to active / expired / revoked / exhausted. Both fields capture as URL query params (?q=...&status=...) so filter combinations are bookmarkable + shareable. Browser back-button returns to the previous filter. Works without JS — GET form action, not JS-driven onChange, so screen-reader and slow-client users get the same UX.
- Shipped
"Showing N of M" counter when filters are active
Counter shows "Showing 3 of 47" when filtered, "47 links" when not. The counter context is the load-bearing diagnostic — "I expected to see all 47 links but I see 3" is the operator's signal something's off. Clear-filters link appears only when at least one filter is active.
- Shipped
In-memory filter over the existing 200-row list cap
No new SQL query, no new action parameters — the filter runs in JS over the rows the existing listShareLinksAction already returns. Sub-millisecond at the 200-row cap, low blast radius for a small UX win. If a customer needs > 200 links surfaced (paginated list), the action cap is the right place to bump.
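The filter and counter amount to a few lines over rows already in memory. The row shape and function names below are hypothetical; the searched fields, status values, and counter wording come from this release.

```typescript
type LinkStatus = "active" | "expired" | "revoked" | "exhausted";

// Hypothetical row shape covering only the searched fields.
interface ShareLinkRow {
  label: string;
  recipientHint: string;
  tokenPrefix: string;
  targetName: string;
  status: LinkStatus;
}

// Case-insensitive substring match across the searched fields, ANDed with status.
function filterShareLinks(
  rows: ShareLinkRow[],
  q?: string,
  status?: LinkStatus,
): ShareLinkRow[] {
  const needle = (q ?? "").trim().toLowerCase();
  return rows.filter((row) => {
    if (status && row.status !== status) return false;
    if (needle.length === 0) return true;
    return [row.label, row.recipientHint, row.tokenPrefix, row.targetName].some(
      (field) => field.toLowerCase().includes(needle),
    );
  });
}

// "Showing 3 of 47" when a filter is active, "47 links" otherwise.
function counterLabel(shown: number, total: number, filtered: boolean): string {
  return filtered ? `Showing ${shown} of ${total}` : `${total} links`;
}
```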
v0.7.12 Per-tenant share-link watermark customization: firms put their identity in the watermark with custom diagonal stamp + header text
- Shipped
New "Share-link watermark text" form section on /settings/tenant
Owner / admin sets two optional fields: a custom diagonal stamp (auto-uppercased on render — "ATTORNEYS' EYES ONLY", "PRIVILEGED & CONFIDENTIAL", "FOIA EXEMPT") and a custom header bar (firm name, matter prefix). Empty falls back to "CONFIDENTIAL" + workspace name. Each capped at 80 chars at the DB level.
- Shipped
Migration 0074 + DB CHECK constraints on the 80-char cap
Two nullable text columns on `tenants`: share_watermark_diagonal_text + share_watermark_header_text. CHECK constraints enforce length BETWEEN 1 AND 80 OR NULL — the cap is at the schema layer, not just app validation, so direct SQL writes (admin tools, future agent-driven config) can't drop a 5000-char string in.
- Shipped
Idempotent — only changed fields land on tenant.settings-updated
Saving with no changes is a no-op. The audit-event payload includes from / to for both fields when changed so admins can review "who changed the watermark text and when" via /audit.
- Shipped
Production share-links continue receiving NO watermark
D157's production-bytes-must-match-the-privilege-log integrity contract still holds. Custom watermark text doesn't change this — applying any watermark to a produced PDF still invalidates chain-of-custody. Operators wanting custom watermark on productions should re-export the production with a different cover sheet, not customize the runtime watermark.
v0.7.11 Webhook delivery health inline on /webhooks list: admins triage "is this receiver still working?" at-a-glance without drilling into the deliveries log
- Shipped
New per-subscription 7-day health badge: "7d: 94% · 23 deliveries · last fail 3h ago"
Every active subscription on /webhooks now renders the badge inline alongside last-delivery + Deliveries → link. Color-coded by tone: green when success ≥ 95% with no recent failures, amber when 80-95% OR a failure in the last 4h on an otherwise-healthy sub, red when < 80%. Hover shows the raw counts (e.g. "Last 7 days: 21 succeeded / 23 total (2 failed)").
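The tone rules translate into a short decision function. This is a sketch with a hypothetical name and signature; it assumes exactly 95% counts as green and exactly 80% as amber, which the changelog's ranges leave open, and it folds in the neutral gray used when a subscription has no deliveries.

```typescript
type HealthTone = "green" | "amber" | "red" | "gray";

const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

function healthTone(
  total: number,
  failed: number,
  lastFailedAt: Date | null,
  now: Date,
): HealthTone {
  if (total === 0) return "gray"; // no deliveries in the window: neutral
  const successRate = (total - failed) / total;
  if (successRate < 0.8) return "red";
  const recentFailure =
    lastFailedAt !== null &&
    now.getTime() - lastFailedAt.getTime() <= FOUR_HOURS_MS;
  // Amber covers the 80-95% band and otherwise-healthy subs with a recent failure.
  if (successRate < 0.95 || recentFailure) return "amber";
  return "green";
}
```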
- Shipped
"No deliveries" stays neutral gray
Silence isn't failure — a subscription with no matching events in 7 days renders "7d: no deliveries" in subtle gray. Green would falsely imply "this is working" when nothing tested it; red would falsely imply "this is broken" when the sub is just unused.
- Shipped
Single grouped SQL query, only when the tenant has active subs
One round-trip per /webhooks page-load aggregates (subscriptionId, total, failed, lastFailedAt) with `count(*) FILTER (WHERE status = 'failed')`. Tenants with zero active subs skip the query entirely. Low-traffic admin page so running on every visit beats the operational cost of a cron-rolled denormalized counter on webhook_subscriptions.
v0.7.10 API key lifetime usage stats: admins see "12,847 calls · last call 2m ago" on /api-keys without grepping audit
- Shipped
New "X calls" column on every /api-keys row
Lifetime accepted-auth count, bumped best-effort in the same code path that touches lastUsedAt on every successful API request. Atomic via a SQL `total_requests + 1` increment, so concurrent calls don't overwrite each other's updates. Tells admins at a glance which keys are load-bearing ("12,847 calls") vs. safe-to-revoke ("no calls yet").
- Shipped
New relative-time "last call" — "2m ago", "3h ago", "5d ago"
Replaces the previous "last used 2026-05-12" date-only display with a compact relative timestamp. Past 30 days falls back to YYYY-MM-DD because relative-time becomes useless ("47d ago" is just noise). UTC-stable: uses absolute deltas not localized weekdays, so SSR + CSR render identically.
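A formatter with these rules might look like the following. The function name is hypothetical, and the "just now" label for deltas under a minute is an assumption, since the changelog only lists the m / h / d forms and the date fallback.

```typescript
// Compact relative timestamp from absolute deltas, so server and client agree.
function relativeLastCall(last: Date, now: Date): string {
  const deltaMs = Math.max(0, now.getTime() - last.getTime());
  const minutes = Math.floor(deltaMs / 60000);
  if (minutes < 1) return "just now"; // assumed label for sub-minute deltas
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  const days = Math.floor(hours / 24);
  if (days <= 30) return `${days}d ago`;
  // Past 30 days, fall back to the UTC calendar date.
  return last.toISOString().slice(0, 10);
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, is what lets the same input render identically in SSR and CSR.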
- Shipped
Best-effort, never blocks auth
The counter bump runs in the same `void db.update().catch(() => undefined)` shape as lastUsedAt — auth correctness does NOT depend on stats writes succeeding. Migration 0073 adds total_requests bigint default 0 on api_keys; bigint is the right type for unbounded growth (2^63 calls is forever).
v0.7.9 Recipient email domain allowlist on share-links: verification gates can be pinned to @firm.com so opposing counsel can't verify from a personal gmail
- Shipped
New "Restrict to email domains" textarea on the share-link form
Visible only when "Require recipient verification" is on. Comma- or newline-separated. Subdomains accepted (firm.com matches mail.firm.com). Up to 25 entries per link. Empty = any domain accepted (current behavior). Useful for HIPAA-scoped deliveries (@providers.healthnet), restricted productions (@jonesandsmith.com only), or pinning to one tenant's domain.
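Parsing the textarea and matching a recipient against it can be sketched as below. Both function names are hypothetical, and stripping a leading "@", lowercasing, and de-duplicating entries are assumptions; the separators, the 25-entry limit, subdomain matching, and the empty-list behavior come from this entry.

```typescript
const MAX_DOMAINS = 25;

// Split a comma- or newline-separated list into normalized domains.
function parseAllowlist(raw: string): string[] {
  const domains = raw
    .split(/[,\n]/)
    .map((d) => d.trim().toLowerCase().replace(/^@/, ""))
    .filter((d) => d.length > 0);
  const unique = Array.from(new Set(domains));
  if (unique.length > MAX_DOMAINS) {
    throw new Error(`At most ${MAX_DOMAINS} domains per link`);
  }
  return unique;
}

// Exact domain or any subdomain matches: firm.com accepts mail.firm.com.
function isEmailAllowed(email: string, allowlist: string[]): boolean {
  if (allowlist.length === 0) return true; // empty = any domain accepted
  const at = email.lastIndexOf("@");
  if (at < 0) return false;
  const domain = email.slice(at + 1).toLowerCase();
  return allowlist.some(
    (allowed) => domain === allowed || domain.endsWith("." + allowed),
  );
}
```

Matching on a leading dot matters: a plain suffix check would let notfirm.com pass for firm.com.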
- Shipped
Verification refusal returns a generic error to prevent allowlist probing
When a recipient enters an email whose domain is not on the allowlist, the verification action refuses BEFORE generating + emailing a code. The recipient sees a generic "not authorized to verify on this link" message — leaking the precise reason would let attackers probe for valid domains by trial and error. The audit chain still captures the gate-trip (share-link.verification-failed event with `reason: domain-not-allowed` + the failed domain) so admins can see the refusal in the verification roster.
- Shipped
New "Domain-restricted" callout on /share-links/[id]/verifications
When a link is domain-restricted, the verification roster page renders an emerald callout listing the allowed domains. Defends "this production was pinned to @jonesandsmith.com — only opposing counsel could have verified" without operators digging into the share-link config.
v0.7.8 Webhook test-fire: admins click "Send test" on /webhooks and Kodori dispatches a synthetic webhook.test event at the endpoint to verify signing + parsing without a real mutation
- Shipped
New "Send test" link on every active subscription on /webhooks
Owner / admin only. Click it; Kodori posts a real signed payload to the endpoint with X-Kodori-Event: webhook.test, X-Kodori-Signature, X-Kodori-Timestamp — exact same code path as a real delivery so a passing test = real deliveries work. Refuses on paused / revoked subscriptions ("Resume it first") to prevent accidental re-arming via test action.
- Shipped
Real audit-chain entry — "<admin> tested webhook X at 14:32"
The test appends a webhook.test event scoped to a per-subscription stream `webhook/<id>/test` so the audit log captures who tested which subscription and when. Initiator email lands in the payload for receiver-side correlation. webhook.test added to the /audit chip catalog so admins can filter to test-only delivery attempts.
- Shipped
Direct-dispatch to the targeted subscription only
The test bypasses fanout so other subscriptions whose eventTypes filter accepts webhook.test don't receive the test — clean isolation, a test fired at sub A doesn't leak to sub B. Mechanism: appends the event via the plain event-store (NOT the fanout-wrapped one), then directly enqueues webhook/deliver.requested for the one subscription.
- Shipped
Receiver-friendly payload — isTest flag + initiator email + URL
Payload body includes `isTest: true` (second signal beyond the X-Kodori-Event header), `initiatedBy` (admin email so receivers can debug "who triggered this"), the destination URL (helps with multiplex behind a load balancer), and an explanatory `message` field that documents the test inside the body. Receivers verifying signature parse the body normally and immediately recognize a test from any of three signals.
v0.7.7 Saved /audit filters: admins running compliance reviews and oncall triage save the filter combinations they re-build daily as one-click presets
- Shipped
New "+ Save current filter" button on /audit
When any filter is active (types, from/to, actor, stream), a "+ Save current filter" button appears in the new Saved row at the top of the filter card. Click it, type a name, hit Enter — the preset shows up as a chip alongside any existing saved filters. Click the chip to load the filter in one click.
- Shipped
Per-user, per-tenant scope
Saved filters are scoped to the user that created them — the compliance officer's "monthly SOC 2 review" filter and the SRE's "today's anomalies" filter live in separate spaces, no cross-contamination. The chip row is empty when you haven't saved anything yet (no clutter for first-time users).
- Shipped
50-preset cap per user with refuse-empty validation
Trying to save with no filters applied returns an "apply at least one filter first" error in-place. The 50-preset-per-user cap is generous but bounded: when you hit it, the error message points at unused presets to delete. Each chip has a × button for one-click deletion.
- Shipped
Saved-filter chips are real <Link>s
Middle-click opens the saved filter in a new tab. Browser back/forward works correctly. No JS needed on the load path — only save / delete actions are JS-driven. Hover over a chip to see the saved filter description (e.g. "3 types · 2026-04-01 → 2026-04-30 · actor ~ counsel@firm").
v0.7.6 Share-link recipient-verification roster: admins see who verified each link in one click, with a per-link `<verified>/<attempted>` count badge on /share-links
- Shipped
New /share-links/[id]/verifications roster page
Owner / admin only. Lists every (email, requestedAt, verifiedAt, attempts) row for one share-link with a verified / pending / expired status badge per row. Tally header: "X verified · Y pending or failed · Z total attempts." Defends "X read this production at 14:32 after verifying their email" without manual /audit drilling.
- Shipped
New "Verified" column on /share-links
For links with `requires_recipient_verification = true`, the column renders `<verified>` (when nothing failed) or `<verified> / <attempted>` (when verified < attempted). The count itself links to the roster — no extra action button competing with the existing revoke + bulk-select affordances. Empty cell ("—") when verification is not required for the link.
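The cell rules above fit in a small pure function. A minimal sketch under assumed names (`renderVerifiedCell`, `VerificationCounts`); only the three output shapes come from the release note.

```typescript
interface VerificationCounts {
  verified: number;
  attempted: number;
}

// Returns the text shown in the "Verified" column for one share-link row.
function renderVerifiedCell(requiresVerification: boolean, counts?: VerificationCounts): string {
  // "\u2014" is the dash the release note shows for links without verification.
  if (!requiresVerification) return "\u2014";
  const c = counts ?? { verified: 0, attempted: 0 };
  // Show the attempted count only when some attempts did not end in verification.
  return c.verified < c.attempted ? c.verified + " / " + c.attempted : String(c.verified);
}
```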
- Shipped
Single grouped SQL read, conditional on verification-required links
When at least one row in the /share-links list has verification required, one extra round-trip groups the verifications by (shareLinkId, lower(email)) with `bool_or(verified_at is not null)` to produce per-link verified + attempted counts. Tenants without verification-required links pay zero query cost.
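For readers who think in application code rather than SQL, the grouping can be mirrored in memory. This sketch assumes that "attempted" means distinct recipient emails per link, which the release note implies but does not state; the row shape and function name are hypothetical.

```typescript
interface VerificationRow {
  shareLinkId: string;
  email: string;
  verifiedAt: Date | null;
}

interface LinkCounts {
  verified: number;
  attempted: number;
}

// In-memory equivalent of GROUP BY (shareLinkId, lower(email)) with
// bool_or(verified_at IS NOT NULL), rolled up to one count pair per link.
function countVerifications(rows: VerificationRow[]): Record<string, LinkCounts> {
  const byLink: Record<string, Record<string, boolean>> = {};
  for (const row of rows) {
    const recipients = byLink[row.shareLinkId] || (byLink[row.shareLinkId] = {});
    const email = row.email.toLowerCase();
    // bool_or: once any row for this recipient is verified, the flag stays true.
    recipients[email] = recipients[email] === true || row.verifiedAt !== null;
  }
  const counts: Record<string, LinkCounts> = {};
  for (const linkId of Object.keys(byLink)) {
    const recipients = byLink[linkId];
    const flags = Object.keys(recipients).map((email) => recipients[email]);
    counts[linkId] = { verified: flags.filter((v) => v).length, attempted: flags.length };
  }
  return counts;
}
```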
v0.7.5Share-link direct email delivery — Kodori sends the URL to the recipient when the recipient hint is an email, replacing the copy-paste-into-Slack workflow- Shipped
New "Email this URL directly to the recipient hint" toggle on the Share via link form
Available when the recipient hint is a valid email. Operator picks the toggle, clicks Create, and Kodori sends a styled email with the URL + workspace name + operator name + expiry date directly to the recipient. Reply-to set to the operator's email so opposing counsel can reply natively. Reduces the transcription-error surface on high-volume ediscovery delivery.
- Shipped
Best-effort delivery — failure to email doesn't un-create the share-link
If Resend fails or the email address is malformed, the share-link is still created and the operator gets the URL back to copy manually. Same posture as every other email-side action — the load-bearing surface (the share-link itself) doesn't depend on email succeeding.
v0.7.4Audit log JSONL export — sister format to CSV (D162) for SIEM ingestion. Splunk HEC, Datadog Logs, Sumo Logic ingest natively.- Shipped
New /api/audit/export.jsonl route
Same auth, same filters, same 50,000-row cap as /api/audit/export. JSONL = one JSON object per line; the payload jsonb is preserved as a real nested object instead of being CSV-stringified, so SIEM ingestion scripts don't need a separate JSON.parse step on the payload field. When the result set is capped, the first line is a synthetic `{ "_truncated": true, "_maxRows": 50000, "_note": "..." }` object.
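A minimal sketch of the serialization described above. The `AuditEvent` shape, the function name, and the `_note` wording are assumptions; the release note only fixes the one-object-per-line format, the nested payload, and the synthetic first line on truncation.

```typescript
// Hypothetical event shape; real audit rows carry more fields.
interface AuditEvent {
  id: string;
  type: string;
  payload: Record<string, unknown>;
}

const MAX_ROWS = 50000; // cap stated in the release note

function toJsonl(events: AuditEvent[], maxRows: number = MAX_ROWS): string {
  const lines: string[] = [];
  if (events.length > maxRows) {
    // Synthetic first line, emitted only when the result set was capped.
    lines.push(JSON.stringify({ _truncated: true, _maxRows: maxRows, _note: "result set capped" }));
  }
  for (const event of events.slice(0, maxRows)) {
    // The payload stays a nested object, not a stringified blob.
    lines.push(JSON.stringify(event));
  }
  return lines.map((line) => line + "\n").join("");
}
```

A consumer can then read the file line by line and treat any object with `_truncated` set as metadata rather than an event.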
- Shipped
New "Export JSONL ↓" link on /audit next to the existing CSV export
Both formats use the same filter URL state (types, from, to, actor, stream) — operators tighten filters once, then export to whichever format their downstream tooling consumes. Tooltip on the JSONL link explains the format and the SIEM use case.
- Shipped
Content-Type: application/x-ndjson with X-Kodori-Truncated response header
application/x-ndjson is the widely used (de facto) MIME type for newline-delimited JSON. The X-Kodori-Truncated: 1|0 response header lets script-driven exports detect the cap-hit case from the headers without parsing the body, which is useful for scheduled cron-style exports that need to alert when truncation occurs.
v0.7.3Workspace governance queue in the activity digest — admins see pending two-person deletes, open access requests, and recent anomalies in their digest email so the queue doesn't go unnoticed- Shipped
New "Workspace governance queue" section on the digest (admin-only)
Surfaces three counts when the daily / weekly digest fires: (1) two-person delete requests awaiting approval (D164), (2) access requests pending review (D166), (3) open anomalies from the past 24h (daily) or 7 days (weekly). Each row deep-links to the matching admin page. Hidden when zero — admins on workspaces with no governance queue activity don't see an empty section.
- Shipped
Closes the "I missed it on /audit" gap for admins not actively browsing
Pending two-person deletes were already badge-counted on the sidebar (D164) but only when the admin loaded the app. The digest brings them to email — admins on PTO or working in another tool now see queue-build-up by Monday morning. Same shape applies to access requests (D166 sidebar badge) and anomalies (the /anomalies queue).
v0.7.2Production set diff — operators doing rolling discovery (Production 1 → 2 → 3 → ...) one-click compare any two production sets to see what was added / removed / kept- Shipped
New /productions/diff page with picker + 3-section comparison
Without parameters, renders a picker for two productions. With ?a=<id>&b=<id>, three sections: only-in-A (amber), in-both (neutral), only-in-B (emerald). Per-row Bates ranges from each side surface so operators can verify "did we re-produce X with the same Bates range?" or "was this doc dropped between rolling sets?" In-both rows flag versionChanged when the same documentId was produced with different versionHashes (re-stamped or re-redacted) — usually intentional but worth verifying against the privilege log.
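The three-section comparison can be sketched as follows. The types and function name are hypothetical, and the sketch assumes each production lists a given documentId at most once.

```typescript
interface ProducedDoc {
  documentId: string;
  versionHash: string;
  batesBeg: string;
  batesEnd: string;
}

interface InBothRow {
  documentId: string;
  a: ProducedDoc;
  b: ProducedDoc;
  versionChanged: boolean;
}

interface ProductionDiff {
  onlyInA: ProducedDoc[];
  inBoth: InBothRow[];
  onlyInB: ProducedDoc[];
}

function diffProductions(a: ProducedDoc[], b: ProducedDoc[]): ProductionDiff {
  const bById: Record<string, ProducedDoc | undefined> = {};
  for (const doc of b) bById[doc.documentId] = doc;

  const matched: Record<string, boolean> = {};
  const diff: ProductionDiff = { onlyInA: [], inBoth: [], onlyInB: [] };
  for (const docA of a) {
    const docB = bById[docA.documentId];
    if (docB === undefined) {
      diff.onlyInA.push(docA);
      continue;
    }
    matched[docA.documentId] = true;
    // Same document, different bytes: re-stamped or re-redacted between sets.
    const versionChanged = docA.versionHash !== docB.versionHash;
    diff.inBoth.push({ documentId: docA.documentId, a: docA, b: docB, versionChanged });
  }
  diff.onlyInB = b.filter((doc) => !matched[doc.documentId]);
  return diff;
}
```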
- Shipped
"Diff with another production →" link on every /productions/[id] page
Pre-fills the diff page's ?a= param with the current production. Operators on a production detail page click the link, pick the second production from the dropdown, and see the diff in two clicks total.
- Shipped
Bates ranges + sensitivity per side for forensic verification
The Bates columns from EACH side surface side-by-side so operators verify continuity of numbering across rolling productions ("Production 1 ended at SMITH001234; Production 2 starts at SMITH001235 — clean carryover"). Sensitivity tier shown per row in case of cross-production sensitivity drift.
v0.7.1Selective resend of legal-hold notices — operators nudge unacknowledged custodians without rotating tokens for the ones who have already acknowledged- Shipped
New "Nudge unacknowledged (N)" button on /legal-holds/[id]
Visible only when at least one custodian has been notified but hasn't acknowledged. Click to re-send the existing ack URL only to unacknowledged custodians; acknowledged custodians keep their existing tokens + state. Distinct from the existing "Re-send to all" button which rotates everyone's tokens (use after material scope changes). Both buttons coexist with clear titles explaining the semantic difference.
- Shipped
Existing sendNoticesAction extended with onlyUnacknowledged flag
Idempotent / additive change to the D158 action. When the form sends onlyUnacknowledged=1, the SQL filter adds AND acknowledged_at IS NULL; the per-custodian iteration is unchanged. Same audit-event shape (legal-hold.notice-sent per row). The redirect summary carries a mode=unacknowledged param so a future polish pass can make the banner copy reflect the targeted resend.
v0.7.0External recipient email verification on share-links — 6-digit code unlocking on top of TTL + cap + watermark. Defense-in-depth for restricted-recipient productions and HIPAA-scoped deliveries.- Shipped
"Require recipient to verify their email" toggle on the Share via link form
Default off. When enabled, the recipient hitting /share/[token] sees an email entry form instead of the doc. They submit their email, Kodori emails a 6-digit code (15-min TTL), they enter the code, and Kodori sets a 24-hour signed cookie scoped to the share-link. Subsequent visits within the cookie window bypass the gate. Stacks with TTL (D128) + access cap (D161) + watermark (D157) + access notification (D159).
- Shipped
New share_link_verifications table + 3 event types
Migration 0070_share_link_verification.sql adds requires_recipient_verification boolean column on share_links + a new share_link_verifications table with HMAC-hashed codes (keyed with AUTH_SECRET so a code stolen for one link can't be replayed elsewhere). Three new event types: share-link.verification-requested (email entered + code sent), share-link.verification-failed (wrong code; max 5 attempts before requiring a fresh request), share-link.verified (cookie set + access unlocked).
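The attempt and expiry rules can be sketched as a pure decision function. This is an illustration under assumptions: the record shape, the outcome names, and the order of the checks are hypothetical, and the sketch compares already-hashed values. Production code would compute the HMAC and compare in constant time.

```typescript
const CODE_TTL_MS = 15 * 60 * 1000; // 15-minute code TTL from the release note
const MAX_ATTEMPTS = 5; // max wrong codes before a fresh request is required

// Hypothetical view of one share_link_verifications row.
interface PendingVerification {
  codeHash: string;
  requestedAt: Date;
  attempts: number;
}

type Outcome = "verified" | "failed" | "expired" | "locked";

function checkSubmittedCode(
  record: PendingVerification,
  submittedHash: string,
  now: Date
): { outcome: Outcome; attempts: number } {
  if (now.getTime() - record.requestedAt.getTime() > CODE_TTL_MS) {
    return { outcome: "expired", attempts: record.attempts };
  }
  if (record.attempts >= MAX_ATTEMPTS) {
    // Recipient must request a fresh code.
    return { outcome: "locked", attempts: record.attempts };
  }
  const attempts = record.attempts + 1;
  // Plain equality keeps the sketch short; use a constant-time compare in real code.
  const outcome: Outcome = submittedHash === record.codeHash ? "verified" : "failed";
  return { outcome, attempts };
}
```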
- Shipped
Privacy posture preserved on the audit chain
Verification request + verified events log only the email DOMAIN (not the full address) on the audit chain — the rationale: per-recipient ACL leakage shouldn't trickle through the audit log accessible to all admins. The full email is in the share_link_verifications table for in-product traceability but stays out of the immutable event payload.
- Shipped
Cookie scoped per-share-link; new tabs / other links re-prompt independently
kodori_sl_v_<shareLinkId> cookie is HMAC-signed with AUTH_SECRET + email, httpOnly + sameSite=lax + secure in production. The signature scope ensures that a cookie minted for share-link A can't accidentally unlock share-link B. 24-hour TTL — recipients re-verify after a day, balancing usability with defense-in-depth.
- Shipped
6 share-link event types now in /audit filter chips
Share links group expanded to all six lifecycle events: created / accessed / revoked + verification-requested / -failed / -verified. Audit consumers asking "show me every failed verification this week" filter on the new event-type for a clean signal.
v0.6.6Bulk-revoke API keys for offboarding — admins one-click revoke every active key created by a departing user. Plus the missing api-key.revoked audit event lands.- Shipped
New "Offboarding — bulk-revoke all keys created by a user" form on /api-keys
Admin / owner only. Lists every workspace user who has at least one active key (excluding the current admin — self-revoke is refused server-side). Picking a user + Submit revokes every active key whose createdBy matches them. Confirmation banner reports "Offboarded user@firm.com — revoked 3 active keys" with a deep-link back to the audit chain.
- Shipped
New api-key.revoked event type — closes the missing audit-chain gap
Until now, single-row revoke (revokeApiKeyAction) silently flipped revoked_at without emitting an audit event — a real audit gap that this shipment closes. Both the existing single-row revoke AND the new bulk-by-user revoke now emit api-key.revoked. Bulk events carry bulk: true + offboardedUserId + offboardedUserEmail in the payload so audit consumers distinguish offboarding sweeps from individual revokes (same pattern as D163 bulk-revoke share-links).
- Shipped
Refuses to revoke caller's own keys, NOT a foot-gun
bulkRevokeApiKeysByUserAction rejects targetUserId === actorId with a clear "Refusing to bulk-revoke your own keys" message. Single-key self-revoke is still available via the per-row Revoke button.
- Shipped
New api-key.revoked added to /audit filter chips
API keys + webhooks + digests group expanded. Audit consumers asking "show me every api-key revocation this quarter" filter on the new event-type for the answer.
v0.6.5Per-tenant anomaly detection thresholds — high-volume workspaces raise to suppress false positives, low-volume ones lower to catch smaller patterns. Defaults stay tuned for typical usage.- Shipped
4 new tenant.* columns: anomaly_window_minutes + 3 per-kind thresholds
Migration 0069_tenant_anomaly_thresholds.sql adds nullable integer columns to tenants for windowMinutes (5-1440), regulatedReadThreshold, agentVolumeThreshold, holdDenyThreshold. NULL on any column = use the platform default from DEFAULT_THRESHOLDS in packages/workflow/src/detectors/anomaly-detector.ts (60min / 25 / 200 / 5). DB CHECK constraints enforce > 0 (and 5-1440 for window).
- Shipped
New "Anomaly detection thresholds" section on /settings/tenant
Owner / admin only. Four input fields with defaults shown as placeholders. Empty input → revert to platform default. Range-validated server-side (windowMinutes 5-1440, others 1-100,000). Saving emits tenant.anomaly-thresholds-set on the audit chain with from/to deltas. Idempotent — no-op saves emit no event.
- Shipped
anomaly-sweep cron reads per-tenant overrides on every scan pass
The 15-minute Inngest sweep now LEFT JOINs the threshold columns when listing tenants and merges them with DEFAULT_THRESHOLDS via ?? cascade before invoking detectAnomalies. Tuning takes effect on the NEXT sweep tick (within 15 minutes of save). No restart, no migration, no env-var flip needed.
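The `??` cascade can be sketched as below. The field names follow the release note, and the default values are the quoted 60min / 25 / 200 / 5, mapped in the order the columns are listed; the `TenantOverrides` type and the function name are assumptions.

```typescript
interface Thresholds {
  windowMinutes: number;
  regulatedReadThreshold: number;
  agentVolumeThreshold: number;
  holdDenyThreshold: number;
}

// Platform defaults as quoted in the release note.
const DEFAULT_THRESHOLDS: Thresholds = {
  windowMinutes: 60,
  regulatedReadThreshold: 25,
  agentVolumeThreshold: 200,
  holdDenyThreshold: 5,
};

// Each tenant column is nullable; NULL means "use the platform default".
type TenantOverrides = { [K in keyof Thresholds]: number | null };

function resolveThresholds(overrides: TenantOverrides): Thresholds {
  return {
    windowMinutes: overrides.windowMinutes ?? DEFAULT_THRESHOLDS.windowMinutes,
    regulatedReadThreshold: overrides.regulatedReadThreshold ?? DEFAULT_THRESHOLDS.regulatedReadThreshold,
    agentVolumeThreshold: overrides.agentVolumeThreshold ?? DEFAULT_THRESHOLDS.agentVolumeThreshold,
    holdDenyThreshold: overrides.holdDenyThreshold ?? DEFAULT_THRESHOLDS.holdDenyThreshold,
  };
}
```

Using `??` rather than `||` matters only if zero were a valid override; the CHECK constraints rule that out, but `??` keeps the intent (null means default) explicit.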
- Shipped
New tenant.anomaly-thresholds-set event added to /audit filter chips
Tenancy & billing chip group expanded. Audit consumers asking "who tuned anomaly detection in our workspace?" filter on the new event-type for the answer. The delta payload format mirrors tenant.settings-updated so existing consumers parsing settings deltas can extend with one additional event type.
v0.6.4Inactive-user filter on /members — admins one-click filter to members with no sign-in for N days for offboarding decisions- Shipped
New "Inactive 90+ days" filter chip on /members
Admin / owner only. Toggles between "All N members" and "Inactive 90+ days (M)" so the operator sees the offboarding queue at a glance. Adjustable threshold input (1-365 days, default 90) for workspaces with longer activity cycles. Inactive = lastSignInAt IS NULL OR lastSignInAt < now - N days — covers both never-signed-in accounts AND accounts that have gone dark. Builds directly on D168's lastSignInAt column.
- Shipped
Each inactive member row keeps the existing per-row Remove + role-change affordances
No new offboarding flow — the existing /members affordances (Remove button per row, role-change dropdown, Make-owner) all work identically inside the filtered view. Operators triage the inactive list, click Remove on the rows that should leave, and the standard removeMemberAction handles the rest (move to a fresh personal tenant, preserving the user's account + audit history while detaching them from this workspace).
- Shipped
In-memory filter on tenant-scoped query, NOT a separate offboarding-report query
The members query already loads all tenant users + their lastSignInAt; filtering happens in-memory after the read. Avoids a second DB round-trip + keeps the SQL simple. Even for workspaces with around 1000 members, the in-memory cost is negligible (1000 timestamps to filter). Revisit if a customer reports slow page loads on a multi-thousand-member workspace.
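The in-memory filter amounts to one pass over the loaded rows. A minimal sketch, assuming a `Member` shape and function name that are not from the source; the predicate itself is the one stated above.

```typescript
interface Member {
  email: string;
  lastSignInAt: Date | null;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Inactive = never signed in, OR last sign-in older than `days` days.
function filterInactive(members: Member[], now: Date, days: number = 90): Member[] {
  const cutoff = now.getTime() - days * DAY_MS;
  return members.filter((m) => m.lastSignInAt === null || m.lastSignInAt.getTime() < cutoff);
}
```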
v0.6.3Auto-delete reminder email — operators get a 14-day-out warning before the cron tombstones a scheduled doc, closing the "I forgot it was scheduled" failure mode- Shipped
Daily auto-delete sweep extended with a 14-day warning leg
The existing document-auto-delete-sweep cron (D171, runs at 02:00 UTC daily) now ALSO finds docs whose auto_delete_at is between now and now+14 days where last_auto_delete_warning_sent_at is null OR older than 13 days. For each, emails the doc creator + workspace owners + admins (deduped, capped at 50 recipients per doc) with the doc name, ID, scheduled date, reason, and a deep link to the doc. Stamps lastAutoDeleteWarningSentAt BEFORE the send so a transient retry doesn't double-fire.
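The selection rule for the warning leg can be sketched as a predicate. The `ScheduledDoc` shape and function name are hypothetical; the 14-day window and 13-day re-warn threshold come from the release note.

```typescript
interface ScheduledDoc {
  id: string;
  autoDeleteAt: Date;
  lastAutoDeleteWarningSentAt: Date | null;
}

const DAY_IN_MS = 24 * 60 * 60 * 1000;

// True when the doc is due within 14 days and has not been warned in the last 13.
function needsAutoDeleteWarning(doc: ScheduledDoc, now: Date): boolean {
  const dueAt = doc.autoDeleteAt.getTime();
  const inWindow = dueAt >= now.getTime() && dueAt <= now.getTime() + 14 * DAY_IN_MS;
  const last = doc.lastAutoDeleteWarningSentAt;
  const notRecentlyWarned = last === null || now.getTime() - last.getTime() > 13 * DAY_IN_MS;
  return inWindow && notRecentlyWarned;
}
```

In the sweep itself, the timestamp would be stamped before the email goes out, so a retried run sees `notRecentlyWarned === false` and skips the doc.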
- Shipped
Re-arms the warning timer on every set / clear / cancel
setAutoDeleteAction and clearAutoDeleteAction now both clear last_auto_delete_warning_sent_at — so a freshly-scheduled date generates a fresh 14-day-out reminder later. Successful tombstones in the cron also null the column to keep state clean post-deletion.
- Shipped
New document.auto-delete-warning-sent event type on the audit chain
Each warning email emits the event with documentId + autoDeleteAt + recipientsCount in the payload. Audit consumers asking "how many docs auto-deleted without warning this quarter?" filter on tombstone events that lack a preceding warning event on the same stream — clear forensic shape for "the operator was notified and ignored it" vs "the operator never got a heads-up".
- Shipped
/audit filter chip catalog Documents group expanded with 10 D164-D177 event types
delete-requested / .delete-approved / .delete-rejected (D164), marked-as-template / .unmarked-as-template / .cloned-from-template (D170), auto-delete-scheduled / .auto-delete-cleared / .auto-delete-warning-sent / .auto-delete-blocked-by-hold (D171 + D177). Closes the rough edge where these events were appearing on the chain but invisible to the type-filter UI.
v0.6.2Ad-hoc CSV export from /search — operators dump interactive search results to CSV without first saving the search- Shipped
New "Export CSV ↓" button on /search alongside the Save this search affordance
Visible whenever a query is active. Opens /api/search/export.csv?q=<query>&sensitivity=<>&mime=<> with the current /search filters. Streams up to 1000 hits as RFC 4180 CSV with the same column shape as the saved-search export (D172): documentId, displayName, mimeType, sensitivityLabel, sizeBytes, currentVersionHash, createdAt, lastModifiedAt. Header preamble carries the query, active filters, export timestamp, and row count + (capped) flag.
- Shipped
Same enforcement as the saved-search export — permission-trimmed, FTS only, capped at 1000
Reuses the canReadDocument SQL helper and the websearch_to_tsquery / display-name ILIKE shape from D172. Closes the friction of "save the search first" — operators running a one-off interactive query can dump the results immediately without polluting their saved-searches list.
v0.6.1Annotation thread auto-resolver — stale threads (30+ days inactive, unresolved) get auto-resolved by a daily cron, reducing ping-fatigue from old @-mentions on closed matters- Shipped
Daily Inngest cron at 04:00 UTC sweeps stale unresolved annotation threads
New annotation-stale-resolver finds root annotations (parentId IS NULL) where: resolvedAt IS NULL (still open), createdAt < now - 30 days, AND no reply newer than 30 days, AND the underlying document is still live. Auto-resolves them with actorKind=system + actorId=system:annotation-stale-resolver. Per-run cap of 500 to bound a single execution.
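The staleness criteria can be read as a single predicate over root threads. A minimal sketch: the `RootThread` shape (including a precomputed `lastReplyAt`) and the function names are assumptions, while the 30-day threshold and 500-per-run cap are from the release note.

```typescript
// Hypothetical projection of a root annotation (parentId IS NULL).
interface RootThread {
  id: string;
  resolvedAt: Date | null;
  createdAt: Date;
  lastReplyAt: Date | null; // newest reply, or null when there are none
  documentLive: boolean;
}

const STALE_DAYS = 30;
const PER_RUN_CAP = 500;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isStaleThread(t: RootThread, now: Date): boolean {
  const cutoff = now.getTime() - STALE_DAYS * MS_PER_DAY;
  const noRecentReply = t.lastReplyAt === null || t.lastReplyAt.getTime() < cutoff;
  return t.resolvedAt === null && t.createdAt.getTime() < cutoff && noRecentReply && t.documentLive;
}

// Bound a single execution to the per-run cap.
function selectStaleThreads(threads: RootThread[], now: Date): RootThread[] {
  return threads.filter((t) => isStaleThread(t, now)).slice(0, PER_RUN_CAP);
}
```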
- Shipped
Reuses existing annotation.resolved event with payload.autoResolved=true
No new event type added. The existing annotation.resolved event's payload now carries autoResolved=true + staleDays=30 when the cron fires it, distinguishing cron-resolved threads from operator-resolved ones for audit consumers without polluting the EventTypeSchema. This matches the pattern used in D163 (bulk: true on share-link.revoked) and D158 (cancelledByRequester: true on legal-hold delete events).
- Shipped
Closes the @-mention ping-fatigue loop on closed matters
Operators who get @-mentioned on a doc related to a now-closed matter were previously stuck with the open mention in /mentions forever — auto-resolve clears the @-mention from the inbox once the surrounding matter activity dies down. The audit chain still records the original mention; the resolution is the cleanup, not the deletion.
v0.6.0Per-subscription webhook retry policy — admins tune retries per subscription instead of relying on a single global default. Flaky downstreams get fewer retries; critical receivers get more.- Shipped
New max_retry_attempts column on webhook_subscriptions (range 1-10, default 4)
Migration 0067_webhook_max_retry_attempts.sql adds the column with a CHECK constraint enforcing the 1-10 range. Default 4 matches the prior global behavior so existing subscriptions keep working identically until an admin tunes them. Set via the new "Retries" input on the /webhooks per-subscription row.
- Shipped
Inngest function-level retries bumped to 10 (the upper bound across all subs)
The function-level cap is the absolute ceiling; the per-subscription max_retry_attempts column is the operator-controllable knob BELOW it. The deliver function reads the per-sub cap on each retry and self-aborts with a recorded failed-delivery row when attempts >= cap. Because the self-abort returns normally instead of throwing, Inngest treats the run as finished and schedules no further retries.
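The interaction between the two caps can be sketched without any Inngest code. The function names are hypothetical, and `attemptsSoFar` is assumed to count deliveries already tried for the event.

```typescript
const FUNCTION_LEVEL_CEILING = 10; // absolute ceiling across all subscriptions

// Keep an operator-supplied value inside the 1-10 range the CHECK constraint enforces.
function clampRetryCap(value: number): number {
  return Math.min(FUNCTION_LEVEL_CEILING, Math.max(1, Math.floor(value)));
}

type AttemptDecision = "deliver" | "abort";

// "abort" means: record a failed-delivery row and return normally,
// so the scheduler has nothing left to retry.
function decideAttempt(attemptsSoFar: number, maxRetryAttempts: number): AttemptDecision {
  return attemptsSoFar >= clampRetryCap(maxRetryAttempts) ? "abort" : "deliver";
}
```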
- Shipped
New "Retries" input on /webhooks per subscription
Number input alongside Pause and Revoke. Range 1-10. Save persists the new value + emits webhook.retry-policy-set on the audit chain with previousMaxRetryAttempts + maxRetryAttempts in the payload. Idempotent: saving the same value emits no event. Takes effect on the NEXT delivery attempt; in-flight retries finish under the prior cap.
- Shipped
webhook.retry-policy-set event added to /audit filter chips
New event type registered in EventTypeSchema and surfaced in the /audit filter dropdown under "API keys + webhooks + digests" (the group label was extended to acknowledge webhooks). Audit consumers asking "who tuned which subscription's retries this quarter?" filter on the new event-type for the answer.
v0.5.9Production manifest CSV export — the standard ediscovery deliverable opposing counsel expects alongside a production share-link, generated in one click- Shipped
New /api/productions/[id]/manifest.csv route
GET the route while signed in (production must belong to the caller's tenant). Streams an RFC 4180 CSV with one row per produced doc: documentId, batesBeg, batesEnd, pageCount, displayName, mimeType, sensitivityLabel, versionHash. Sorted by Bates ranges (asc) so the manifest tracks the produced sequence. Header preamble carries the production name, matter ref, recipient, produced-at timestamp, Bates range, and document count — ediscovery vendors' ingestion scripts can detect + skip the # prefix.
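A sketch of how such a manifest could be assembled. The column list is from the release note; the preamble line format, the helper names, and plain string ordering of Bates numbers (which assumes a shared prefix and zero padding) are assumptions.

```typescript
interface ManifestRow {
  documentId: string;
  batesBeg: string;
  batesEnd: string;
  pageCount: number;
  displayName: string;
  mimeType: string;
  sensitivityLabel: string;
  versionHash: string;
}

const MANIFEST_COLUMNS = [
  "documentId", "batesBeg", "batesEnd", "pageCount",
  "displayName", "mimeType", "sensitivityLabel", "versionHash",
];

// RFC 4180 quoting: wrap fields containing comma, quote, CR or LF; double any quotes.
function csvField(value: string | number): string {
  const s = String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

function buildManifestCsv(preamble: string[], rows: ManifestRow[]): string {
  // Ascending by batesBeg so the manifest tracks the produced sequence.
  const sorted = rows.slice().sort((x, y) => (x.batesBeg < y.batesBeg ? -1 : x.batesBeg > y.batesBeg ? 1 : 0));
  const lines = preamble.map((p) => "# " + p); // "#"-prefixed lines can be skipped on ingest
  lines.push(MANIFEST_COLUMNS.join(","));
  for (const r of sorted) {
    const fields = [r.documentId, r.batesBeg, r.batesEnd, r.pageCount, r.displayName, r.mimeType, r.sensitivityLabel, r.versionHash];
    lines.push(fields.map((f) => csvField(f)).join(","));
  }
  return lines.map((line) => line + "\r\n").join(""); // RFC 4180 uses CRLF
}
```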
- Shipped
versionHash captures EXACT bytes produced — verifiable post-delivery
The manifest references the exact version hash that was captured at recordProduction time, NOT the doc's current version. So a recipient verifying delivery against the manifest sees the bytes that were actually delivered, even if the doc has been re-stamped or further redacted in Kodori since the production. Same chain-of-custody read as the production share-link — the manifest is the textual evidence of what was sent.
- Shipped
"Download manifest CSV ↓" button on /productions/[id]
Inline alongside the existing "Share via link" button. One-click download, no UI walking, no confirmation modal — the manifest is a read-only artifact about a production already recorded. Works regardless of whether the production has a share-link issued (for productions delivered out-of-band, the manifest is still the primary record).
v0.5.8Saved-search results CSV export — dump every doc matching a saved search to RFC 4180 CSV for ediscovery custodian-list compilation, partner reporting, or external delivery- Shipped
New /api/saved-searches/[id]/export.csv route — up to 1000 hits per export
GET the route while signed in (saved search must belong to the caller). Streams an RFC 4180 CSV with columns: documentId, displayName, mimeType, sensitivityLabel, sizeBytes, currentVersionHash, createdAt, lastModifiedAt. Permission-trimmed via canReadDocument so the export only includes rows the operator can read. X-Kodori-Cap-Hit response header reports whether the 1000-row cap was reached.
- Shipped
Postgres FTS only — same cost posture as the saved-search alerts dispatcher
Mirrors D154's decision to skip semantic / vector embeddings on per-event work. The dominant operator intent is "dump every doc matching this saved search" which is keyword-shaped — semantic-only matches that don't hit the FTS query don't appear in the export. websearch_to_tsquery handles natural-language queries ("smith engagement") so operators don't need to know FTS query syntax.
- Shipped
Saved search's sensitivity + mime-family filters carry through
A saved search with `sensitivity=confidential` exports only confidential matches; `mimeFamily=pdf` exports only PDF matches. Same filter shape as /search and the saved-search dispatcher — operators get exactly the rows they'd see if they ran the search interactively.
- Shipped
Export click audit-logged via saved-search-alert.fired with kind="csv-export"
Best-effort append on every export so operators reviewing the audit chain see who exported which saved search when, with row-count and cap-hit metadata. Distinguishable from email-fired events by the `kind` field on the payload.
- Shipped
"Export matching docs to CSV ↓" link on /search/alerts per saved search
Inline ochre link below the saved-search metadata. Two-click access from the dashboard: open /search/alerts → click the export link on the row.
v0.5.7Per-doc scheduled deletion — set an expiration date on a single document, daily cron auto-tombstones at the date, legal-hold deny-wins still applies- Shipped
Schedule auto-deletion on /doc/[id] (admin / owner only)
New "Scheduled deletion" section above the Template section. Pick a future date + (optional) reason, click Schedule. Capped at 10 years out. Common patterns: NDAs that expire 5 years post-signing, marketing collateral with campaign deadlines, contractor records with mandatory purge dates. Distinct from retention classes — those are class-level POLICY with human-confirmed disposal via /retention/review; auto-delete is a one-off automatic action with no human in the loop at deletion time. Cancellable at any time before the date arrives via the "Cancel scheduled deletion" button.
- Shipped
Daily Inngest sweep tombstones expired docs at 02:00 UTC
New document-auto-delete-sweep cron finds live docs whose auto_delete_at has passed; invokes the standard tombstoneDocumentTool with reason="Auto-delete scheduled on YYYY-MM-DD: <operator reason>". Per-run cap of 200 docs. Cleared rows: when the cron successfully tombstones, it also nulls out auto_delete_at on the row to keep the partial index small.
- Shipped
Legal-hold deny-wins still applies — held docs survive past their auto-delete date
tombstoneDocumentTool refuses held docs regardless of caller (operator or cron). The sweep catches the refusal and emits a dedicated `document.auto-delete-blocked-by-hold` event so the audit log distinguishes "held doc survived past expiry" from "tombstone successful" — forensic clarity for auditors reviewing why a doc the operator scheduled for deletion is still there. The held doc keeps its auto_delete_at value; once the hold releases, the next sweep catches it.
- Shipped
Four new event types — full lifecycle on the audit chain
document.auto-delete-scheduled (operator), .auto-delete-cleared (operator cancellation), .auto-deleted (cron success — emitted via the standard document.tombstoned event), .auto-delete-blocked-by-hold (cron refusal). Audit consumers asking "every doc auto-deleted this quarter" filter on document.tombstoned with payload reason starting "Auto-delete scheduled" — or use the .auto-delete-scheduled stream for the lineage from intent to execution.
v0.5.6Document templates — mark any doc as a template, then "New from template" creates fresh docs at the same content hash with new metadata in one click- Shipped
Mark / unmark as template on the doc detail page
New "Template" section above Danger zone on /doc/[id]. One-click toggle promotes any live doc into a template; another click demotes it back. Idempotent — re-toggling to the same value emits no event spam. Audit-logged via document.marked-as-template / .unmarked-as-template.
- Shipped
New /templates surface lists every template in the workspace
Each template row shows the display name, mime type, sensitivity badge, creator, and an inline "New from template" form: enter the new doc name + (optional) target collection, click Create. The new doc inherits the template's currentVersionHash directly — content-addressable storage means zero blob duplication, infinite templates cost the same as a single uploaded doc. Sensitivity tier + mime type carry forward; metadata gets a fresh empty payload tagged with the template lineage in `clonedFrom`.
- Shipped
Optional one-click pinning into a target collection
When the operator picks a collection in the "New from template" form, the create transaction also writes a collection_members row in the same atomic step — the new doc is in the collection by the time the redirect to /doc/[id] lands. No second click, no race-window.
- Shipped
Three new event types — document.marked-as-template / .unmarked-as-template / .cloned-from-template
The hash-chained audit log captures the full lifecycle. The cloned-from-template event payload carries `templateId`, `versionHash`, and `collectionId` (when pinned) so audit consumers can answer "what came out of the Smith engagement template?" with one stream filter.
- Shipped
Sidebar nav — Documents > Templates
New Templates entry under the Documents group on the workspace sidebar so the surface is one click from anywhere in the app.
v0.5.5API key expiration + rotation reminders — set 30/90/180/365-day expirations on keys, get an email when a key is 7 days from expiring, auth refuses requests after expiry- Shipped
Set expiration on key creation + edit on existing keys
New "Expiration" picker on the create-API-key form (Never / 30 / 90 / 180 / 365 days). Existing keys get an "Expiration" section on /api-keys/[id]/usage with the same picker. Capped at 365 days from now to align with standard rotation policies. NULL means "never" (manual revoke is the only deactivation path).
- Shipped
Auth path refuses expired keys with a 401
verifyApiKey() now checks expires_at after the row lookup but before the secret-hash comparison — every request after expiry returns 401 with reason=key-expired. Effectively auto-revokes the key without requiring a separate revoke action. Same 401 response shape as a revoked key for parity.
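The check ordering can be sketched as follows. This is not the real verifyApiKey(): the row shape, the position of the revoked check, the boundary rule (expired once now >= expires_at), and every reason string except key-expired are assumptions.

```typescript
// Hypothetical projection of an api_keys row.
interface ApiKeyRow {
  secretHash: string;
  revokedAt: Date | null;
  expiresAt: Date | null; // null = never expires
}

type AuthResult = { status: 200 } | { status: 401; reason: string };

function verifyApiKeySketch(row: ApiKeyRow | undefined, presentedSecretHash: string, now: Date): AuthResult {
  if (row === undefined) return { status: 401, reason: "key-not-found" };
  if (row.revokedAt !== null) return { status: 401, reason: "key-revoked" };
  // Expiry is checked after the row lookup but before the secret comparison.
  if (row.expiresAt !== null && now.getTime() >= row.expiresAt.getTime()) {
    return { status: 401, reason: "key-expired" };
  }
  // Plain equality keeps the sketch short; real code should compare in constant time.
  if (presentedSecretHash !== row.secretHash) return { status: 401, reason: "bad-secret" };
  return { status: 200 };
}
```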
- Shipped
Daily Inngest sweep — emails owners + admins inside the 7-day window
New api-key-expiration-sweep cron runs daily at 03:00 UTC. Finds active (non-revoked) keys whose expires_at is within 7 days; emails workspace owners + admins with the key name, prefix, expiry date, and a /api-keys deep link. Throttled via lastExpirationWarningSentAt — same key doesn't generate a daily reminder for the full 7-day window (re-warns at most once per ~6 days). Also emits api-key.expired events for keys whose expiry has just passed.
- Shipped
Three new event types — full lifecycle on the audit log
api-key.expiration-set (operator action), api-key.expiration-warning-sent (cron action, inside window), api-key.expired (cron action, post-expiry). The hash-chained audit log thus records who set the expiration, when each warning was sent, and the moment the key crossed the line. /audit filter coverage updated.
- Shipped
/api-keys list surfaces "expires in 5d" / "expired 3d ago" inline
Per-row expiration badge on the active-keys list: "no expiration" (gray), "expires YYYY-MM-DD" (gray, > 7 days out), "expires in Nd" (amber, ≤ 7 days), "expired Nd ago" (red, past). Same info on the /usage page header.
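The four badge states map onto a small function. The badge texts and colors are from the release note; the rounding (ceil for upcoming, floor for past) and the names are assumptions.

```typescript
type Badge = { text: string; color: "gray" | "amber" | "red" };

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

function expirationBadge(expiresAt: Date | null, now: Date): Badge {
  if (expiresAt === null) return { text: "no expiration", color: "gray" };
  const deltaMs = expiresAt.getTime() - now.getTime();
  if (deltaMs < 0) {
    return { text: "expired " + Math.floor(-deltaMs / ONE_DAY_MS) + "d ago", color: "red" };
  }
  if (deltaMs <= 7 * ONE_DAY_MS) {
    return { text: "expires in " + Math.ceil(deltaMs / ONE_DAY_MS) + "d", color: "amber" };
  }
  // More than 7 days out: show the calendar date (UTC).
  return { text: "expires " + expiresAt.toISOString().slice(0, 10), color: "gray" };
}
```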
v0.5.4Last-seen tracking on /members — every member row now shows when they last signed in for offboarding decisions and SOC 2 user-activity reviews- Shipped
New last_sign_in_at column on users
Migration 0063_user_last_sign_in.sql adds last_sign_in_at timestamptz + an index keyed on (tenant_id, last_sign_in_at DESC NULLS LAST) for fast "who hasn't signed in in 90 days?" admin queries. NULL until the user first signs in after the deploy — existing users start with no value and populate on their next sign-in.
- Shipped
JIT user-upsert path stamps lastSignInAt on every sign-in
upsertUserOnSignIn now updates lastSignInAt on all three branches: existingByExternal lookup (the dominant path for returning users), legacy (tenantId, email) fallback, and first-time user creation (stamps at insert). Auth.js JWT callback runs the upsert per session refresh, so the indicator stays current within seconds of activity.
- Shipped
/members surfaces "Last seen X" alongside the joined date
New formatLastSeen helper renders per-row: "last seen today", "last seen 3d ago", "last seen 2mo ago", or "last seen YYYY-MM-DD" for older sign-ins. Accounts that have never signed in since the deploy show "last seen never". Owners + admins making offboarding decisions ("this admin hasn't logged in for 6 months: deactivate?") see the diagnostic at a glance instead of grepping the audit log.
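A sketch of what such a helper might look like. The output strings are from the release note, but the bucket boundaries (under 1 day, under 30 days, under 365 days, 30-day months) are guesses, so the function is named as a sketch rather than as the real formatLastSeen.

```typescript
const LAST_SEEN_DAY_MS = 24 * 60 * 60 * 1000;

function formatLastSeenSketch(lastSignInAt: Date | null, now: Date): string {
  if (lastSignInAt === null) return "last seen never";
  const days = Math.floor((now.getTime() - lastSignInAt.getTime()) / LAST_SEEN_DAY_MS);
  if (days < 1) return "last seen today";
  if (days < 30) return "last seen " + days + "d ago";
  // Assumed: months approximated as 30-day blocks.
  if (days < 365) return "last seen " + Math.floor(days / 30) + "mo ago";
  return "last seen " + lastSignInAt.toISOString().slice(0, 10);
}
```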
v0.5.3Internal download watermarking — confidential+ PDFs downloaded by workspace members now carry the workspace, downloader email, and access date burned in- Shipped
PDFs at sensitivity ≥ confidential get a per-page watermark on internal download
Extends D157 (which only watermarked share-link PDFs) to all internal PDF downloads. Three-layer stamp: header bar (workspace name + sensitivity tier), diagonal CONFIDENTIAL / RESTRICTED / REGULATED stamp at 18% opacity ochre at the page center, footer reading "Downloaded by <email> on YYYY-MM-DD · Kodori". Closes the chain-of-custody gap where an internal user could download a regulated doc to their laptop without any visible mark — now every screenshot or forwarded copy traces back to the originating download via the email + date in the footer.
- Shipped
Public + internal docs keep the existing 302 redirect path
Watermarking only fires when sensitivity ≥ confidential AND mime is PDF. public + internal docs continue to redirect 302 to a presigned R2 URL — straight from blob storage, no Vercel egress. Confidential+ traffic streams through the watermark route, which is a small share of total download volume but the share that matters for governance.
- Shipped
Stamp text tracks sensitivity tier — CONFIDENTIAL / RESTRICTED / REGULATED
A regulated doc reads visibly differently from a confidential one — operators forwarding a screenshot of a download see the actual tier the document was classified at, not a generic CONFIDENTIAL stamp. Same diagonal ochre at 18% opacity for all three tiers; the text is the only difference.
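The routing rule for internal downloads can be sketched as a pure decision function. The tier ordering, the `planDownload` name, and the result shape are assumptions for illustration; only the two conditions (sensitivity at or above confidential, PDF mime) and the stamp and footer text come from the notes above.

```typescript
// Hypothetical sketch of the download routing decision.
type Sensitivity = "public" | "internal" | "confidential" | "restricted" | "regulated";

// Assumed ordering from least to most sensitive.
const TIER_ORDER: Sensitivity[] = ["public", "internal", "confidential", "restricted", "regulated"];

type DownloadPlan =
  | { kind: "redirect"; status: 302 }
  | { kind: "watermark"; stampText: string; footer: string };

function planDownload(
  sensitivity: Sensitivity,
  mime: string,
  downloaderEmail: string,
  accessDate: Date,
): DownloadPlan {
  const confidentialOrAbove =
    TIER_ORDER.indexOf(sensitivity) >= TIER_ORDER.indexOf("confidential");

  // Everything else keeps the presigned-URL redirect path.
  if (!confidentialOrAbove || mime !== "application/pdf") {
    return { kind: "redirect", status: 302 };
  }

  const day = accessDate.toISOString().slice(0, 10);
  return {
    kind: "watermark",
    stampText: sensitivity.toUpperCase(), // CONFIDENTIAL / RESTRICTED / REGULATED
    footer: `Downloaded by ${downloaderEmail} on ${day} · Kodori`,
  };
}
```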
v0.5.2Per-resource access requests — workspace members can ask for read on a doc / collection they can't see, owners approve from a queue- Shipped
New /request-access form for any workspace member
Member pastes a document or collection id + an optional reason and submits. Privacy-preserving by design — Kodori does NOT confirm the resource exists before queueing the request, so a typo or random UUID just sits in the queue until denied without leaking any information about which ids exist.
- Shipped
New /access-requests admin queue
Owner / admin only. Resolves resource names where possible (live document name, collection name) for context; flags "resource not found" with an amber chip when the request points at a missing or tombstoned id (Grant button disables; operators are nudged to Deny instead). Grant + Deny affordances inline; deny takes an optional decision-note that lands on the audit chain.
- Shipped
Granting creates a permission row + emits TWO audit events
access-request.granted on the access-request stream, AND a standard permission.granted event on the resource's stream with `grantedViaAccessRequest: <id>` cross-reference in the payload. The audit chain treats granted-via-request identically to admin-issued grants, while the cross-reference lets audit consumers trace "where did this grant come from?" backwards.
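As an illustration of the two-event shape, the sketch below builds both events for one grant. The event type names and the `grantedViaAccessRequest` field come from the note above; the stream naming (`access-request/<id>`, `<kind>/<id>`), the other payload fields, and the `buildGrantEvents` helper are assumptions.

```typescript
// Hypothetical sketch: stream names and payload fields other than
// grantedViaAccessRequest are assumptions.
interface AuditEvent {
  stream: string;
  type: string;
  actorId: string;
  payload: Record<string, unknown>;
}

interface AccessRequest {
  id: string;
  resourceKind: "document" | "collection";
  resourceId: string;
  requesterId: string;
}

function buildGrantEvents(request: AccessRequest, approverId: string): AuditEvent[] {
  return [
    {
      // Event 1: closes out the request on its own stream.
      stream: `access-request/${request.id}`,
      type: "access-request.granted",
      actorId: approverId,
      payload: { resourceKind: request.resourceKind, resourceId: request.resourceId },
    },
    {
      // Event 2: the same event an admin-issued grant would emit,
      // plus a cross-reference back to the originating request.
      stream: `${request.resourceKind}/${request.resourceId}`,
      type: "permission.granted",
      actorId: approverId,
      payload: {
        granteeId: request.requesterId,
        permission: "read",
        grantedViaAccessRequest: request.id,
      },
    },
  ];
}
```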
- Shipped
Sidebar badge for owners + admins
When the access-request queue has pending items, the count surfaces as an ochre badge next to the new Governance > Access requests nav entry. Hidden when zero. Parallels the D164 pending-deletions badge pattern. New Workspace > Request access nav entry is always visible to non-admin members so they can find the form without bookmarking the URL.
- Shipped
Email notification to owners + admins on submit
New sendAccessRequestEmail Resend helper fans out to up to 50 owner/admin recipients per submission. Body carries the requester email, resource kind + id, the reason (if provided), and a deep link to /access-requests. Best-effort — email failure doesn't block the request submission since the queue is the load-bearing surface.
v0.5.1Smart redaction suggestions — Haiku scans the doc text and proposes a checklist of SSN / CC / DOB / privileged-language candidates before you ship the production- Shipped
New "Privacy scan" button on /doc/[id]/redact
Click and Kodori loads the document's extracted text, runs Haiku via the existing model provider with a structured-output prompt, and returns a list of redaction candidates across 12 categories: us-ssn, credit-card, bank-account, phone-number, email-address, date-of-birth, street-address, medical-record-number, attorney-client-privileged, attorney-work-product, trade-secret, other-pii. Each candidate carries a verbatim snippet, a one-sentence reasoning, a confidence band (high / medium / low), and a page number when the extractor preserved it.
- Shipped
Checklist UI sits next to the redaction canvas
Color-graded chips by category (PII red, contact blue, identifiers amber, privilege purple, trade secret emerald). Per-card dismiss to hide false positives without re-running the scan. Operator uses the list as a checklist while drawing redaction boxes manually on the canvas — Kodori does not auto-draw because the extracted text doesn't carry per-character coordinates (revisit when we add an OCR coordinate map). The dominant operator intent — "what should I be looking for?" — is the load-bearing value, not pixel-precise auto-redaction.
- Shipped
60k char input cap + Haiku for cost-flat scanning
The scan caps at 60,000 chars of input, which covers a typical 100-page deposition transcript. For longer docs, the operator runs the scan, burns the first batch of redactions, then re-extracts and re-runs the scan on the cleaned bytes. Haiku 4.5 keeps the per-scan cost negligible (~$0.001) so daily use is not a budget concern. Permission-trimmed via userCanReadDocument before extracted text is loaded.
v0.5.0Two-person delete on regulated documents — dual-control governance for healthcare / finance / government workspaces- Shipped
Regulated-sensitivity docs require a SECOND admin to approve before tombstone fires
Click "Request deletion" on a regulated document and instead of an immediate tombstone, Kodori creates a pending_deletions row carrying the requester, reason, and 14-day expiry. The doc-detail Danger zone now surfaces the pending state instead of a Delete button — clearly says "Pending two-person delete approval — requested by X on date — expires Y". Other sensitivities (public / internal / confidential / restricted) continue to use the existing single-person delete flow.
- Shipped
New /pending-deletions admin queue surfaces all open requests
Owner / admin only. Lists every active pending-deletion with the doc name, the requester (the server enforces that the approver is not the requester), the reason, and Approve / Reject affordances. Approve invokes the standard tombstone path; the legal-hold deny-wins gate still applies if a hold was applied between request and approval. Reject takes an optional rejection note that lands on the audit chain.
- Shipped
Three new event types — full lifecycle on the audit log
document.delete-requested, document.delete-approved, document.delete-rejected. The hash-chained audit log thus records who requested, who approved (or rejected, with their note), and when. Approval records BEFORE the tombstone fires so the chain captures the human decision separate from the destruction event. Cancellation by the original requester emits a delete-rejected with `cancelledByRequester: true` so the audit can distinguish "rejected by reviewer" from "self-cancelled".
- Shipped
Partial unique index — only one ACTIVE pending request per doc at a time
Approved + rejected rows are preserved as audit evidence; a fresh pending request can coexist with prior decisions on the same doc (when an operator tries again after rejection). Reduces "ghost requests" by surfacing a unique-constraint violation if two operators submit overlapping requests.
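The two-person flow can be modeled in memory to show how the rules interact: one active request per doc, approver different from requester, 14-day expiry, and a self-cancel flag on rejection. This is a sketch under stated assumptions; the class, its method names, and the treatment of expired rows are hypothetical, and the real constraint lives in a Postgres partial unique index rather than application code.

```typescript
// Hypothetical in-memory model of the pending_deletions lifecycle.
type Status = "pending" | "approved" | "rejected";

interface PendingDeletion {
  docId: string;
  requesterId: string;
  reason: string;
  expiresAt: Date;
  status: Status;
}

interface DeleteEvent {
  type: "document.delete-requested" | "document.delete-approved" | "document.delete-rejected";
  docId: string;
  actorId: string;
  payload: Record<string, unknown>;
}

const EXPIRY_MS = 14 * 24 * 60 * 60 * 1000;

class PendingDeletionQueue {
  readonly rows: PendingDeletion[] = [];
  readonly events: DeleteEvent[] = [];

  private active(docId: string): PendingDeletion | undefined {
    return this.rows.find((r) => r.docId === docId && r.status === "pending");
  }

  request(docId: string, requesterId: string, reason: string, now: Date): void {
    // Stands in for the partial unique index on active rows.
    if (this.active(docId)) throw new Error("unique violation: active request exists");
    const expiresAt = new Date(now.getTime() + EXPIRY_MS);
    this.rows.push({ docId, requesterId, reason, expiresAt, status: "pending" });
    this.events.push({ type: "document.delete-requested", docId, actorId: requesterId, payload: { reason } });
  }

  approve(docId: string, approverId: string, now: Date): void {
    const row = this.active(docId);
    if (!row) throw new Error("no active request");
    if (row.requesterId === approverId) throw new Error("approver must differ from requester");
    if (now.getTime() > row.expiresAt.getTime()) throw new Error("request expired");
    row.status = "approved";
    // The approval is recorded first; the tombstone would fire after this point.
    this.events.push({ type: "document.delete-approved", docId, actorId: approverId, payload: {} });
  }

  reject(docId: string, actorId: string, note?: string): void {
    const row = this.active(docId);
    if (!row) throw new Error("no active request");
    row.status = "rejected";
    this.events.push({
      type: "document.delete-rejected",
      docId,
      actorId,
      payload: { note: note ?? null, cancelledByRequester: actorId === row.requesterId },
    });
  }
}
```

Decided rows stay in `rows`, so a fresh request after a rejection coexists with the earlier decision, mirroring the audit-evidence behavior described above.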
v0.4.6Bulk revoke share-links — clean up everything from a closed matter in one click instead of revoking N links one at a time- Shipped
Checkbox + Revoke selected on /share-links
Each active share-link gets a checkbox; expired / revoked / exhausted rows don't (revoking those is a no-op anyway). Select-all-active button in the header for the common "everything still live for this matter" pattern. Hard cap of 200 ids per submission. Per-row failures (already-revoked, not-found) are counted but don't halt the batch — confirmation banner reports "Revoked N · M already revoked · K not found".
- Shipped
Each successful revoke emits the same share-link.revoked event
No new event type; the audit chain treats a bulk revoke identically to N individual revokes. Each event carries `bulk: true` in its payload so audit-log filtering can distinguish bulk operations if needed without breaking existing consumers of share-link.revoked.
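A sketch of the batch loop, assuming an in-memory map of link statuses. The `bulkRevoke` helper, its return shape, and the stream naming are hypothetical; the 200-id cap, the non-halting per-row failures, the banner wording, and `bulk: true` come from the notes above.

```typescript
// Hypothetical sketch of the bulk revoke loop.
type LinkStatus = "active" | "revoked";

interface RevokeEvent {
  type: "share-link.revoked";
  stream: string;
  payload: { bulk: true };
}

const MAX_IDS = 200;

function bulkRevoke(links: Map<string, LinkStatus>, ids: string[]) {
  if (ids.length > MAX_IDS) throw new Error(`at most ${MAX_IDS} ids per submission`);

  let alreadyRevoked = 0;
  let notFound = 0;
  const events: RevokeEvent[] = [];

  for (const id of ids) {
    const status = links.get(id);
    // Per-row failures are counted but never halt the batch.
    if (status === undefined) { notFound += 1; continue; }
    if (status === "revoked") { alreadyRevoked += 1; continue; }

    links.set(id, "revoked");
    // Same event type as a single revoke, flagged as part of a bulk operation.
    events.push({ type: "share-link.revoked", stream: `share-link/${id}`, payload: { bulk: true } });
  }

  const banner = `Revoked ${events.length} · ${alreadyRevoked} already revoked · ${notFound} not found`;
  return { events, alreadyRevoked, notFound, banner };
}
```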
v0.4.5/audit log gets full coverage — 30+ recently-added event types now in the filter dropdown plus free-text streamId search for jumping into one matter or share-link's history- Shipped
Audit filter chips updated to cover every event type the system emits
Five new event-type groups added to /audit: Share links (created / accessed / revoked), Productions (recorded), Citation + saved-search alerts (10 lifecycle types), AEC drawing-set integrity, Compliance + audit (evidence packets, audit-chain verification, retention class lifecycle), API keys + digests. Documents group gained the redaction-added / removed types. Permissions & legal group now includes the four legal-hold custodian event types from D158. Annotations gained resolved + reopened. Tenancy gained tenant.settings-updated. Closes the rough edge where many events were appearing in the audit log but were invisible to the type-filter dropdown.
- Shipped
Free-text Stream filter — substring match on streamId
New "Stream" input alongside Actor / From / To. Paste a stream prefix (e.g. share-link/, legal-hold/<id>) or any partial id and the audit log filters to events on streams matching that substring. Fast because streamId already has a B-tree index. Standard query: paste a matter-id from /legal-holds/[id] into Stream and see every event on that hold's audit chain in chronological order.
- Shipped
Stream + types params honored on CSV export and pagination
The CSV export route + the Load older pagination link both pass the stream filter through alongside actor / dates. An auditor exporting "every event on share-link/<id>" gets exactly that subset; pagination through filtered results doesn't lose the filter on Load older.
v0.4.4Per-share-link access cap — "expire after first download" or "max N opens" defense-in-depth on top of TTL + revoke- Shipped
New "Cap total accesses to N opens" toggle on the Share via link form
Default unlimited. When set (1–1000), the share-link auto-stops serving once accessCount reaches the cap — both /share/[token] view + the underlying download routes return 404. Stacks with the expiry-date TTL + manual revoke; whichever fires first wins. The standard "expire after first download" (cap=1) is a single click; longer caps (e.g. 5 opens for a 5-recipient distribution) work the same way.
- Shipped
Status surfaces on /share-links: "exhausted" badge + remaining-opens counter
The share-links list shows the access counter as "<used> / <cap>" when capped. New "exhausted" status (rendered amber alongside expired) when the access cap is hit but the link hasn't been revoked. Helps operators triage "did the recipient burn through their opens already?" without grep-ing the audit log.
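The "whichever fires first wins" rule and the list-page counter can be sketched as below. The field names mirror the columns mentioned in these notes where possible; the precedence between expired and exhausted when both apply is an assumption, as are the helper names.

```typescript
// Hypothetical sketch of share-link status derivation.
interface ShareLink {
  revokedAt: Date | null;
  expiresAt: Date | null;
  accessCount: number;
  maxAccessCount: number | null; // null = unlimited
}

type ShareLinkStatus = "active" | "revoked" | "expired" | "exhausted";

function shareLinkStatus(link: ShareLink, now: Date): ShareLinkStatus {
  if (link.revokedAt !== null) return "revoked";
  if (link.expiresAt !== null && now.getTime() >= link.expiresAt.getTime()) return "expired";
  // Cap reached but not revoked: the link stops serving.
  if (link.maxAccessCount !== null && link.accessCount >= link.maxAccessCount) return "exhausted";
  return "active";
}

// A link serves content only while no stop condition has fired.
function servesContent(link: ShareLink, now: Date): boolean {
  return shareLinkStatus(link, now) === "active";
}

// "<used> / <cap>" when capped; just the used count otherwise.
function counterLabel(link: ShareLink): string {
  return link.maxAccessCount === null
    ? `${link.accessCount}`
    : `${link.accessCount} / ${link.maxAccessCount}`;
}
```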
- Shipped
Cap captured on the share-link.created event payload
The created event now records maxAccessCount + notifyOnAccess in its payload so the audit chain captures the security posture at creation, not just retroactively. An auditor reviewing "was this link single-shot or unlimited?" reads it from the immutable event log, not the (mutable) row.
v0.4.3AEC schedule-risk section in the activity digest — overdue + due-soon RFIs and submittals show up in your daily / weekly email so you stop being surprised by the GC- Shipped
New "AEC schedule risk" section on the digest email
When the daily / weekly activity digest fires (D150), Kodori now adds an AEC schedule-risk block listing per-status counts: RFIs past their answer-by date, submittals past their required-by date, RFIs due in the next 7 days, submittals due in the next 7 days. Each row deep-links to the matching tracker page so the operator one-clicks into the actionable list. Hidden when zero — operators on workspaces that don't use the AEC trackers don't see an empty section.
- Shipped
Tenant-scoped count, not per-user
Closes the "the GC noticed an overdue submittal before we did" embarrassment. Counts at the tenant level so every team member sees the same number — the trackers themselves are already permission-trimmed at the document level, so what shows up in the digest is the load-bearing "your project is X items behind" signal.
v0.4.2Share-link access notifications — get an email when opposing counsel opens your production set, throttled to one notification per 4-hour window- Shipped
Per-link "Email me when accessed" toggle on the share-link form
New checkbox alongside Label / Recipient / Expiry on the Share via link form (default ON). Operators get a chain-of-custody email the moment opposing counsel opens the URL — useful for FRCP timeliness arguments and the everyday "did they actually receive this?" question. Operators who don't want noise flip the toggle off; the link still works, just silently.
- Shipped
New share-link-access-notifier Inngest dispatcher subscribed to event/appended
When a share-link.accessed event lands on the audit log, the dispatcher loads the source share-link row, checks notifyOnAccess + last-notified throttle, resolves the creator email + workspace name, and emails them via the new sendShareLinkAccessEmail Resend helper. Throttle: one notification per (link, 4-hour window) so a recipient hitting the URL 50 times in an afternoon doesn't generate 50 emails. The audit log still records every access via share-link.accessed regardless — the throttle is purely on email volume.
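The throttle check reduces to a comparison against the last-notified timestamp. The sketch below assumes the window is measured as elapsed time since the last email (rather than fixed clock-aligned windows); the helper names and state shape are hypothetical.

```typescript
// Hypothetical sketch of the per-link email throttle.
const THROTTLE_MS = 4 * 60 * 60 * 1000; // 4-hour window

interface LinkNotifyState {
  notifyOnAccess: boolean;
  lastNotifiedAt: Date | null;
}

function shouldNotify(state: LinkNotifyState, now: Date): boolean {
  if (!state.notifyOnAccess) return false;
  if (state.lastNotifiedAt === null) return true;
  return now.getTime() - state.lastNotifiedAt.getTime() >= THROTTLE_MS;
}

// Called once per share-link.accessed event. The audit event itself is
// always recorded elsewhere; only the email is throttled.
function recordAccess(
  state: LinkNotifyState,
  now: Date,
): { state: LinkNotifyState; emailSent: boolean } {
  if (!shouldNotify(state, now)) return { state, emailSent: false };
  return { state: { ...state, lastNotifiedAt: now }, emailSent: true };
}
```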
- Shipped
Migration 0059_share_link_notify_access.sql — additive, two new nullable columns
Adds notify_on_access boolean (default true) and last_notified_at timestamptz. Existing share-links default to notifications-on retroactively, which matches the dominant operator intent (the chain-of-custody read is what most legal users want). Operators who created prior links and don't want notifications can revoke + re-create with the toggle off.
v0.4.1Litigation hold notice emails — name custodians, send signed acknowledgment URLs, get an audit-logged "I have received this" click back- Shipped
Custodian roster on /legal-holds/[id]
A custodian is the named person responsible for preserving documents within a hold's scope. Closes the gap that custodians are typically NOT Kodori workspace members (they're internal employees with no DMS account, external counsel, or vendors). The matter owner pastes a list of emails on /legal-holds/[id] (same paste-many shape as the bulk-invite form), Kodori dedupes against existing custodians on the hold, and persists each new row with the hold + tenant FK.
- Shipped
Send (and re-send) hold notice emails
A "Send hold notices" button on the custodian table emails every custodian on the hold a litigation-hold notice with: matter ref, the workspace name, the verbatim hold scope from `description`, and a single-use acknowledgment URL. Re-send rotates each custodian's ack token (so a leaked URL from a prior send becomes inert) and clears any prior acknowledgment so the recipient is asked to ack the latest version of the scope.
- Shipped
Public acknowledgment page at /legal-hold-ack/[token]
Same token-as-auth pattern as share links. The page renders the matter ref, scope, workspace name, and a single "I acknowledge this notice" button. Clicking stamps acknowledgedAt and appends `legal-hold.notice-acknowledged` to the matter audit stream with actorId=`public-hold-ack:<prefix>` so the chain captures external acks distinguishably from authenticated workspace activity. Idempotent — repeat clicks after the first don't double-stamp.
- Shipped
Per-custodian status surfaced on the hold page
The custodian table on /legal-holds/[id] shows for each row: notice-sent date, ack status (Acknowledged YYYY-MM-DD / Pending / Not sent), and a Remove button. The matter owner can see at a glance who's acknowledged and who hasn't — the standard "are we covered?" question for a litigation hold rollout.
- Shipped
Four new event types — full lifecycle on the audit log
legal-hold.custodian-added, legal-hold.custodian-removed, legal-hold.notice-sent, legal-hold.notice-acknowledged. The hash-chained audit log captures every state transition. New migration 0058_legal_hold_custodians.sql + journal entry. The `legal-hold.applied` and `legal-hold.released` events from D101 stay unchanged — these four are additive lifecycle events.
v0.4.0Confidentiality watermark on share-link PDF downloads — every external download now carries the workspace name, token, and access date burned in- Shipped
PDFs served through share links get a per-page CONFIDENTIAL watermark
When a recipient opens a share link to a PDF, Kodori now stamps every page on-the-fly with three layers: a workspace-name header bar, a diagonal "CONFIDENTIAL" stamp at 18% opacity ochre across the page center, and a footer reading "Confidential · Shared via Kodori on YYYY-MM-DD · Token <prefix>". The stamp is on-the-fly because the original blob is content-addressable (SHA-256 of the bytes IS the version) — forking a watermarked-per-link version would either bloat storage or break the address. Stamping at request time costs ~15-30ms per page via pdf-lib and is bounded by the share link's own access TTL.
- Shipped
Production-kind share-links exempt — verbatim Bates bytes preserved
Document and collection share-links get watermarked. Production-kind links DO NOT — those serve the verbatim Bates-stamped bytes captured at production time, which must match the privilege log byte-for-byte for ediscovery integrity. Adding a Kodori watermark to a produced PDF would break the chain-of-custody claim opposing counsel relies on. Each kind has the right behavior for its purpose.
- Shipped
Non-PDF mimes pass through untouched
Watermarking only fires for application/pdf. DOCX, images, ZIPs, and everything else stream through bytes-unchanged. Defensive try/catch around pdf-lib load so a corrupted PDF doesn't fail the download — falls back to original bytes.
v0.3.9Bulk member invite — paste a CSV / newline-delimited list of emails and onboard a whole team in one click- Shipped
Paste-many invite UI on /members
New "Invite many at once" panel sits beside the existing single-invite form. Drop a textarea of emails — comma, semicolon, newline, or any whitespace separated — pick the role to apply to all of them, hit Send. Single role per batch (mixed roles still go through the one-at-a-time form). Surfaces a banner afterwards with sent / duplicate / invalid / seat-capped / send-failed counts so you can tell at a glance whether you need to top up seats or fix a typo.
- Shipped
Robust dedup, validation, and seat-cap enforcement
Pre-loads existing members + still-pending invites in a single query so duplicates fall through silently with a count instead of a hard error. Email-shape validation runs per-token (anything malformed is bucketed as `invalid` rather than blocking the batch). Per-email seat-cap check against billing/enforce.ts — once you hit the cap, remaining emails fall into the seat-capped bucket. Race protection via try/catch on the unique-constraint violation if two admins paste overlapping batches simultaneously. Hard cap of 100 emails per submission.
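A sketch of the tokenize-and-bucket pass, assuming case-insensitive dedup and a deliberately loose email-shape regex. The `bucketInvites` helper and its check order are hypothetical; the separators, bucket names, and 100-email cap come from the notes above. The send-failed bucket and the unique-constraint race handling are left out because they depend on the mail and database layers.

```typescript
// Hypothetical sketch of bulk-invite bucketing.
interface InviteBuckets {
  sent: string[];
  duplicate: string[];
  invalid: string[];
  seatCapped: string[];
}

const MAX_EMAILS = 100;
const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // loose shape check, an assumption

function bucketInvites(raw: string, existing: string[], seatsLeft: number): InviteBuckets {
  // Comma, semicolon, newline, or any whitespace separates tokens.
  const tokens = raw.split(/[\s,;]+/).filter((t) => t.length > 0);
  if (tokens.length > MAX_EMAILS) throw new Error(`at most ${MAX_EMAILS} emails per submission`);

  // Existing members and pending invites, loaded up front.
  const seen = new Set(existing.map((e) => e.toLowerCase()));
  const out: InviteBuckets = { sent: [], duplicate: [], invalid: [], seatCapped: [] };
  let seats = seatsLeft;

  for (const token of tokens) {
    const email = token.toLowerCase();
    if (!EMAIL_SHAPE.test(email)) out.invalid.push(token);
    else if (seen.has(email)) out.duplicate.push(email);
    else if (seats <= 0) out.seatCapped.push(email);
    else {
      seen.add(email); // also catches repeats later in the same batch
      seats -= 1;
      out.sent.push(email);
    }
  }
  return out;
}
```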
- Shipped
Same audit trail as single invites
Each successful invite appends the same `permission.granted` event the single-invite path does, so the audit chain treats a bulk batch identically to ten individual invites. No new event type was added — the invariant that every consequential mutation is one append still holds.
v0.3.8API key scope editor — widen / narrow permissions on existing keys without revoke + reissue- Shipped
New scope editor on /api-keys/[id]/usage
Previously API key scopes could only be set at creation — admins who wanted to widen or narrow an existing key had to revoke and re-mint it (and re-distribute the new secret to every consumer). Now four checkboxes (search:read always-on; documents:write / documents:delete / collections:write toggle independently) update the scope set in place. Same Bearer token keeps working with the new permission set on the next call.
- Shipped
Audit-logged via api-key.scopes-changed
Every change emits api-key.scopes-changed on the tenant audit stream with previous + new scope lists (sorted for stable diffs). Idempotent no-op when the requested scope set matches the existing one — accidental Save clicks don't pollute the chain. Same audit pattern as api-key.rate-limit-set from D149.
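The idempotent no-op falls out of comparing sorted scope lists. In this sketch the `changeScopes` helper and the event payload field names are assumptions; the scope names, the always-on search:read, and the sorted previous + new lists come from the notes above.

```typescript
// Hypothetical sketch of the scope update with an idempotent no-op.
const ALWAYS_ON = "search:read";
const TOGGLEABLE = ["documents:write", "documents:delete", "collections:write"];

interface ScopesChangedEvent {
  type: "api-key.scopes-changed";
  previous: string[];
  next: string[];
}

// Keep only known scopes, force the always-on scope, sort for stable diffs.
function normalizeScopes(scopes: string[]): string[] {
  const set = new Set(scopes.filter((s) => TOGGLEABLE.includes(s)));
  set.add(ALWAYS_ON);
  return Array.from(set).sort();
}

function changeScopes(
  current: string[],
  requested: string[],
): { scopes: string[]; event: ScopesChangedEvent | null } {
  const previous = normalizeScopes(current);
  const next = normalizeScopes(requested);

  // Same set: no event, so accidental Save clicks leave the chain untouched.
  if (previous.join("|") === next.join("|")) return { scopes: previous, event: null };

  return { scopes: next, event: { type: "api-key.scopes-changed", previous, next } };
}
```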
v0.3.7Saved-search alerts + GA funnel events — subscribe a saved search, get email when new docs match- Shipped
Saved-search alerts at /search/alerts
Mirror of D153 citation alerts but for the saved-search primitive. Subscribe an alert on any of your saved searches; when a newly-extracted doc matches the search's keyword query, Kodori sends an email with a windowed excerpt around the FTS hit. Pause / resume / delete affordances per alert.
- Shipped
Inngest fan-out on document.content-extracted
New saved-search-alerts-dispatcher subscribes to event/appended, filters to document.content-extracted, runs a focused FTS check per active alert via websearch_to_tsquery against document_content.text. Single indexed query per alert; cost scales with alert count not corpus size. Permission-trimmed via canReadDocument against the alert's creator.
- Shipped
Postgres ts_headline excerpt in the email body
The email surfaces a windowed extract from the matched paragraph (~24 words around the FTS hit) so the recipient sees the matching context without opening the doc. Markup-stripped; capped at 240 chars.
- Shipped
GA event instrumentation: sign_in / document_uploaded / agent_question_asked / evidence_packet_generated
Closes the analytics feedback loop wired in v0.3.x. SessionTracker fires sign_in once per browser session via sessionStorage flag; identifyUser sets a hashed user_id (HMAC under AUTH_SECRET) for cross-session stitching without leaking raw UUIDs to GA reports. document_uploaded fires from upload-dropzone after registerUploadedAction succeeds. agent_question_asked fires from agent-chat form submit before the AI SDK delegate. evidence_packet_generated fires from evidence-export-client after PDF download completes.
v0.3.6Citation alerts — subscribe a citation, get an email every time a new doc in the workspace cites it- Shipped
New /citations/alerts management page
Operator types a citation, optional kind filter (case / statute / regulation / etc.), and an email recipient (defaults to your own address; override for paralegal team mailbox). Pause / resume affordance for noisy alerts during heavy discovery cycles. Shows fire count and last-fired date per alert so operators see which subscriptions have actually triggered.
- Shipped
Inngest fan-out on citations.extracted events
New citation-alerts-dispatcher subscribes to event/appended, filters to citations.extracted, loads the doc's citations + active alerts, substring-matches normalized forms, fires emails. Permission-trimmed: each alert's creator must still have read access on the source doc — prevents the alert from being a side-channel for restricted-doc disclosure.
- Shipped
Substring matching on normalized form
A subscription to "347 U.S. 483" also fires for fuller references that contain it, such as "Brown v. Board, 347 U.S. 483, 495 (1954)" with its pin cite and parenthetical, which is the most useful default. Optional kind filter narrows to one of the seven citation kinds when the operator wants only case-cite matches.
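A sketch of substring matching over normalized citation text. The normalization used here (lowercase, collapsed whitespace) is an assumption, since the notes do not spell out the normalized form, and the `Citation` / `Alert` shapes are hypothetical.

```typescript
// Hypothetical sketch: the real normalized form may differ.
interface Citation {
  text: string;
  kind: string; // e.g. "case", "statute", "regulation"
}

interface Alert {
  query: string;
  kind?: string; // optional kind filter
}

function normalizeCitation(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

// Returns the citations in a doc that would fire the alert.
function matchingCitations(alert: Alert, citations: Citation[]): Citation[] {
  const needle = normalizeCitation(alert.query);
  return citations.filter((c) => {
    if (alert.kind !== undefined && c.kind !== alert.kind) return false;
    return normalizeCitation(c.text).indexOf(needle) !== -1;
  });
}
```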
- Shipped
Audit-logged via citation-alert.created/.removed/.paused/.resumed/.fired
Five new event types on the tenant audit stream record the alert lifecycle. Each fire includes the matched-citation count + Resend message id. Idempotent re-subscribe (same tenant + normalized + email already active) updates the verbatim query rather than creating a duplicate row.
- Shipped
"Manage alerts →" link on /citations
Header on the tenant-wide citation search page surfaces the alerts management page. Closes the discoverability loop: operators search for a citation, decide they want to track it, click through to subscribe.
v0.3.5Workspace settings page — owner / admin defaults for rate limit, retention, evidence-packet intro, digest reply-to- Shipped
New /settings/tenant page
Single owner / admin surface for four operational defaults that previously punted to "contact support". Closes revisit triggers from D147 (per-tenant default API rate limit), D101 (default retention class auto-applied to uploads), D145 (evidence-packet cover intro), and D150 (digest reply-to override).
- Shipped
Migration 0055 + 4 nullable columns on tenants
default_requests_per_minute, default_retention_class_id (FK to retention_classes ON DELETE SET NULL), evidence_packet_intro (capped at 1500 chars by DB CHECK), digest_reply_to (email-shape CHECK at the DB layer). All nullable — null falls back to global defaults so existing tenants need no backfill.
- Shipped
Resolution chain: per-key → per-tenant → global default
enforceRateLimit now accepts an optional tenantDefault arg. verifyApiKey LEFT JOINs tenants in the auth hot path so the tenant default rides along with the key — no extra round-trip. Per-key override still takes precedence; tenant default is the fallback; global DEFAULT_RPM is the floor.
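The resolution chain is a two-step nullish fallback. A minimal sketch, assuming the global DEFAULT_RPM is the 600 rpm figure quoted for v0.3.0; the `resolveRpm` helper and its `source` field are hypothetical.

```typescript
// Hypothetical sketch of per-key -> per-tenant -> global resolution.
const DEFAULT_RPM = 600; // assumed value of the global default

type RpmSource = "key" | "tenant" | "global";

function resolveRpm(
  keyRpm: number | null,
  tenantDefault: number | null,
): { rpm: number; source: RpmSource } {
  if (keyRpm !== null) return { rpm: keyRpm, source: "key" };
  if (tenantDefault !== null) return { rpm: tenantDefault, source: "tenant" };
  return { rpm: DEFAULT_RPM, source: "global" };
}
```

Returning the source alongside the value is a design choice for the sketch: it lets a UI chip or audit payload say which level supplied the effective cap.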
- Shipped
Evidence-packet cover renders the intro paragraph
When set, the per-tenant evidence_packet_intro renders between the window header and the stats block on the cover page. pdf-lib soft-wraps to fit page width; 1500-char cap prevents overflow.
- Shipped
Digest emails honor reply-to override
sendDigestEmail accepts an optional replyTo. When set, the per-user digest sets the Reply-To header so replies route to the firm's inbound triage inbox instead of bouncing back to hello@kodori.ai.
- Shipped
Audit-logged via tenant.settings-updated
Each save emits one event with a delta object listing which of the four fields changed (the intro value is truncated). Idempotent no-op when nothing changed, so accidental Save clicks don't pollute the chain.
v0.3.4AEC drawing-set integrity check — operator-defined expected sheet ranges + missing/unexpected diff- Shipped
New project_drawing_ranges schema (migration 0054)
Per-project, per-discipline expected sheet ranges with start_major / end_major / optional label. DB CHECK constraints enforce positive majors, end ≥ start, end ≤ 9999. Application-layer cap of 2000 sheets per range so a single accidental entry can't blow out to 10K expected sheets.
- Shipped
Inline integrity-check section on /collections/[id]/drawing-register
Operator types discipline + start + end + optional label, then clicks Add range. The page now surfaces, per range: expected count / found count / missing-major list, plus a top-of-section completeness chip color-coded green at 100%, amber at ≥90%, red below. Range coverage is over major numbers only: "M-1 to M-99" considers M3.05 in-range because its major (3) is in [1, 99].
- Shipped
Unexpected-sheet warning panel
Sheets found in indexed docs but outside every defined range surface in an amber warning panel with the "extend a range or accept as out-of-scope" copy. Computed only when at least one range is defined (otherwise everything is "unexpected" which is noise).
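The range check and the unexpected-sheet panel can be sketched together. The sheet-number parser below (letters, optional separator, then the digits before any dot as the major) is an assumption about how majors are extracted, and the helper names are hypothetical; the chip thresholds and the "only when at least one range is defined" rule come from the notes above.

```typescript
// Hypothetical sketch of the drawing-set integrity check.
interface DrawingRange {
  discipline: string;
  startMajor: number;
  endMajor: number;
}

// "A-201" -> A / 201, "S101" -> S / 101, "M3.05" -> M / 3.
function parseSheet(sheet: string): { discipline: string; major: number } | null {
  const m = /^([A-Za-z]+)[-\s]?(\d+)/.exec(sheet.trim());
  if (!m) return null;
  return { discipline: m[1].toUpperCase(), major: parseInt(m[2], 10) };
}

function inRange(range: DrawingRange, sheet: string): boolean {
  const p = parseSheet(sheet);
  return (
    p !== null &&
    p.discipline === range.discipline.toUpperCase() &&
    p.major >= range.startMajor &&
    p.major <= range.endMajor
  );
}

function checkRange(range: DrawingRange, sheets: string[]) {
  const found = new Set<number>();
  for (const s of sheets) {
    const p = parseSheet(s);
    if (p !== null && inRange(range, s)) found.add(p.major);
  }
  const missing: number[] = [];
  for (let n = range.startMajor; n <= range.endMajor; n++) {
    if (!found.has(n)) missing.push(n);
  }
  const expected = range.endMajor - range.startMajor + 1;
  // Integer comparison avoids floating-point edge cases at exactly 90%.
  const chip =
    found.size === expected ? "green" : found.size * 10 >= expected * 9 ? "amber" : "red";
  return { expected, found: found.size, missing, chip };
}

// Computed only when at least one range exists; otherwise everything
// would be "unexpected", which is noise.
function unexpectedSheets(ranges: DrawingRange[], sheets: string[]): string[] {
  if (ranges.length === 0) return [];
  return sheets.filter((s) => parseSheet(s) !== null && !ranges.some((r) => inRange(r, s)));
}
```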
- Shipped
Audit-logged via project-drawing-range.added / .removed
Each range edit lands on the collection's audit stream so admins reviewing the chain can see who added / removed ranges when. Closes a D143 revisit trigger — the third leg of the AEC drawing extraction triad alongside per-doc panel + per-project rollup.
v0.3.3Daily / weekly activity digest emails — opt-in cadence, permission-trimmed sections, audit-logged sends- Shipped
New users.digestFrequency preference
Migration 0053 adds a digest_frequency enum (off / daily / weekly, default off) and digest_last_sent_at column on users. Self-service preference — users opt in via /settings/account → Activity digest. Default off means no surprise emails to existing users on first deploy.
- Shipped
Hourly Inngest cron + per-user send function
digests-tick fires hourly at top of hour, selects users due (daily: 23h elapsed; weekly: 6.5d elapsed AND today is Monday hour=8 UTC), and fans out digest/send.requested events. digest-send consumes one event per user, queries permission-trimmed sections (open @mentions / new docs / new legal holds / retention review depth for admins), renders the email, and updates digest_last_sent_at on success. Failures don't bump lastSent — next tick retries.
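The due-selection rule evaluated on each hourly tick can be sketched as a predicate. Treating a user who has never received a digest as due is an assumption, and the `isDigestDue` name is hypothetical; the 23h and 6.5d thresholds and the Monday 08:00 UTC gate come from the note above.

```typescript
// Hypothetical sketch of the per-user "is a digest due?" check.
type DigestFrequency = "off" | "daily" | "weekly";

const HOUR_MS = 60 * 60 * 1000;

function isDigestDue(freq: DigestFrequency, lastSentAt: Date | null, now: Date): boolean {
  if (freq === "off") return false;

  // Never sent before: treated as due (an assumption).
  const elapsed = lastSentAt === null ? Infinity : now.getTime() - lastSentAt.getTime();

  // 23h rather than 24h leaves slack so an hourly tick does not skip a day.
  if (freq === "daily") return elapsed >= 23 * HOUR_MS;

  // Weekly: only on the Monday 08:00 UTC tick, and at least 6.5 days since the last send.
  const mondayEightUtc = now.getUTCDay() === 1 && now.getUTCHours() === 8;
  return mondayEightUtc && elapsed >= 6.5 * 24 * HOUR_MS;
}
```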
- Shipped
Permission-trimmed digest content
Every section runs through canReadDocument before rendering. A doc you can't read never appears in your digest, even if you were @mentioned in it before being walled off. The /mentions inbox surface (D148) and the digest read from the same canonical sources — what you see in the email matches what you see in the app.
- Shipped
RFC 8058 one-click unsubscribe
Each digest carries a List-Unsubscribe header + in-body link. Clicking flips digest_frequency to off via a separate signing namespace (digest-unsubscribe) so an onboarding-tip unsub link can't be replayed against digest. The /unsubscribe page now handles both kinds via ?kind=digest. Transactional emails (invites, billing, audit-chain alerts, @mention notifications) are unaffected.
- Shipped
Audit-logged: digest.sent / digest.failed / digest.frequency-changed
Every send (success or failure) emits an event on the tenant stream with the user, cadence, window, sectionsCount, and (for failures) the reason. Cadence changes from /settings/account also emit digest.frequency-changed with previous + new values. Admins reviewing the chain can see who opted in, who left, what was sent when.
v0.3.2Per-API-key rate-limit editor — set or clear the rpm cap from /api-keys/[id]/usage- Shipped
Editor section on /api-keys/[id]/usage
D147 added the requests_per_minute schema column but no UI to set it — operators could only request a custom cap via support. Now there's a one-form editor at the top of the per-key usage page: number input + Save button + "Reset to tenant default" link. Current effective cap surfaces as a chip alongside the heading. Owner / admin only (same gate as the rest of the page).
- Shipped
Audit-logged via api-key.rate-limit-set
Each cap change emits api-key.rate-limit-set on the tenant stream with previousRequestsPerMinute, newRequestsPerMinute, and the effective cap (the resolved value after applying the tenant default for null). Operators reviewing the audit log can answer "who changed this key's cap and when" without leaving Kodori.
- Shipped
Hard cap of 60,000 rpm (1000 rps)
Rejects fat-finger entries that would effectively disable rate limiting. Customers needing higher throughput contact support — same flow as before, but now requires a deliberate intervention rather than a misclick.
v0.3.1/mentions inbox — persistent surface for every annotation thread where you're @mentioned- Shipped
New /mentions route
Closes the D136 loop: @mentions previously fired one Resend email and that was the only surface — missed/deleted emails meant missed work. The inbox is now canonical. Lists every annotation in the tenant where you appear in mentioned_user_ids, newest-first, with author + doc name + body excerpt + thread state.
- Shipped
Open / Resolved filter chips
Default view shows only OPEN threads (root annotation's resolvedAt is null). For replies that mention you, the panel reads the root annotation's state — replies inherit thread state per the D136 invariant. Switch to Resolved to see closed threads where you participated.
- Shipped
Permission-trimmed via canReadDocument
A doc you've since been walled off from drops out of the inbox even if a mention exists. Backed by the gin jsonb_path_ops index on annotations.mentioned_user_ids from migration 0049 — fast containment lookup at any tenant scale.
- Shipped
Sidebar + mobile nav surface "Mentions"
Sits between Trash and Agent activity in the sidebar. Mobile nav uses "@me" as the short label.
v0.3.0Per-API-key rate limiting with proper 429 envelope + standard X-RateLimit-* headers- Shipped
New requests_per_minute cap on every API key
Migration 0052 adds an OPTIONAL requests_per_minute column on api_keys (NULL = use the tenant default of 600 rpm = 10 rps) and a new api_key_request_counters table tracking calls in the current 1-minute window. Single row per key — when the minute_bucket changes, the count resets to 1 in the same UPSERT. Atomic via Postgres row-level locking on ON CONFLICT DO UPDATE; no race conditions across concurrent requests.
- Shipped
429 envelope with Retry-After and X-RateLimit-* headers
Every v1 + /api/mcp request now passes through enforceRateLimit. When the cap is hit, Kodori returns 429 with Retry-After: <seconds-until-bucket-rolls> and the standard X-RateLimit-Limit / X-RateLimit-Remaining / X-RateLimit-Reset headers — the same shape Stripe and GitHub return. Successful responses ALSO carry X-RateLimit-* so integrators always see their remaining quota.
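A sketch of how the response headers could be derived from the counter state. `rateLimitHeaders` and its input shape are hypothetical, and the `X-RateLimit-Reset` format (epoch seconds, as GitHub uses) is an assumption since the release note does not specify it.

```typescript
interface RateLimitState {
  limit: number; // requests allowed per minute
  count: number; // requests counted in the current bucket, including this one
  nowMs: number; // current time in epoch milliseconds
}

function rateLimitHeaders(s: RateLimitState): Record<string, string> {
  // The bucket rolls at the next whole minute.
  const bucketEndMs = (Math.floor(s.nowMs / 60000) + 1) * 60000;
  const secondsUntilRoll = Math.max(1, Math.ceil((bucketEndMs - s.nowMs) / 1000));
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(s.limit),
    "X-RateLimit-Remaining": String(Math.max(0, s.limit - s.count)),
    // Assumption: reset is reported as epoch seconds.
    "X-RateLimit-Reset": String(Math.floor(bucketEndMs / 1000)),
  };
  // Retry-After only accompanies a 429.
  if (s.count > s.limit) {
    headers["Retry-After"] = String(secondsUntilRoll);
  }
  return headers;
}
```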
- Shipped
Cross-runtime FNV-1a hash replaces node:crypto in extractors
D140/D143 used createHash from node:crypto for citation + drawing fingerprints. That broke client bundling once the core barrel transitively reached a client component. New @kumokodo/core/hash exports a pure-JS FNV-1a 64-bit hash that works in Node, Edge, and the browser. Existing rows from D140/D143 keep their SHA-1 fingerprints; new extractions use the FNV-1a form. Re-running with refresh: true rebuilds the index against the new hash.
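For reference, a pure-JS FNV-1a 64-bit hash fits in a few lines using BigInt. This is a sketch of the standard algorithm, not the actual @kumokodo/core/hash export; the function name and the hex output format are assumptions.

```typescript
// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET_BASIS = BigInt("0xcbf29ce484222325");
const FNV_PRIME = BigInt("0x100000001b3");
const MASK_64 = BigInt("0xffffffffffffffff");

// Hashes the UTF-8 bytes of the input and returns 16 lowercase hex characters.
function fnv1a64(input: string): string {
  const bytes = new TextEncoder().encode(input);
  let hash = FNV_OFFSET_BASIS;
  for (let i = 0; i < bytes.length; i++) {
    hash ^= BigInt(bytes[i]);            // XOR the byte in first (the "1a" order)
    hash = (hash * FNV_PRIME) & MASK_64; // then multiply, wrapping at 64 bits
  }
  return hash.toString(16).padStart(16, "0");
}
```

FNV-1a is not cryptographic; it suits dedup fingerprints like these but not integrity checks.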
v0.2.9Tenant-wide citation + drawing search — "every doc citing 347 U.S. 483" or "every doc with sheet A-201" in one query- Shipped
New /citations route — tenant-wide legal-citation search
Type a citation (case name, U.S.C. section, docket number, partial form) and Kodori returns every readable doc citing it across the tenant, ranked by total occurrences with per-doc breakdown. Backed by the existing (tenantId, normalized) index from D140. Empty state surfaces the top-25 most-cited as a starting view. Permission-trimmed.
- Shipped
New /drawings route — tenant-wide AEC sheet search
Sister surface for AEC: type a sheet number ("A-201", "S101", "M3.05") and see every readable doc that references it across every project. Discipline filter chips for trade-specific drilldown. Same shape as /citations but normalized for sheet numbers (dashes/dots/spaces stripped before lookup so "A-201" and "A201" find the same row).
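The normalization step can be sketched as a one-line strip before lookup. `normalizeSheetNumber` and `buildSheetIndex` are hypothetical names, and the in-memory index stands in for the real table.

```typescript
// Strip dashes, dots, and whitespace so "A-201" and "A201" share a key.
function normalizeSheetNumber(raw: string): string {
  return raw.replace(/[-.\s]/g, "");
}

// Hypothetical in-memory index: normalized sheet number -> document ids.
function buildSheetIndex(rows: { sheet: string; documentId: string }[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const row of rows) {
    const key = normalizeSheetNumber(row.sheet);
    const docs = index.get(key) ?? [];
    docs.push(row.documentId);
    index.set(key, docs);
  }
  return index;
}
```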
- Shipped
Sidebar + mobile nav surface "Citations" + "Drawings"
Citations between Privilege log and Bates stamp (legal-vertical neighborhood). Drawings between Spec sections and Projects (AEC daily-action neighborhood). Closes the per-doc → per-collection → tenant-wide drilldown trio for both extraction systems.
v0.2.8Compliance evidence packet — one-click PDF for SOC 2 / FRCP audit visits, hash-stamped and audit-logged at generation- Shipped
New /compliance/evidence-export generator
Owner / admin only. Pick a date window + label + section toggles, click Generate, browser downloads a single PDF with: cover page (tenant + plan + window + live/tombstoned/held doc counts + hold count), live audit-chain integrity verification (PASS/FAIL with first-mismatch detail when broken), legal holds list, retention classes, member roster with role-distribution summary, top-40 event types in window with share-of-traffic percentages, and an optional 200-row recent-events table for FRCP discovery exhibits.
- Shipped
Hash-stamped + audit-logged at generation
Each generated packet emits compliance.evidence-packet-generated to the tenant stream with the packet label, window, sections included, byte size, SHA-256 hash of the PDF bytes, and the live audit-chain verify result (ok yes/no). The X-Kodori-Packet-Hash response header surfaces the hash so the operator can copy it into their audit-prep notes. An auditor reviewing the chain can later prove the bytes you produce match the hash recorded at generation.
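An auditor-side check can be sketched as follows, assuming the recorded hash is a hex-encoded SHA-256 of the raw PDF bytes. The function names are hypothetical; the sketch uses Node's built-in crypto module.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a byte buffer.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// True when the PDF bytes on hand match the hash recorded at generation.
// Assumption: the recorded hash is hex, compared case-insensitively.
function packetMatchesRecordedHash(pdfBytes: Uint8Array, recordedHash: string): boolean {
  return sha256Hex(pdfBytes) === recordedHash.trim().toLowerCase();
}
```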
- Shipped
Quick-range presets (30d / 90d / 12mo / YTD) + section opt-outs
Recent-events section is OFF by default — turning it on inflates the PDF significantly (200 rows × 13pt rows ≈ 4-6 extra pages). Operators can omit any section: e.g. SOC 2 visit needs holds + retention + chain verify; FRCP discovery exhibit needs recent-events + chain verify. Window picker covers UTC dates; presets land the most common ranges in one click.
v0.2.7Bulk-extract citations + drawings — one-click extraction across every readable doc in a collection- Shipped
Two new MCP tools — bulkExtractCitations + bulkExtractDrawings (#72 + #73 in catalog)
Permission-trimmed; idempotent (onlyMissing defaults to true and skips already-indexed docs; turning it off re-extracts everything); capped at 200 docs per call by default (maximum 500). Each successful doc emits its own audit event with source: "bulk-extract" + collectionId on the payload, so the audit chain attributes per-doc work to the bulk run that triggered it. The run continues past per-doc failures so a partial selection still indexes what it can.
- Shipped
"Bulk-extract" buttons on the per-matter / per-project rollups
When the rollup detects unindexed docs in scope ("3 of 12 docs haven't been run yet"), the new BulkExtractRollupButton replaces the old static nudge. Click → Kodori runs the extractor across every readable doc in the collection. Outcome banner reports processed / total citations or sheets / skipped (already indexed / no extracted text / hit cap). "Re-extract already-indexed docs" toggle for full refresh after a new version cycle.
- Shipped
Closes the D140 → D141 → D143 ecosystem
The natural operator workflow: import a folder of briefs into a matter, click Bulk-extract on /collections/[id]/citations, the rollup populates in one shot. Same pattern for AEC drawings on /collections/[id]/drawing-register. Consistent with the existing bulk-source-ops surfaces (D133) and bulk legal hold (D125).
v0.2.6AEC drawing register — extract sheet numbers from drawing PDFs, per-doc panel + per-project register grouped by discipline- Shipped
New extractDrawings MCP tool
Pure-function regex extractor in @kumokodo/core hits the AIA/CSI standard sheet numbering scheme — discipline letter prefix (A, S, M, E, P, C, T, L, F, Q, G, plus multi-letter LS, FP, EQ, IT, AD, SD, MD, ED, GD) followed by 1-3 digit major and optional 1-3 digit minor. New document_drawings table (migration 0051) keyed by (tenantId, documentId, fingerprint) for idempotent upsert. Captures sheet titles when extractable from title-block-adjacent text.
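A rough sketch of a sheet-number pattern matching the scheme described above. The exact regex, the optional-dash separator, and the `SheetRef` shape are assumptions; the shipped extractor may differ, for example in how it handles separators or title capture.

```typescript
// Discipline prefixes from the release note; multi-letter forms are tried first.
const MULTI_LETTER = ["LS", "FP", "EQ", "IT", "AD", "SD", "MD", "ED", "GD"];
const SINGLE_LETTER = "ASMEPCTLFQG";

// prefix, optional dash, 1-3 digit major, optional dot/dash plus 1-3 digit minor
const SHEET_PATTERN =
  "\\b(" + MULTI_LETTER.join("|") + "|[" + SINGLE_LETTER + "])-?(\\d{1,3})(?:[.-](\\d{1,3}))?\\b";

interface SheetRef {
  raw: string;
  discipline: string;
  major: string;
  minor: string | null;
}

function extractSheetNumbers(text: string): SheetRef[] {
  const re = new RegExp(SHEET_PATTERN, "g"); // fresh instance, so lastIndex starts at 0
  const found: SheetRef[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    found.push({ raw: m[0], discipline: m[1], major: m[2], minor: m[3] ?? null });
  }
  return found;
}
```

The pattern is case-sensitive on purpose, which keeps lowercase prose such as "and" or "see" from matching.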
- Shipped
Per-doc Drawings panel on /doc/[id]
Sister panel to the citations panel from D140. Discipline-grouped list (architectural / structural / mechanical / electrical / plumbing / fire / civil / landscape / telecom / equipment / security / demolition / general) with color-coded left borders and occurrence-count badges. The "Extract drawings" / "Re-extract" affordance appears on docs whose extraction has succeeded.
- Shipped
Per-project drawing register at /collections/[id]/drawing-register
Aggregates every readable doc's sheet index across the project, grouped by discipline. Each sheet row shows sheet number + title + total occurrences across the project + which contributing docs and how many times each. Discipline filter chips for drilling into one trade. Permission-trimmed via canReadDocument. The "drawing register" output every AEC project requires — sheet list by discipline with traceable source documents.
- Shipped
"Drawing register →" link on project pages
Header on /collections/[id] gets a Drawing register affordance for kind=project (alongside Timeline → and Download as ZIP). Mirror of the kind=matter Citations link. Other collection kinds skip the link — drawing extraction is an AEC-vertical signal.
v0.2.5Trash bin at /trash — soft-deleted documents in one place with multi-select bulk restore- Shipped
New /trash listing of every tombstoned document
Tombstoning has always been a soft-delete (status flips to tombstoned, bytes preserved, audit trail intact during the retention window). Until today the only path back was opening the doc by URL or filtering /audit. The new /trash page lists every tombstoned doc in the workspace, newest-first, with sensitivity / size / created-by / created-date columns and a search-by-name input.
- Shipped
Multi-select bulk restore for owners and admins
Checkbox column with a select-all toggle. Reason input shared across the selection. Bulk-restore action calls restoreDocumentTool per-doc, continues past per-doc failures (so a partial selection still recovers what it can), reports succeeded vs failed in an inline banner. Each restore appends document.restored to the doc's audit stream — the prior tombstoning context (who deleted, when, with what reason) stays on the chain.
- Shipped
Sidebar + mobile nav surface "Trash" between Audit log and Agent activity
Operator-asked recovery surface. Doesn't replace per-doc restore — admins still hit the Restore button on /doc/[id] for a single recovery — but covers the "we mass-deleted some emails by mistake last week" case in one screen.
v0.2.4Per-matter citation rollup — every readable doc's citations aggregated and ranked at /collections/[id]/citations- Shipped
New /collections/[id]/citations route
Aggregates every per-doc citation across every readable doc pinned to a collection. Group key is (kind, normalized) so the same citation in five docs collapses to one row with totalOccurrences across the matter and a per-doc occurrence breakdown. Ranked by total occurrences, then alphabetically — answers "what does this matter rely on?" in one view.
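The grouping and ranking can be sketched like this. The row shapes and function name are assumptions for illustration; the input is taken to be already permission-trimmed.

```typescript
interface DocCitation {
  documentId: string;
  kind: string;
  normalized: string;
  occurrences: number;
}

interface RollupRow {
  kind: string;
  normalized: string;
  totalOccurrences: number;
  perDoc: Record<string, number>; // documentId -> occurrences in that doc
}

function rollupCitations(rows: DocCitation[]): RollupRow[] {
  const groups = new Map<string, RollupRow>();
  for (const r of rows) {
    const key = JSON.stringify([r.kind, r.normalized]); // the (kind, normalized) group key
    let group = groups.get(key);
    if (group === undefined) {
      group = { kind: r.kind, normalized: r.normalized, totalOccurrences: 0, perDoc: {} };
      groups.set(key, group);
    }
    group.totalOccurrences += r.occurrences;
    group.perDoc[r.documentId] = (group.perDoc[r.documentId] ?? 0) + r.occurrences;
  }
  // Most-cited first; ties break alphabetically on the normalized form.
  return Array.from(groups.values()).sort(
    (a, b) => b.totalOccurrences - a.totalOccurrences || a.normalized.localeCompare(b.normalized),
  );
}
```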
- Shipped
Coverage stat + extraction nudge
Top-of-page stat shows "documents with citations / documents in scope" so operators see how much of the matter has been indexed. When less than 100% have run extraction, an inline nudge points operators at the per-doc Extract action with a future-revisit pointer to bulk extraction.
- Shipped
Permission-trimmed via the same canReadDocument gate
A doc the viewer can't read drops from the rollup, even if its citations would otherwise contribute. The rollup never reveals citations from a doc you wouldn't see in search.
- Shipped
"Citations →" link on matter pages
Header on /collections/[id] gets a Citations affordance for kind=matter (alongside Timeline → and Download as ZIP). Other collection kinds skip the link — citations are a legal-vertical signal, not relevant for AEC project drawer aggregations.
v0.2.3Legal citation extraction — seven-kind regex index, per-doc panel, per-matter rollups via the existing event log- Shipped
New extractCitations MCP tool
Pure-function regex extractor in @kumokodo/core hits the seven canonical American legal citation shapes (case / statute / regulation / rule / evidence / docket / constitutional). New document_citations table (migration 0050) keyed by (tenantId, documentId, fingerprint) for idempotent upsert on re-run. Tool emits citations.extracted on the doc stream so the audit log records every extraction run.
- Shipped
Per-doc Citations panel on /doc/[id]
New CitationsPanel under the Annotations panel surfaces the kind-grouped citation index with occurrence counts. Empty state offers the "Extract citations" affordance once extraction has succeeded; "Re-extract" reruns idempotently after a new version. Color-coded left borders by kind: emerald case, blue statute/regulation, amber rule/evidence, ochre docket, purple constitutional.
- Shipped
Precision-over-recall extractor
Reporter list intentionally case-sensitive to avoid matching prose; docket numbers capped at 8-digit case numbers; subsection patterns (§ 26(b)(1), Rule 803(6)) preserved verbatim in the raw column with a normalized + SHA-1 fingerprint for dedup. Misses on first pass are cheaper than false positives — a citation index with 30 real and 0 noise beats 30 real and 8 noise.
v0.2.2Matter timeline view — chronological narrative across every doc and collection event in one place- Shipped
New /collections/[id]/timeline route
Aggregates every event from every document pinned to a collection PLUS the collection's own events (member-added, member-removed, rule-updated, permission-granted) into one chronological narrative. Permission-trimmed: docs the viewer can't read drop from the doc-event side. Capped at the 250 most-recent events; older history lives on /audit.
- Shipped
Day-grouped timeline with color-coded event tone
Events render in date-grouped sections (sticky day header) with color-coded left borders by event kind: red for DLP/anomaly, amber for retention/legal-hold/purge, emerald for document creation/version/annotation, blue for permission/collection. Each event row surfaces actor + timestamp + linked doc name + payload-hint snippet (reason / matter / sensitivity transition / annotation reply marker).
- Shipped
Event-type filter chips computed from in-page data
Filter chip rendering only includes event types actually present in the loaded slice — operators see "annotation.added · 12" instead of every type from the global enum. Click to drill in; click "All" to clear.
- Shipped
"Timeline →" link on every collection page
Header on /collections/[id] gets a Timeline link alongside the existing Download as ZIP affordance. The matter-grain narrative is the third audit-side surface alongside per-document /doc/[id] history and global /audit.
v0.2.1Per-API-key usage audit panel — daily activity chart, top event types, recent calls list- Shipped
New /api-keys/[id]/usage route
Drilldown view per API key showing total calls, last-30-days call count, last-used stamp, and the daily activity chart (30 1-day bars) sourced from the existing audit log. No new schema — every external API call already lands with actorId="apikey:<prefix>" so the data was always there, just unsurfaced.
- Shipped
Top event types breakdown (last 30 days)
Shows the 10 most-frequent event types this key fired in the last 30 days with raw counts + share-of-traffic percentages. Answers the operator question "what is this integration actually doing?" without forcing a /audit dive.
- Shipped
Recent-calls table with 30-row pagination
Per-call timestamp + event type + stream id, paginated 30 rows per page with Older / Newer arrows. For full payload + actor-kind filtering, the page links to /audit?actor=apikey:<prefix> as the escape hatch — the usage panel is the at-a-glance triage, /audit is the deep dive.
- Shipped
"Usage →" link on every active key in /api-keys
Mirror of the "Deliveries →" link shipped alongside the webhook delivery audit panel. Each integration drillable independently.
v0.2.0Webhook delivery audit panel — per-subscription drilldown with redeliver, status filter, and pagination- Shipped
New /webhooks/[id]/deliveries route
Per-subscription drilldown view replacing the flat last-50 row that lived on /webhooks. Top-of-page stats — total deliveries, succeeded count, failed count, success rate (color-coded green ≥95%, amber ≥80%, red below). All / Succeeded / Failed status filter chips with live counts. Pagination at 25 rows per page (typical mid-market tenant has hundreds-to-thousands of deliveries per active subscription per month — pagination is mandatory). Each failed row surfaces the response status code + truncated response body so the customer can self-diagnose.
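The success-rate color thresholds reduce to a small function. The name, the `Tone` type, and the "no color when there are no deliveries" rule are assumptions.

```typescript
type Tone = "green" | "amber" | "red";

// green at 95% or above, amber at 80% or above, red below that.
// Integer comparison avoids floating-point edge cases at the thresholds.
function successRateTone(succeeded: number, total: number): Tone | null {
  if (total === 0) return null; // assumption: nothing to color without deliveries
  if (succeeded * 100 >= 95 * total) return "green";
  if (succeeded * 100 >= 80 * total) return "amber";
  return "red";
}
```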
- Shipped
Redeliver action on every failed row
After a customer fixes their receiver (502 → 200, signature verification fixed, IP allowlist updated), one click on "Redeliver" enqueues a fresh webhook/deliver.requested event into Inngest scoped to ONE (event × subscription) pair. Doesn't recreate the source event — the audit log stays single-truth — only a fresh webhook_deliveries attempt lands. Refused when the subscription is paused / revoked (resume it first); refused when the source event is in a different tenant. Admin / owner only; same auth bar as create / pause / revoke. Belongs alongside Stripe's "resend webhook" and GitHub's "redeliver" buttons that integrators expect on day-one of any webhook product.
- Shipped
"Deliveries →" link on every /webhooks subscription row
The list page keeps its global last-50 table for the at-a-glance health check. Each subscription row now has a "Deliveries →" link that drills into its own audit panel — every customer integration can be triaged independently without filtering the global table by subscription id by hand.
v0.1.99Annotation threading + @mentions + resolved state — document-side notes graduate into a real review workflow- Shipped
Reply threading on document notes
Annotations gain a parent_id pointer for one-level-deep replies. The /doc/[id] panel renders roots top-level with replies indented under them; "Reply" composer per thread; reply-of-reply rejected at both the DB CHECK constraint and the createAnnotation tool. Threading is annotation-author-agnostic — anyone with read access on the document can reply to any open thread.
- Shipped
@mentions with notification email
Type @someone@example.com in a note body and Kodori parses the mention server-side, validates against tenant membership, persists the surviving list to the annotation row, and fires a Resend email to each mentioned member with the doc name, author, body excerpt, and a deep link to the thread. Live autocomplete in the composer surfaces matching tenant emails. The mention list is recorded on the paired annotation.added audit event so downstream consumers (mentioned-me view, digest) read from the chained log instead of polling.
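A sketch of the server-side parse-and-validate step. The regex, the `parseMentions` name, and the email-to-user-id map are assumptions; the real parser may accept a different email grammar.

```typescript
// "@" followed by an email address, e.g. @someone@example.com
const MENTION_PATTERN = "@([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,})";

// Returns the user ids of mentioned tenant members, deduplicated, in first-seen order.
// tenantMembers maps lowercase email -> user id (hypothetical shape).
function parseMentions(body: string, tenantMembers: Map<string, string>): string[] {
  const re = new RegExp(MENTION_PATTERN, "g");
  const userIds: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(body)) !== null) {
    const userId = tenantMembers.get(m[1].toLowerCase());
    // Non-members are dropped; only the surviving list is persisted.
    if (userId !== undefined && userIds.indexOf(userId) === -1) {
      userIds.push(userId);
    }
  }
  return userIds;
}
```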
- Shipped
Resolved / reopen state on root threads
Resolution lives on the root annotation only; replies inherit the thread state. The resolveAnnotation and reopenAnnotation tools are gated to the author, any @mentioned participant, or a tenant admin (resolving closes the thread; reopening reverses it). Each emits annotation.resolved / annotation.reopened to the document stream so the audit narrative records who closed and who reopened. UI: muted-with-strikethrough on resolved threads; "Hide resolved" toggle filters them out for an active-work view.
v0.1.98Conflict checking on matter creation — debounced hybrid search surfaces overlapping engagements before you create the duplicate- Shipped
New previewMatterConflicts MCP tool (#67 in the catalog)
Two-pass conflict detection: (1) NAME match — case-insensitive substring against existing collection names (catches "Smith v. Acme — Phase 2" duplicating "Smith v. Acme"); (2) DOCUMENT match — runs the existing hybridSearchTool on the new matter's name + description, then maps each hit back to its collection memberships and surfaces those collections as conflict candidates. Returns each conflict with reason (name-match / document-match / name-and-document-match), member count, and up to 4 doc-hit snippets explaining the link.
- Shipped
Conflict-check panel on /collections/new with debounced live check
When the operator picks kind=matter, a debounced (600ms) conflict check runs as they type the name + description. Results surface inline in an amber-tinted warning panel listing each matter that looks related, with click-through to the suspect matter + per-doc snippets explaining the match. The submit button refuses unless the operator explicitly confirms "I've reviewed the conflicts above and want to proceed anyway." Non-matter kinds (folder / project / cabinet / drawer / custom) skip the check — only matters carry the implicit conflict-of-interest duty.
- Shipped
Replaces the manual pre-engagement conflict check
Every law firm runs a conflict check before opening a new matter — typically a paralegal manually searching the firm's existing matter list for the opposing party. /collections/new now does it automatically with workspace-wide hybrid search. Operator-callable from the agent ("check this matter description for conflicts") via the MCP tool surface. Future revisit: per-tenant adverse-party metadata for stricter conflict signaling beyond name + content match.
v0.1.97OR composition in event-trigger filters — { any: [...] } and { not: ... } group constructors- Shipped
Two new filter node types: { any: [...] } and { not: ... }
Closes the first revisit trigger from D121 (filter expressions were AND-only). The top-level filter array stays AND across nodes; each NODE is now either a leaf condition (D121) or a group constructor. { any: [leaf, leaf, ...] } evaluates true when at least one inner leaf matches — the cross-field OR pattern operators wanted ("severity is high OR sensitivity is restricted"). { not: leaf } negates a single leaf — the "everything except actor=system" pattern. Backwards-compatible: existing leaf-only filters keep working unchanged; new compositions are pure additions.
- Shipped
Constraints kept simple — no nested groups
any-groups contain LEAF conditions only; not-clauses negate a single leaf. No nested groups, no boolean trees, no nested any-of-any. Reasons: (1) the NL compiler stays reliable on a small target language; (2) operator UX scans cleanly ("payload.severity = high OR payload.sensitivity = restricted" reads naturally; deeper nesting reads worse); (3) the dominant pattern is single-level cross-field OR — full boolean trees are over-engineering for the use cases. Future revisit if customers ask for arbitrary tree depth.
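The evaluation rules (AND across top-level nodes, single-level any, single-leaf not) can be sketched as below. The leaf shape `{ path, op, value }` and the two ops shown are assumptions, since the D121 leaf format is not spelled out in this entry.

```typescript
// Assumed leaf shape; only "eq" and "in" are modeled here.
type Leaf =
  | { path: string; op: "eq"; value: unknown }
  | { path: string; op: "in"; value: unknown[] };

// A node is a leaf or a group constructor holding leaves only (no nesting).
type FilterNode = Leaf | { any: Leaf[] } | { not: Leaf };

// Resolve a dotted path such as "payload.severity" against the event.
function getPath(obj: unknown, path: string): unknown {
  let cur: unknown = obj;
  for (const part of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[part];
  }
  return cur;
}

function matchLeaf(leaf: Leaf, event: unknown): boolean {
  const actual = getPath(event, leaf.path);
  return leaf.op === "eq" ? actual === leaf.value : leaf.value.includes(actual);
}

function matchNode(node: FilterNode, event: unknown): boolean {
  if ("any" in node) return node.any.some((leaf) => matchLeaf(leaf, event));
  if ("not" in node) return !matchLeaf(node.not, event);
  return matchLeaf(node, event);
}

// The top-level array is AND across nodes.
function matchesFilter(filter: FilterNode[], event: unknown): boolean {
  return filter.every((node) => matchNode(node, event));
}
```

Because groups hold leaves only, the evaluator needs no recursion, which matches the "no boolean trees" constraint.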
- Shipped
NL compiler taught the new constructors
The /automations compile prompt now includes the full filter-node catalog (leaf / any / not) with explicit guidance: use {any: [...]} for cross-FIELD OR (different paths in each leaf); for single-field "one of X or Y" prefer the existing in op (cleaner than {any: [{eq X}, {eq Y}]}). Compile preview renders cross-field OR as "(payload.severity = high OR payload.sensitivity = restricted)" so operators see the boolean structure exactly as it will evaluate.
v0.1.96Bulk operations menu — three new MCP tools + /bulk-ops surface (collection-or-saved-search source-driven)- Shipped
Three new bulk MCP tools (#64, #65, #66 in the catalog)
bulkAddDocumentsToCollection, bulkSetDocumentRetentionClass, bulkSetDocumentSensitivity. All three follow the same shape established by D125 (bulkAddDocumentsToLegalHold): a discriminated-union source (collection or saved-search), permission-trim via userCanReadDocument, idempotent on already-applied docs, ONE event per affected doc (not a single bulk event) so downstream consumers stay uniform. The sensitivity tool also enforces the held-doc downgrade refusal that the per-doc tool does (D54) — refused-held-downgrade counts surface in the result. Source resolution shared across all three tools via a private resolveDocIds helper that matches the canonical pattern from privilege log + Bates stamp + bulk legal hold.
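The shared shape (discriminated-union source, permission trim, idempotent skip, one event per affected doc) might look like the sketch below. All names, the `BulkDeps` callbacks, and the event type string are hypothetical, and the held-doc downgrade refusal is left out for brevity.

```typescript
type BulkSource =
  | { kind: "collection"; collectionId: string }
  | { kind: "saved-search"; savedSearchId: string };

// Hypothetical lookups standing in for database and permission calls.
interface BulkDeps {
  docsInCollection: (collectionId: string) => string[];
  docsInSavedSearch: (savedSearchId: string) => string[];
  canRead: (documentId: string) => boolean;
  currentSensitivity: (documentId: string) => string;
}

interface BulkResult {
  applied: string[];
  alreadyApplied: string[];
  noPermission: string[];
  events: { type: string; documentId: string; sensitivity: string }[];
}

function resolveDocIds(source: BulkSource, deps: BulkDeps): string[] {
  switch (source.kind) {
    case "collection":
      return deps.docsInCollection(source.collectionId);
    case "saved-search":
      return deps.docsInSavedSearch(source.savedSearchId);
  }
}

function bulkSetSensitivity(source: BulkSource, target: string, deps: BulkDeps): BulkResult {
  const result: BulkResult = { applied: [], alreadyApplied: [], noPermission: [], events: [] };
  for (const id of resolveDocIds(source, deps)) {
    if (!deps.canRead(id)) { result.noPermission.push(id); continue; }
    if (deps.currentSensitivity(id) === target) { result.alreadyApplied.push(id); continue; }
    result.applied.push(id);
    // One event per affected doc; the event type name is illustrative.
    result.events.push({ type: "document.sensitivity-set", documentId: id, sensitivity: target });
  }
  return result;
}
```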
- Shipped
New /bulk-ops page with three-tab operation switcher
One form, three operations. Operator picks the operation (Add to collection / Set retention class / Set sensitivity), the source (collection or saved search), and the operation-specific target. Result counts surface inline ("Added 47 of 50 candidates. 2 already members; 1 no permission") so operators see exactly what happened. Sidebar + mobile nav surface "Bulk ops" alongside API keys / Webhooks. Sister to the existing /search per-doc-id-selection bulk ops (apps/web/app/actions/bulk-document-ops.ts) — that one does "operator picks 50 hits"; this one does "apply to every doc in this source."
- Shipped
Automation-callable via mcp-tool-call action
All three tools are registered in the MCP catalog. Operators can write event-triggered automation rules like "when DLP flags a doc with high confidence, set its sensitivity to confidential across the matching collection" via the existing webhook + mcp-tool-call surface. Variable substitution (D117) + filter expressions (D121) compose with these new tools the same way they do with bulkAddDocumentsToLegalHold.
v0.1.95AEC inspection / daily-report tracker — fourth daily-action surface alongside RFIs / submittals / change-orders- Shipped
New /inspection-tracker page
Auto-classifier already recognized "inspection report" / "daily report" / "punch list" docs (doc-type-hints.ts:211). Migration 0048 adds the aec_inspections projection table; new extract-inspection Inngest function uses Haiku to pull inspector / inspection date / location / spec section / trade / verbatim result / open-finding count / 1-2 sentence findings summary. Page sorts open-with-findings first (the actionable queue), then pass-no-findings, then closed-out, then extraction-failed. Top-line stats: open count, total open findings across all rows, pass count, closed count. Top trades by open finding count surface as a one-line breakdown ("MEP has 27 open findings; structural has 4") so a project executive sees where the issues live without drilling per-doc.
- Shipped
Same architecture as RFIs / submittals / change-orders
Pattern reuse: doc-type-hint detection → auto-classify fan-out (new isInspection regex matching "inspection report" / "daily report" / "punch list", excluding logs / registers / templates) → inspection/extract.requested Inngest event → Haiku extractor → projection → daily-action page. Zero new architectural shapes; this completes the fourth AEC daily-doc workflow Kodori covers natively.
- Shipped
?projectRef= filter same as the other three trackers
Drill-in from /projects/[ref] via the existing per-project filtering pattern. Inline ochre filter pill ("Filtered to project: 24-1234 · clear filter") matches the three older trackers. The /projects rollup page can be extended in a future iteration to count inspection rows alongside RFIs / submittals / COs — the schema is now in place.
v0.1.94RFI + submittal response packet linking — "Mark answered" button on tracker rows + new linkResponseDocument MCP tool- Shipped
New linkResponseDocument MCP tool (#63 in the catalog)
Closes the v2 follow-on noted in D107 (RFIs) + D108 (submittals). One tool handles both kinds; takes (kind: "rfi" | "submittal", subjectDocumentId, responseDocumentId, optional newStatus, optional disposition). Sets the responseDocumentId column + flips status (RFI: open → answered/rejected; submittal: under-review → approved/rejected when status provided, else stays). Permission-trimmed via userCanReadDocument on BOTH the subject (RFI/submittal doc) and the response packet — caller must have read on each. Audit-logged via document.metadata-set events with the field, response doc id, and projection-specific identifiers.
- Shipped
"Mark answered" inline picker on /rfi-tracker + /submittal-tracker rows
Click on any open RFI or under-review submittal row → opens an inline picker. Type a search query (defaults to the artifact's number/subject); hybridSearchTool runs on the workspace; pick a response doc; select status (RFI: answered/rejected; submittal: approved/rejected); optionally type the verbatim disposition text (submittal only); confirm. Calls the new linkResponseDocumentAction server-action wrapper around the MCP tool. Row updates on next render with the new status + responseDocumentId. The pattern that previously required pasting the response doc's URL into a metadata field becomes one click.
- Shipped
Automation-callable via mcp-tool-call action
Because linkResponseDocument is registered as an MCP tool, automations can fire it on event triggers. Operators can write rules like "when an Inngest event indicates an architect-disposition packet was filed (matching pattern X), find the matching RFI by number and link it"; Claude compiles them to mcp-tool-call linkResponseDocument with the right args. Pairs with variable substitution + filter expressions for narrowly scoped auto-linking.
v0.1.93RFI structured spec_section column — replaces /spec-sections heuristic location matching with typed join- Shipped
Migration 0047 adds rfis.spec_section column + extractor populates it natively
Closes the first revisit trigger from D108 + D114 (RFI matching was heuristic substring against the free-text location field). The RFI extractor (extract-rfi.ts) now also extracts `specSection` — a constrained CSI MasterFormat-shaped field — into the new typed column on every RFI ingest. New uploads get structured matching for free; pre-D130 rows fall back to the location-substring path until they re-extract.
- Shipped
/spec-sections joins RFIs by typed column when populated
The page logic prefers the typed `specSection` column when set, falling back to location-substring heuristic only when null. New uploads land in the right bucket directly without depending on the operator typing the section number into the question text. Tighter matching also means fewer false positives (a location like "Sheet A-201" no longer accidentally matches a section like "201").
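The typed-first, substring-fallback rule can be sketched in a few lines. `RfiRow` and `rfiMatchesSection` are hypothetical names, and exact equality on the typed column is an assumption.

```typescript
interface RfiRow {
  specSection: string | null; // typed column; null on rows not yet re-extracted
  location: string;           // free-text field used by the older heuristic
}

function rfiMatchesSection(rfi: RfiRow, section: string): boolean {
  // Prefer the typed column whenever it is populated.
  if (rfi.specSection !== null) return rfi.specSection === section;
  // Fallback: the substring heuristic against the location text.
  return rfi.location.includes(section);
}
```

With the typed column set, a location of "Sheet A-201" no longer matches section "201"; with it null, the fallback still does.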
- Shipped
Backwards-compatible — no breaking change for existing data
Pre-D130 RFIs have null spec_section; the page logic falls through to the existing location-substring matching for those rows. New rows get tighter typed matching. No bulk re-extraction needed; over time as docs naturally re-extract (or via a future re-extract-all admin action) the column fills in. Operators see the typed-vs-fallback distinction transparently — the spec sections still surface on the page either way.
v0.1.92AEC project lifecycle metadata — owner / contract value / target completion / status on /projects/[ref]- Shipped
New aec_projects table + ProjectMetadataCard on the per-project drill-in
Migration 0046 adds aec_projects keyed by (tenantId, projectRefKey) where projectRefKey is the lower-trimmed canonical form. Closes the v2 follow-on noted in D118. The /projects rollup still works without any metadata rows (it keys off the projectRef buckets in the trackers); aec_projects just enriches matching rows when present. Per-project drill-in /projects/[ref] now renders a header card with owner / contract value / target completion / status / notes, with an inline Edit button that flips into a form. Operators can also add metadata to a project that has no artifacts yet — useful for "we just won the bid, set up the project before any RFIs land."
- Shipped
Optional enrichment, NOT a replacement for tracker-derived rollups
The architectural call: metadata is OPTIONAL. /projects rollup continues to derive rows from the three tracker projection tables (rfis, submittals, change_orders) so a doc filed with a brand-new projectRef shows up immediately without any operator setup. Metadata adds owner + contract value + schedule + notes when present. The trade-off (operator typing the metadata once) buys executive-view richness; the rollup-from-trackers gives instant visibility. Both surfaces stay in sync via projectRefKey.
- Shipped
upsertAecProjectAction with ON CONFLICT merge
Server action handles both create (new projectRef) and edit (existing). Conflict target is (tenantId, projectRefKey) so multiple operators editing the same project converge on the latest write. Status is constrained to active / on-hold / closed. Currency is a small enum (USD / CAD / EUR / GBP / AUD); contract value stored as bigint cents for precision. Future revisit: per-tenant currency default + custom statuses.
v0.1.91Public share links — tokenized read-only URLs for external recipients (closes the discovery production loop)- Shipped
New share_links table + /share/[token] public viewing surface
Tokens are 32 bytes of crypto randomness rendered as base64url; only the SHA-256 hash is stored. Plaintext token is shown ONCE at creation (operator copies the URL, sends it to opposing counsel, never sees it again). Default expiration is 14 days; max 90. The viewing surface bypasses Auth.js — middleware allows /share/[token] through with the token-as-auth pattern. Document / collection / production share links each render a recipient-friendly download view with no Kodori sidebar / nav / branding beyond a footer attribution.
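Token creation along these lines can be sketched with Node's crypto module. The function names, the returned shape, and hashing the base64url string (rather than the raw bytes) are assumptions.

```typescript
import { createHash, randomBytes } from "node:crypto";

const DEFAULT_EXPIRY_DAYS = 14;
const MAX_EXPIRY_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Only this hash is stored. Assumption: it covers the base64url string.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

function createShareToken(expiryDays: number = DEFAULT_EXPIRY_DAYS, nowMs: number = Date.now()) {
  if (!Number.isInteger(expiryDays) || expiryDays < 1 || expiryDays > MAX_EXPIRY_DAYS) {
    throw new RangeError("expiryDays must be between 1 and " + MAX_EXPIRY_DAYS);
  }
  const plaintext = randomBytes(32).toString("base64url"); // shown to the operator once
  return {
    plaintext,
    tokenHash: hashToken(plaintext),
    expiresAt: new Date(nowMs + expiryDays * MS_PER_DAY),
  };
}
```

On a public hit, the server would hash the presented token and look up the stored hash, so a database leak does not expose usable links.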
- Shipped
Production-source share links serve the EXACT recorded bytes
When the operator shares a recorded production (instead of a collection or single doc), the per-doc download routes use each doc's captured versionHash from production_documents — not the doc's currentVersionHash. So a recipient downloading from a "Production Set 1 — Jan 15" share link gets the bytes that were ORIGINALLY produced, even if subsequent re-stamps changed the doc's current version. Chain-of-custody preserved across the full production → share → recipient-download path.
- Shipped
Inline "Share via link" button on /doc/[id] + /productions/[id]
New <ShareLinkButton> component. Operator clicks Share via link → form opens with optional label + recipient hint + expiry-days; click Create. The plaintext URL surfaces ONCE in a green confirmation box with a Copy button; after dismissal the URL can never be retrieved (only the hash is stored). Token prefix surfaces in the confirmation for at-a-glance distinguishing in the /share-links list.
- Shipped
New /share-links admin surface + three audit-event types
New /share-links page lists every active / expired / revoked link with status badge, recipient hint, access count, last-access timestamp, and a Revoke button. Three new event types: share-link.created, share-link.accessed (anonymous public hits, actorId="public-share:<prefix>"), and share-link.revoked. Webhook subscribers + automations + the /audit page see share-link lifecycle as first-class events — operators can write rules like "when a share link is accessed, post to my Slack channel" via the existing webhook + variable-substitution surface.
- Shipped
Closes the discovery production loop end-to-end
Privilege log v2 (D119/D122) classifies what to withhold; redact (D123) burns black rectangles; Bates stamp (D124) labels what's being produced; productions tracker (D126) records the production event with version-hash capture; matter binder (D127) compiles the deliverable; and share links (D128) deliver the package. Operator never leaves Kodori. Every step audit-logged. Chain-of-custody story is one paragraph long.
v0.1.90Matter binder export at /matter-binder — one-click compile a collection or production into a single bookmarked PDF- Shipped
New /matter-binder surface
Pick a source — collection (matter, custom folder) or recorded production — and Kodori merges every PDF in the source into a single binder with a cover page (matter name + doc count + Bates range + generated date) and a table of contents listing each doc with its Bates BEG/END + page count. Replaces the "drag every PDF into Acrobat and run Combine" pre-production workflow.
- Shipped
New /api/matter-binder POST route streams the PDF bytes
Auth-gated route handler accepts JSON body with source + optional matterName, returns application/pdf bytes with Content-Disposition: attachment so the browser downloads cleanly. pdf-lib (already a dep from D123) does the merge: cover page is a US Letter PDF page with centered display copy via StandardFonts.HelveticaBold; TOC is rendered with Helvetica regular + truncated displayName per row when long. Permission-trimmed before merge — docs you can't read skip silently. Non-PDFs skip silently. Cap: 500 docs / 200MB output.
- Shipped
Production-source preserves the EXACT bytes recorded
When the source is a production, the binder uses each doc's captured versionHash from production_documents — not the doc's current version. So a binder built from "Production Set 1 — Jan 15" delivers the identical bytes that opposing counsel got, even if subsequent re-stamps changed the doc's current version. Collection-source uses the doc's currentVersionHash (the typical "build me a working binder of this matter" use case).
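The hash-selection rule can be sketched as a small pure function. The type and function names here are hypothetical, and the map stands in for a lookup against production_documents; the only behavior taken from the release note is that production sources use the captured hash and collection sources use the current one.

```typescript
type BinderSource =
  | { kind: "collection"; collectionId: string }
  | { kind: "production"; productionId: string };

interface DocRef {
  documentId: string;
  currentVersionHash: string;
}

// capturedHashes: documentId -> versionHash recorded when the production was logged
// (a stand-in for rows read from production_documents).
function resolveVersionHash(
  source: BinderSource,
  doc: DocRef,
  capturedHashes: Map<string, string>,
): string {
  if (source.kind === "production") {
    const captured = capturedHashes.get(doc.documentId);
    if (captured === undefined) {
      throw new Error(`no captured version for ${doc.documentId} in this production`);
    }
    return captured; // the bytes originally produced, even after later re-stamps
  }
  return doc.currentVersionHash; // working binder: latest version
}
```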
v0.1.89Production set tracker /productions — every Bates batch logged as a discovery production with version-hash capture- Shipped
New /productions surface listing every recorded production
Migration 0044 adds the `productions` + `production_documents` tables. Each production captures recipient, matter, date, Bates range, document count, and the EXACT version hash of every doc that was delivered (so a later re-stamp on the same doc doesn't retroactively change the production record). Click a row to drill into the per-production page with per-doc Bates BEG/END + page count + version hash. Docs whose current version differs from the recorded one show an "archived" badge so the operator knows they're looking at the produced bytes, not the latest.
- Shipped
New recordProduction MCP tool (#62 in the catalog)
Permission-trims via userCanReadDocument; inserts the production + production_documents rows + a production.recorded audit event in one transaction. Documents the operator can't read drop out (counted in skippedDocs). Callable from the agent, automations (mcp-tool-call action), external MCP clients (Claude Desktop / Cursor / Kodokyo), and the new "Record as production" form on /bates-stamp.
- Shipped
"Record as production" affordance on /bates-stamp results
After stamping, the result table grows a "Record as production" details panel — operator types a name, recipient, matter ref, optional notes, and clicks Record. The just-stamped version hashes flow directly into the production record so the production captures EXACTLY what was stamped. Closes the loop between Bates stamping and discovery production logging in one click.
- Shipped
New production.recorded audit event
Added to EventTypeSchema. Webhook subscribers + automations + the /audit page see production records as first-class events; an operator can write an event-triggered automation like "when a production is recorded for matter Smith v Acme, post to my Slack channel" via the existing webhook + variable-substitution surface.
v0.1.88Bulk legal hold — apply a hold to a whole collection or saved search in one click- Shipped
New bulkAddDocumentsToLegalHold MCP tool
Tool #61 in the @kumokodo/mcp catalog. Takes a hold ID + a source (collection or saved search) + an optional limit; resolves the source via the existing collection-members or runSavedSearchTool path; permission-trims via the same userCanReadDocument check the per-doc tool uses; pre-loads existing memberships so already-held docs are counted (alreadyHeld) instead of re-emitted; inserts the new memberships in one transaction; emits one legal-hold.applied event per newly-added doc — same audit shape as the per-doc tool. Returns counts: added / alreadyHeld / skippedNoPermission / skippedNotFound / totalCandidates.
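The counting logic can be illustrated with a pure planning function. This is a sketch under assumptions: the function name, the de-duplication of candidates, and the order of the checks (not found, then no permission, then already held) are illustrative choices, and the callback stands in for the userCanReadDocument check. The count names match the ones the tool returns.

```typescript
interface BulkHoldCounts {
  added: number;
  alreadyHeld: number;
  skippedNoPermission: number;
  skippedNotFound: number;
  totalCandidates: number;
}

function planBulkHold(
  candidateIds: string[],
  existingDocIds: Set<string>,
  heldDocIds: Set<string>, // memberships pre-loaded for this hold
  canRead: (documentId: string) => boolean,
): { toAdd: string[]; counts: BulkHoldCounts } {
  const unique = Array.from(new Set(candidateIds));
  const toAdd: string[] = [];
  const counts: BulkHoldCounts = {
    added: 0,
    alreadyHeld: 0,
    skippedNoPermission: 0,
    skippedNotFound: 0,
    totalCandidates: unique.length,
  };
  for (const id of unique) {
    if (!existingDocIds.has(id)) counts.skippedNotFound++;
    else if (!canRead(id)) counts.skippedNoPermission++;
    else if (heldDocIds.has(id)) counts.alreadyHeld++; // counted, not re-emitted
    else {
      toAdd.push(id); // one membership row + one legal-hold.applied event each
      counts.added++;
    }
  }
  return { toAdd, counts };
}
```

Re-running the same source after the first pass puts the previously added docs in `heldDocIds`, so they land in `alreadyHeld` and `toAdd` comes back empty.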
- Shipped
Bulk apply form on /legal-holds/[id]
New section on the legal-hold detail page (between the candidate finder and the add-by-id form). Operator picks a source — collection or saved search from the workspace's lists — and clicks Bulk apply hold. Result counts surface inline (newly added / already held / no permission / not found / total candidates) so the operator knows exactly what happened. Idempotent on already-held docs: re-running the same source against the same hold reports alreadyHeld: N and adds nothing.
- Shipped
Automation-callable via mcp-tool-call action
Because it's registered as an MCP tool, the bulk-apply is also addressable from /automations. An operator can write an event-triggered rule like "when a doc is filed in the Smith Matter collection, apply the Smith litigation hold" — Claude compiles it to mcp-tool-call bulkAddDocumentsToLegalHold with args { legalHoldId, source: { kind: "collection", collectionId } }. Combined with variable substitution + filter expressions, an automation can scope by sensitivity / actor / etc. before applying.
v0.1.87Bates stamping batch at /bates-stamp — page-bottom Bates onto every PDF in a production set- Shipped
New /bates-stamp surface for discovery production
Operator picks a source (collection or saved search) + a Bates prefix + a starting number; Kodori loads each PDF in alphabetical order, stamps the bottom-right of every page with a sequential Bates number via pdf-lib's drawText, and saves each result as a new immutable document version. Pairs with /privilege-log v2 — operators using the same prefix + start across both surfaces get log-row Bates numbers that match the produced PDFs' starting Bates exactly. Standard e-discovery convention: each doc gets a contiguous range [BEG, END]; the privilege log shows BEG; the produced PDFs carry the full per-page range.
- Shipped
Thin white-pad behind the Bates label
pdf-lib stamps a 0.85-opacity white rectangle behind every Bates number before drawing the text on top. Keeps the number legible over dark or busy page footers (charts, photos, tables that extend to the bottom edge) without obscuring the underlying content.
- Shipped
Result table shows BEG / END / pages / doc-link per row
After stamping, the operator sees a table of every stamped doc with its Bates BEG, its END (BEG + pages - 1), its page count, and a link to /doc/[id] showing the new version. The final cursor surfaces in the heading so the operator knows where to start the next batch (typical pattern: matter A starts at 1; matter B starts at A's final + 1).
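The range arithmetic is simple enough to sketch directly. The function and field names below are hypothetical; the rule that END = BEG + pages - 1 and that the next batch starts at the final cursor + 1 comes from the release note.

```typescript
interface BatesRange {
  documentId: string;
  beg: number;
  end: number;
  pages: number;
}

// Assign each doc a contiguous [beg, end] range in the order given.
function allocateBatesRanges(
  docs: { documentId: string; pages: number }[],
  start: number,
): { ranges: BatesRange[]; finalCursor: number } {
  let next = start;
  const ranges: BatesRange[] = [];
  for (const doc of docs) {
    if (!Number.isInteger(doc.pages) || doc.pages < 1) {
      throw new RangeError(`document ${doc.documentId} must have at least one page`);
    }
    const beg = next;
    const end = beg + doc.pages - 1;
    ranges.push({ documentId: doc.documentId, beg, end, pages: doc.pages });
    next = end + 1;
  }
  // finalCursor is the last number used; the next batch starts at finalCursor + 1.
  return { ranges, finalCursor: next - 1 };
}
```

For example, a 3-page doc followed by a 2-page doc starting at 1 receives the ranges [1, 3] and [4, 5], and the next batch would start at 6.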
- Shipped
New event-payload reason="bates-stamped" on document.version-committed
Each stamp fires the existing document.version-committed audit event with reason="bates-stamped", batesPrefix, batesBeg, batesEnd, and pageCount in the payload. Webhook subscribers + the /audit log surface stamped versions naturally without a new event type. A future "produced to whom?" report joins these events with recipient metadata.
v0.1.86PDF redaction tool — draw boxes on a PDF, burn to a new immutable version- Shipped
New /doc/[id]/redact surface for legal discovery production
Click "Redact" on any live PDF document. The redaction surface renders every page via PDF.js with a transparent overlay where the operator click-and-drags rectangles. Each saved box persists immediately to the new document_redactions table; a × button on each box removes it. The "Burn redactions to new version" button uses pdf-lib to overlay opaque black rectangles + flatten + creates a new immutable document version pointing back to the original via previousHash. Original version is preserved in document_versions; the burn event records the box list as audit metadata so a future investigator can answer "what was redacted" without ever recovering "what was behind".
- Shipped
Resolution-independent box coordinates
Redaction boxes are stored in PDF user-space units (1pt = 1/72 inch), not client pixels. The canvas overlay converts mouse positions through PDF.js's viewBox scale on the way in, and renders saved boxes back through the same scale on the way out — re-renders at any zoom level continue to align. pdf-lib's bottom-up Y coordinate is reconciled with PDF.js's top-down Y on both the save path and the render path.
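The coordinate reconciliation can be sketched as a pair of inverse functions. This is an illustration under simplifying assumptions: an unrotated page whose view box starts at the origin, a single uniform scale factor, and hypothetical type and function names. Real viewports may also carry rotation and offsets that this sketch ignores.

```typescript
// Canvas space: CSS pixels, origin top-left, Y grows downward.
interface CanvasBox { left: number; top: number; width: number; height: number }
// PDF user space: points (1/72 inch), origin bottom-left, Y grows upward.
interface PdfBox { x: number; y: number; width: number; height: number }

// Save path: convert a drawn rectangle into zoom-independent PDF units.
function canvasToPdf(box: CanvasBox, scale: number, pageHeightPt: number): PdfBox {
  const width = box.width / scale;
  const height = box.height / scale;
  return {
    x: box.left / scale,
    // Flip Y: the PDF box is anchored at its bottom edge.
    y: pageHeightPt - box.top / scale - height,
    width,
    height,
  };
}

// Render path: project a stored box back onto the canvas at the current zoom.
function pdfToCanvas(box: PdfBox, scale: number, pageHeightPt: number): CanvasBox {
  return {
    left: box.x * scale,
    top: (pageHeightPt - box.y - box.height) * scale,
    width: box.width * scale,
    height: box.height * scale,
  };
}
```

A box drawn at 150% zoom and stored this way re-renders at 200% zoom in the same place on the page, because only the stored point values are authoritative.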
- Shipped
Audit chain captures every redaction lifecycle event
New event types document.redaction-added and document.redaction-removed land on the existing hash-chained audit log every time the operator draws or removes a box. The burn fires document.version-committed with reason="redactions-burned", redactionCount, and the full box-list as payload. A future investigator can produce the exact audit trail of who saw the original and who saw which redacted version.
- Shipped
Pairs with privilege log v2 for the discovery production loop
The legal discovery production workflow is now end-to-end: privilege log v2 classifies what to withhold; the redaction tool removes privileged content from docs that need partial production; the next iteration (Bates stamping batch) adds production-stamped Bates numbers. All three surfaces share a single audit-defensible pipeline — the chain-of-custody story is one paragraph long.
v0.1.85Privilege log v2 — saved-search source + per-row inline editing with persistent overrides- Shipped
Source = collection OR saved search
Closes the first revisit trigger from D119. /privilege-log now accepts either a collection (existing) or a saved search as the source. The form has a Source toggle; saved-search variant calls the existing runSavedSearch MCP tool to expand the query into matching documents, then runs the same Haiku classification pipeline. Saved-search-source builds always re-run the search (ensuring the build reflects current state), so a search like "all docs Haiku tagged as attorney-client correspondence" stays self-updating across builds.
- Shipped
Per-row inline editing with persistent overrides
Closes the second revisit trigger from D119. Every row now has an "edit" link that toggles inline form fields (Bates input, basis dropdown, description textarea); save persists to the new privilege_log_overrides table keyed by (tenantId, sourceKind, sourceId, documentId). Subsequent builds for the same source apply the override on top of the Haiku-classified row, so re-builds preserve operator corrections without re-classifying. Saved overrides skip the Haiku call entirely on re-build (cost-saver). "edited" badge surfaces in the Bates column when a row has any saved override; "clear override" button reverts the row to the latest Haiku output.
- Shipped
Migration 0042 — privilege_log_overrides table
Stores per-row overrides keyed by (tenantId, sourceKind, sourceId, documentId) with a unique index. sourceKind is plain text per the post-D112 / D115 standard. ON DELETE CASCADE on tenantId + documentId so a deleted document drags its overrides with it; saved searches and collections that disappear leave their overrides as orphans (the next build won't pick them up since the source resolves to nothing).
v0.1.84Filter expressions on event triggers — fire only when severity is high, when sensitivity is restricted, when the agent acted- Shipped
Optional filter list on event-kinded automation triggers
Closes the first revisit trigger from D112 + D117. Operators can now compile rules like "When an anomaly is detected with severity above medium, ping my Slack webhook" — Claude picks the event trigger AND attaches a filter ([{ path: "payload.severity", op: "in", value: ["high","critical"] }]). The dispatcher narrows by eventType first, loads the matched event row once, then applies the filter conditions in-memory before invoking the runner. Automations without a filter pass unconditionally — preserves existing behavior.
- Shipped
9 ops, dotted-path access, ANDed conditions
Recognized ops: eq, neq, in, nin, gt, gte, lt, lte, contains. Path access: payload.<field> for jsonb payload fields, plus top-level columns (eventType / actorKind / actorId / streamId / streamVersion / tenantId). Conditions are ANDed only — OR is achievable via the in / nin ops, which covers the common "severity is one of high or critical" pattern without growing into a full expression language. Number-string coercion lets "5" match a numeric 5 so operators don't have to remember the payload's precise field type.
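A filter evaluator along these lines might look like the sketch below. It is not the dispatcher's actual code: the helper names are hypothetical, the exact semantics of contains (substring for strings, membership for arrays) are an assumption, and number-string coercion is applied only when the two sides differ in type.

```typescript
type Op = "eq" | "neq" | "in" | "nin" | "gt" | "gte" | "lt" | "lte" | "contains";
interface Condition { path: string; op: Op; value: unknown }

// Dotted-path access, e.g. "payload.severity" or top-level "actorKind".
function getPath(row: unknown, path: string): unknown {
  let cur: unknown = row;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

function toNumber(v: unknown): number | undefined {
  if (typeof v === "number") return Number.isNaN(v) ? undefined : v;
  if (typeof v === "string" && v.trim() !== "") {
    const n = Number(v);
    return Number.isNaN(n) ? undefined : n;
  }
  return undefined;
}

// Strict equality, plus "5" == 5 when one side is a string and the other a number.
function looseEq(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a === typeof b) return false;
  const na = toNumber(a);
  const nb = toNumber(b);
  return na !== undefined && nb !== undefined && na === nb;
}

function matches(row: unknown, c: Condition): boolean {
  const actual = getPath(row, c.path);
  switch (c.op) {
    case "eq": return looseEq(actual, c.value);
    case "neq": return !looseEq(actual, c.value);
    case "in": return Array.isArray(c.value) && c.value.some((v) => looseEq(actual, v));
    case "nin": return Array.isArray(c.value) && !c.value.some((v) => looseEq(actual, v));
    case "gt": case "gte": case "lt": case "lte": {
      const a = toNumber(actual);
      const b = toNumber(c.value);
      if (a === undefined || b === undefined) return false;
      return c.op === "gt" ? a > b : c.op === "gte" ? a >= b : c.op === "lt" ? a < b : a <= b;
    }
    case "contains":
      if (typeof actual === "string") return typeof c.value === "string" && actual.includes(c.value);
      if (Array.isArray(actual)) return actual.some((v) => looseEq(v, c.value));
      return false;
  }
}

// Conditions are ANDed; a missing or empty filter passes unconditionally.
function passesFilter(row: unknown, filter?: Condition[]): boolean {
  return (filter ?? []).every((c) => matches(row, c));
}
```

With this shape, "severity is high or critical" is a single `in` condition rather than an OR expression.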
- Shipped
NL compiler taught the filter syntax + canonical translations
System prompt now includes the filter shape, recognized paths, ops, and several "translate this English to a filter" examples (severity-above-medium → in: high/critical; sensitivity-restricted+ → in: restricted/regulated; agent-only → actorKind eq agent). The compile-preview pane renders the filter inline ("on anomaly.detected where payload.severity in [high, critical]") so operators see exactly what they're saving before clicking Save.
v0.1.83AEC project drill-in /projects/<ref> + ?projectRef= filters on the three trackers- Shipped
New /projects/<ref> drill-in page — one screen per active job
Click any row on /projects to land on a per-project view: header stats (open RFIs / open submittals / pending COs / executed CO impact + pipeline + schedule days), then sectioned by Open RFIs, Under-review submittals, Pending change orders (with PCO-overdue badging), Recently executed COs, Rejected COs, and a unified recent-activity timeline mixing all three artifact types. Three "Filtered tracker →" buttons jump to the matching tracker pages with the project pre-filtered.
- Shipped
?projectRef= query-string filter on /rfi-tracker, /submittal-tracker, /change-order-tracker
Each of the three trackers now accepts ?projectRef=<ref> and case-insensitive trim-matches the SQL where clause. An ochre filter pill renders inline ("Filtered to project: 24-1234 · clear filter") so operators always know they're in a scoped view. Without the param, behavior is identical to before.
- Shipped
Closes the v2 follow-on noted in D118
D118 explicitly called out per-project drill-in as the next iteration. This shipment closes that gap. The drill-in page is built as a server component with three independent SQL reads (one per tracker) using inline lower(btrim(...)) match expressions — no helper plumbing, no N+1 query, no migration.
v0.1.82Privilege log builder at /privilege-log — FRCP-26 log in seconds, not paralegal hours- Shipped
New /privilege-log surface for legal discovery production
Operator picks a collection (matter, custom folder, anything pinned with the docs being withheld), Kodori generates an FRCP-26-compliant privilege log: Bates / Date / Author / Recipients / Doc Type / Privilege Basis / Description. The classifier picks from a constrained enum (Attorney-Client, Attorney Work Product, Common Interest, Joint Defense, Settlement Negotiations, Not Privileged, Unable to Determine) and writes a 1-2 sentence FRCP-26-style description that names subject + parties + date without reproducing privileged content. Replaces the standard paralegal workflow ("read each doc, classify, write a description") that runs ~8-12 hours of senior paralegal time on a 200-doc matter.
- Shipped
Auto-sequenced Bates numbers, configurable prefix
Bates numbers auto-sequence from a configurable prefix (default "KOD") and start number, padded to 6 digits — KOD000001, KOD000002, etc. Operators on existing Bates schemes can edit the prefix in the form before building. Future revisit: per-tenant default prefix + cross-build collision detection.
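The numbering format can be expressed in a couple of lines. The function name and parameter order are hypothetical; the "KOD" default and 6-digit padding come from the release note.

```typescript
// Format a Bates number as prefix + zero-padded sequence number.
function formatBates(n: number, prefix: string = "KOD", width: number = 6): string {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("Bates sequence number must be a non-negative integer");
  }
  // padStart never truncates, so numbers wider than `width` pass through intact.
  return `${prefix}${String(n).padStart(width, "0")}`;
}
```

Calling `formatBates(1)` yields `KOD000001`, and `formatBates(42, "ACME")` yields `ACME000042`.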
- Shipped
Permission-trimmed before classification
Documents the requester can't read drop out before any Haiku call. The skipped count surfaces above the table so operators know whether their build covered the full collection or stopped at their access boundary. Deny-wins exactly as everywhere else in the product — a privileged doc the senior partner can read won't appear in a junior associate's privilege log.
- Shipped
Markdown export ready for paste into Word, ECF, or production cover letter
Download .md button emits a GitHub-flavored Markdown table that pastes cleanly into Word (which converts to a real table), into ECF filings, into Slack threads, or into a production cover letter. PDF export with firm letterhead is v2; Markdown is the universally portable starting point.
v0.1.81AEC /projects dashboard — every active job's health on one screen, sister to /spec-sections- Shipped
New /projects heatmap pivoted by project ref
Sister surface to /spec-sections, but pivoted by project reference instead of CSI section. Per row: open / total RFIs (status + dueAt-aware), under-review / total submittals (requiredAt-aware), pending / executed / rejected COs, executed cost impact (signed dollars, red for additive, emerald for credits), pipeline cost impact, executed + pipeline schedule days. Top-line stats: project count, total open RFIs, total open submittals, total pending COs, project-wide executed cost impact. Sort: hottest project first by total open work units across the three artifact types — a project executive sees what to focus on without drilling into individual trackers.
- Shipped
Schedule-day aggregates per project — executed vs pipeline
New right-most column shows the cumulative schedule impact (signed days) of executed COs vs. pipeline (PCO + pending) COs. Real construction projects accumulate weeks of additive schedule from change orders; aggregating per project answers the "is this job behind because of executed change orders OR pending unresolved scope?" question without manual spreadsheet math.
- Shipped
No project schema, no migration — pure read-side join
projectRef is already extracted into each tracker projection (rfis, submittals, change_orders) by the existing Haiku extractors. /projects is a pure aggregation over those three tables; no project lifecycle table, no migration, no risk of the view going stale relative to the source trackers. The audit log is still source-of-truth and the page is always live.
v0.1.80Variable substitution in automation actions — ${event.payload.documentId} resolves at fire time- Shipped
${event.payload.<field>} placeholders inside mcp-tool-call args, webhook URL/message, agent-query prompt
The missing piece that makes event-triggered automations actually act on the document that triggered them. "When a doc is filed, set its sensitivity to confidential" now compiles to mcp-tool-call setDocumentSensitivity with args { documentId: "${event.payload.documentId}", sensitivityLabel: "confidential" }; at fire time the resolver substitutes the real document id from the matched event payload before the runner Zod-parses + invokes the tool. Recognized placeholders: ${event.payload.<field>}, ${event.eventType}, ${event.eventId}, ${event.actorId}, ${event.firedAt}, ${automation.id}, ${automation.name}, ${trigger.source}, ${trigger.firedAt}.
- Shipped
Whole-string placeholders preserve type fidelity
A whole-string match like "${event.payload.amountCents}" resolves to the underlying value (number, boolean, object) — not a stringified version. Tools that need a number argument get a number; mid-string placeholders ("Hello ${event.actorId}") still concatenate as strings. Type fidelity matters for tool args that go through Zod schemas expecting non-string types.
- Shipped
Compile-time tolerant validation
Zod compile-time validation now substitutes placeholder strings with a UUID sentinel before parsing, so the dominant pattern of "${event.payload.documentId}" doesn't false-fail on z.string().uuid() at compile time. Runtime re-validates with the resolved values; missing-path placeholders render as `[unresolved: path]` which Zod cleanly rejects with a path-specific error.
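The substitution rules in this release (whole-string placeholders keep their underlying type, mid-string placeholders concatenate, missing paths render as `[unresolved: path]`) can be sketched as below. The function names and the shape of the context object are assumptions made for illustration; the real resolver may differ.

```typescript
type Context = Record<string, unknown>;

const WHOLE_PLACEHOLDER = /^\$\{([^}]+)\}$/;
const ANY_PLACEHOLDER = /\$\{([^}]+)\}/g;

function lookup(ctx: unknown, path: string): unknown {
  let cur: unknown = ctx;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

function resolveString(s: string, ctx: Context): unknown {
  const whole = WHOLE_PLACEHOLDER.exec(s);
  if (whole) {
    // Whole-string match: return the raw value so numbers stay numbers.
    const value = lookup(ctx, whole[1]);
    return value === undefined ? `[unresolved: ${whole[1]}]` : value;
  }
  // Mid-string match: concatenate as text.
  return s.replace(ANY_PLACEHOLDER, (_match, path: string) => {
    const value = lookup(ctx, path);
    return value === undefined ? `[unresolved: ${path}]` : String(value);
  });
}

// Walk an args object (or array) and resolve every string leaf.
function resolveArgs(value: unknown, ctx: Context): unknown {
  if (typeof value === "string") return resolveString(value, ctx);
  if (Array.isArray(value)) return value.map((v) => resolveArgs(v, ctx));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) out[k] = resolveArgs(v, ctx);
    return out;
  }
  return value;
}
```

Because an unresolved placeholder becomes an ordinary string, a schema that expects a UUID or a number rejects it at the exact argument path, which matches the runtime re-validation behavior described above.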
- Shipped
Event dispatcher loads the matched event row for substitution context
The event/appended Inngest event carries only tenantId + eventId + eventType. The dispatcher now also loads the events row for the eventId before invoking the runner so ${event.payload.<field>} can resolve against the actual payload — including actorId, actorKind, streamId, streamVersion, and the full payload object the originating mutation appended.
v0.1.79MCP-tool-call action on /automations — invoke any of the 60+ typed tools as the rule's payload- Shipped
Fourth action kind on /automations: mcp-tool-call
Closes the second v2 commitment from D110 + D112. The most powerful action so far — a rule names any tool from the @kumokodo/mcp catalog plus a JSON args object, and the runner invokes it permission-trimmed as the rule's creator. Combined with event-based triggers, this is "Zapier inside the DMS": "when DLP flags a doc, set its sensitivity to confidential", "when a doc is filed in Matter X collection, apply the matching legal hold", "when an anomaly auto-pauses the agent, create an annotation on the affected doc with the incident ID". The compiler is taught the full tool catalog (each tool's name + 1-line description) and validates the picked tool exists + args parse against the tool's Zod input schema BEFORE save, so misconfigured rules surface in the preview rather than on the next fire.
- Shipped
Direct dispatch — no agent tool-loop round-trip
Once a rule is compiled, the action knows exactly which tool + args to invoke. The runner calls `tool.handler(args, ctx)` directly with the creator's actor context — skipping the agent's tool-loop saves a Claude call per fire. Tool errors surface verbatim in the lastFireResult; tool results are JSON-stringified into the result summary (truncated at 480 chars) for visibility on the automation row.
- Shipped
Compile-time tool + args validation
The compile flow runs three checks before letting an mcp-tool-call rule save: (1) toolName is in the catalog, (2) args parse against the tool's Zod input schema, (3) the existing per-action audit. Errors surface in the preview pane with the exact Zod path + message ("documentId: invalid uuid", "sensitivityLabel: must be one of …"). Operator can re-describe the rule rather than discovering the failure on the next fire.
v0.1.78Webhook action on /automations — fire a JSON POST when a rule matches (Slack-aware)- Shipped
Third action kind on /automations: webhook
Closes the v2 commitment from D110 + D112. Operators can now compile rules like "When a legal hold is applied, POST to my Slack webhook URL https://hooks.slack.com/... with the message 'new hold filed.'" — Claude picks the webhook action when the description mentions a URL, optionally extracts an inline message, and the runner POSTs a structured JSON payload (source / ruleId / ruleName / triggerSource / message / firedAt) to the configured HTTPS URL when the trigger fires. 5-second timeout, fetch with X-Kodori-Automation + X-Kodori-Trigger-Source headers. Slack-aware: when the URL host is hooks.slack.com the body is auto-rewritten to Slack's {text} shape so a Slack incoming webhook surfaces a readable message rather than raw JSON; any other host POSTs the canonical Kodori shape.
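The Slack-aware body selection can be sketched as a pure function. The field names of the canonical payload follow the release note; the function name, the type names, and the choice to use the rule's message as the Slack text are assumptions.

```typescript
interface AutomationWebhookPayload {
  source: string;
  ruleId: string;
  ruleName: string;
  triggerSource: string;
  message: string;
  firedAt: string; // ISO timestamp
}

type WebhookBody = AutomationWebhookPayload | { text: string };

function buildWebhookBody(targetUrl: string, payload: AutomationWebhookPayload): WebhookBody {
  const url = new URL(targetUrl);
  if (url.protocol !== "https:") {
    throw new Error("automation webhooks require an HTTPS URL");
  }
  // Slack incoming webhooks expect a { text } body; other hosts get the canonical shape.
  if (url.hostname === "hooks.slack.com") {
    return { text: payload.message };
  }
  return payload;
}
```

Matching on the parsed hostname rather than a substring of the URL avoids misclassifying a host such as `hooks.slack.com.example.org` as Slack.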
- Shipped
No signing key on automation webhooks — by design
For HMAC-signed delivery, customers use the existing /webhooks subscriptions surface (broader, fires on every event/appended for the configured eventTypes). Automation webhooks are URL-as-secret targets — Slack incoming webhooks, Discord webhooks, n8n / Make.com / Zapier triggers, customer-side Lambdas. Operators using these point at endpoints they trust, and the URL itself is the secret. Future revisit if a customer needs HMAC on automation-driven webhooks → reference an existing webhook_subscription's signing key by ID.
- Shipped
action_kind column converted from enum to text — additive action kinds without migration drama
Migration 0041 mirrors the trigger_kind switch from D112: Postgres ALTER TYPE ADD VALUE can't run inside a transaction block, so adding new automation action kinds via enum extension is migration-fragile. Switching to plain text + TypeScript narrowing means the next action additions (mcp-tool-call lands next) are pure-TS additions to the union + a new branch in the runner.
v0.1.77AEC /spec-sections directory — joins RFIs, submittals, and change orders by CSI MasterFormat section- Shipped
New /spec-sections heatmap surface
Project-level "where are reviews piling up?" view that joins all three AEC daily-action surfaces by CSI MasterFormat spec section. Per row: open / total RFIs (matched by location text containing the section number); under-review / total submittals; pending / executed / rejected change orders; executed cost impact; pipeline cost impact. Sections sort hottest-first (most open work units across all three artifact types). Overdue badges surface on any section with overdue RFIs, overdue submittals, or PCOs older than 14 days. Top-line stats: number of sections with activity, total open RFIs, total open submittals, total pending COs, project-wide executed cost impact (red for additive, emerald for credits). Closes the §15 future-work commitment from D108 + D111.
- Shipped
Whitespace-tolerant section matching
CSI section numbers appear both spaced ("08 41 13") and unspaced ("084113") in the wild: submittal and change order docs use either form, and an RFI's free-text location can use either. The page normalizes by stripping whitespace before bucketing, so both forms aggregate into one row, with the display form preserved as whichever the source document used.
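The normalization can be sketched as follows. The function names and the "first form seen wins" rule for the display value are assumptions; the whitespace stripping is the behavior described above.

```typescript
// "08 41 13" and "084113" both normalize to "084113".
function normalizeSection(raw: string): string {
  return raw.replace(/\s+/g, "");
}

interface SectionBucket {
  display: string; // first form seen in a source document
  count: number;
}

function bucketBySection(sections: string[]): Map<string, SectionBucket> {
  const buckets = new Map<string, SectionBucket>();
  for (const raw of sections) {
    const key = normalizeSection(raw);
    const existing = buckets.get(key);
    if (existing) existing.count++;
    else buckets.set(key, { display: raw.trim(), count: 1 });
  }
  return buckets;
}
```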
v0.1.76Voice notes on /capture — dictate, Whisper transcribes, the agent files- Shipped
Record voice notes from a phone or laptop, file as a Kodori document
New voice-note recorder on /capture sits alongside the photo capture tile. Tap "Record voice note" — browser shows the mic-permission prompt the first time, MediaRecorder captures audio (audio/webm on Chrome/Firefox/Android, audio/mp4 on iOS), and a live duration counter shows recording progress. Tap Stop, review duration, tap "File + transcribe". The audio uploads through the same presigned-URL + content-addressable-blob pipeline as photo captures and PDF uploads — the audio file IS the document, with full SHA-256 dedup, retention class, and audit-log coverage.
- Shipped
New whisper-transcribe extractor — audio bytes become searchable, classifiable text
Added to the workflow extractor registry between the Office adapters and the cloud-OCR fallbacks. Self-reports unsupported until OPENAI_API_KEY is configured (graceful no-op on dev workspaces without the key). Routes any audio/* MIME type — plus application/ogg — to OpenAI's Whisper API at $0.006/min, returns the transcript as the document's extracted text. The downstream auto-classify pipeline picks up the transcript exactly the same way it picks up extracted PDF text, so a voice note gets a 3-sentence aiSummary, a sensitivity label, a collection suggestion, keyword tags, and a docType automatically — every voice note shows up on /dashboard's recent-docs list with all the same context as a typed memo.
- Shipped
Cost gate extended to cover Whisper transcription
The existing pdf.extract quota bucket (which already covers claude-pdf and illustrator-ai) now also covers whisper-transcribe. A free-tier tenant uploading 50 hours of dictation hits the same monthly cap that 100 PDFs would. The quota name stays "pdf.extract" for historical compatibility — it's the generic "expensive AI extraction" bucket, not just PDFs.
- Shipped
Verbose Whisper response preserved as structured metadata
Extractor requests Whisper's verbose_json response_format so the segments + per-segment timestamps + detected language land in documentContent.structured. v1 surfaces only the transcript text + language; the segment data is preserved for a future "search hits jump to a timecode" UI without needing a re-transcription.
v0.1.75Event-based automation triggers — fire when a matching audit event lands, not just on a cron- Shipped
New "event" trigger kind on /automations
Operators can now compile rules like "when a legal hold is applied, ask the agent to summarize the matter scope and email me + counsel@firm.com" or "when DLP flags a doc, ask the agent what obligations apply and email me a recommended action." Claude is taught the full event-type catalog (document.created, anomaly.detected, legal-hold.applied, ap-invoice.approved, document.dlp-flagged, etc.) and picks the right one from the operator's natural-language description. Subscribed via a new automations-event-dispatcher Inngest function listening on the universal event/appended channel — fires within seconds of the event landing on the audit log, not on a 5-minute cron.
- Shipped
Reuses existing event/appended infrastructure end-to-end
The event-store wrapper has been firing `event/appended` Inngest events on every successful append for the webhook fan-out path; the new dispatcher subscribes to the same channel. No new event flow added — the audit log stays source-of-truth, the automation is a courtesy reaction the same way webhook delivery is. Permission-trimming applies identically (action runs as the creator, deny-wins).
- Shipped
trigger_kind column converted from enum to text — additive trigger kinds without ALTER TYPE pain
Postgres ALTER TYPE ADD VALUE can't run inside a transaction block, so adding new automation trigger kinds via enum extension is migration-fragile. Migration 0040 converts the column to plain text and the TypeScript layer narrows it to a literal union ("scheduled" | "event"). Future kinds (webhook-fanout-trigger, mcp-tool-call-trigger, billing-threshold-trigger) are now plain TS additions.
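The text-column-plus-narrowing pattern might look like the sketch below. The constant and function names are hypothetical; the two kinds listed are the ones this release names.

```typescript
// The column stores plain text; the TypeScript layer owns the set of valid kinds.
const TRIGGER_KINDS = ["scheduled", "event"] as const;
type TriggerKind = (typeof TRIGGER_KINDS)[number]; // "scheduled" | "event"

function isTriggerKind(raw: string): raw is TriggerKind {
  return (TRIGGER_KINDS as readonly string[]).indexOf(raw) !== -1;
}

// Narrow a value read from the database, failing loudly on unknown kinds.
function parseTriggerKind(raw: string): TriggerKind {
  if (!isTriggerKind(raw)) {
    throw new Error(`unknown trigger_kind: ${raw}`);
  }
  return raw;
}
```

Adding a new kind in this design means appending one string to `TRIGGER_KINDS` and handling it in the runner, with no database migration.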
v0.1.74AEC change-order tracker — see every PCO and CO with cost + schedule impact in one screen- Shipped
New /change-order-tracker daily-action surface
Third AEC daily-action workflow alongside RFIs and submittals. Auto-classify watches every uploaded doc; when the title or first chunk matches "change order", "PCO", "potential change order", "construction change directive", or "CCD" (and is not a log/register/template), the doc is fan-out routed to a Haiku-backed extractor that parses CO number, subject, project ref, spec section, originator, approver, signed cost impact (additive vs deductive), schedule impact in days, reason category, dates, and signature status. Status auto-derives: PCO until executed, then "executed" once a signed copy hits, "rejected" if the doc says rejected. Page surfaces five status counters, executed-vs-pipeline cost aggregates (red for additive, emerald for credits/savings), schedule-day aggregates, top reason categories, and per-bucket sections (Pending → PCO → Extraction failed → Recently executed → Rejected) with overdue-signature highlighting on PCOs older than 14 days. No "change order log" spreadsheet to maintain — every PCO and CO in the workspace surfaces here automatically.
- Shipped
Signed-cost handling: additive and deductive change orders are first-class
Real construction change orders include both additive ("add $42K for revised structural steel") and deductive ("credit $8K for unused contingency") line items. Extractor parses the signed amount and the projection table stores cents as a signed bigint. UI uses +/− prefixes plus tone (red = additive cost, emerald = savings/credit) so a project manager glancing at the page sees net pipeline impact without doing the math.
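As an illustration of the signed-amount display described above, the sketch below formats signed cents with a +/− prefix and a tone. The function names and the "neutral" tone for zero are assumptions, and the example uses `number` on the assumption that values fit in a JavaScript safe integer (the table itself stores a signed bigint).

```typescript
type Tone = "red" | "emerald" | "neutral";

interface SignedAmount {
  text: string;
  tone: Tone;
}

// Hypothetical helper: positive cents are additive cost, negative are credits.
function formatSignedCents(cents: number): SignedAmount {
  const abs = Math.abs(cents);
  const dollars = Math.floor(abs / 100);
  const remainder = String(abs % 100).padStart(2, "0");
  const body = `$${dollars}.${remainder}`;
  if (cents > 0) return { text: `+${body}`, tone: "red" };
  // U+2212 is the minus sign used in the UI description.
  if (cents < 0) return { text: `\u2212${body}`, tone: "emerald" };
  return { text: body, tone: "neutral" }; // zero tone is an assumption
}

// Net pipeline impact is the plain sum of the signed amounts.
function netPipelineCents(items: number[]): number {
  return items.reduce((sum, cents) => sum + cents, 0);
}
```

With the two examples from the entry, a $42K addition and an $8K credit net to +$34K.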
- Shipped
Reason-category aggregation for the pipeline
Top reason categories ("owner request", "design change", "field condition", "RFI response", "code requirement", "weather impact", "schedule extension", "value engineering", "scope clarification", "errors and omissions", or "other") roll up across pending + PCO change orders. Project teams see at a glance whether the pipeline is dominated by owner scope changes vs. design errors — direct input to claim-prep and root-cause conversations.
v0.1.73Smart agent automations — programmable agent in plain English at /automations- Shipped
Type a rule in plain English; Claude compiles it; cron fires it
New /automations surface (owner / admin only) lets operators type natural-language rules ("every Monday at 8am, run my 'unfilled retention' saved search and email me the new hits"). Claude Opus compiles the rule into a typed trigger + action config; the operator confirms the compiled output before saving. An Inngest cron tick runs every 5 minutes, selects every enabled automation whose schedule says it should fire, runs the action, persists the result. v1 trigger types: scheduled only; v1 action types: email-saved-search and email-agent-query.
- Shipped
email-agent-query — programmable agent assistant
The wow-factor action. Operator types "daily at 9am, ask the agent which AP invoices have price variance and email me the answer" — every morning at 9am UTC the action fires, runs the question through the hybrid search + Claude Opus, and emails the response. Programmable agent without writing code; no incumbent has the architecture (typed-tool agent + scheduled cron + email infra) to ship this.
- Shipped
email-saved-search — closes the §15.2 deferred saved-search digest item
Operator picks an existing saved search ("unfilled retention", "AP variance", "open RFIs on Project Alpha") and a schedule. Action runs the saved search via the existing runSavedSearch MCP tool, formats hits as markdown, emails the digest via Resend. Closes the long-deferred Phase-2 saved-search-email-digest commitment with a more general primitive.
- Shipped
Run-now button on every automation row
Trust-builder for the schedule. Operators want to test a rule before trusting it on a recurring schedule; the Run now button executes the same code path the cron tick uses, surfaces the result inline ("✓ Sent 1-recipient digest with 7 hit(s) (manual)"). No waiting for the next 5-minute tick to verify a config change.
- Shipped
Permission-trimmed: automations run as the creator
Each automation's tool calls run with the creator's actorId, so per-document permissions apply as if the creator were running the search themselves. A creator who lost access to a doc after creating an automation will stop seeing it in their digests — that's the right behavior. Deny-wins is preserved through the entire scheduled flow.
- Shipped
agent_automations table + scheduled tick (migration 0038)
New schema with discriminated-union trigger config (every-N-minutes vs daily-UTC with optional weekday) + discriminated-union action config (email-saved-search vs email-agent-query). Idempotent firing via lastFiredAt + per-schedule "should fire now?" pure function. Fire counts + failure counts + lastFireResult JSON persisted on every run for the audit + UI display.
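The "should fire now?" check can be sketched as a pure function over a discriminated-union trigger. The field names (`kind`, `minutes`, `hour`, `minute`, `weekday`) and the exact firing rules are assumptions for illustration; only the two trigger shapes and the use of lastFiredAt for idempotency come from the entry.

```typescript
type TriggerConfig =
  | { kind: "every-n-minutes"; minutes: number }
  // weekday follows Date.getUTCDay(): 0 = Sunday ... 6 = Saturday
  | { kind: "daily-utc"; hour: number; minute: number; weekday?: number };

// Pure: depends only on its arguments, so it is easy to test and replay.
function shouldFireNow(
  trigger: TriggerConfig,
  lastFiredAt: Date | null,
  now: Date
): boolean {
  if (trigger.kind === "every-n-minutes") {
    if (lastFiredAt === null) return true;
    return now.getTime() - lastFiredAt.getTime() >= trigger.minutes * 60000;
  }
  // daily-utc: optionally restricted to one weekday
  if (trigger.weekday !== undefined && now.getUTCDay() !== trigger.weekday) {
    return false;
  }
  // Today's scheduled instant, in UTC milliseconds.
  const due = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
    trigger.hour,
    trigger.minute
  );
  if (now.getTime() < due) return false;
  // Idempotent: a tick that already fired for today's slot does not fire again.
  return lastFiredAt === null || lastFiredAt.getTime() < due;
}
```

Because the function only compares lastFiredAt with today's slot, a 5-minute tick can call it repeatedly without double-firing.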
v0.1.72AI document summaries — Haiku-generated 3-sentence summary on every ingest- Shipped
3-sentence summary on /dashboard recent-docs cards
Auto-classify pipeline grew an aiSummary field alongside the existing sensitivity / collection / keywords / docType outputs. Haiku writes a 3-sentence summary in the same call (no extra round-trip): first sentence = what it is + parties / matter / project; second sentence = the substantive content (the deal, question, dispute, figure); third sentence = the operative date / deadline / next action. Surfaces below the document name on every dashboard recent-docs row, so operators see "what is this?" at a glance without opening the file.
- Shipped
AI summary card on /doc/[id] header
A dedicated ochre-tinted callout above the legal-hold + preview blocks renders the summary verbatim. Visible to every reader of the document. The full-page preview still loads underneath; the summary is the orientation, the preview is the substance.
- Shipped
documentContent.aiSummary column (migration 0037)
New text column on document_content storing the summary capped at 600 chars. Non-versioned (replaced on re-classify); model-derived (returns null for sparse text). The summary lives on document_content rather than as a metadata-suggestion row because it's descriptive context, not a mutation that changes how the doc is filed.
- Shipped
Cost-flat — same Haiku call, one extra output field
No extra LLM round-trip. The existing classify call now returns sensitivity + collection + keywords + docType + aiSummary in one structured output. Cost stays roughly flat (output tokens grow by ~150 per document), while the summary is visible on every document operators open day to day.
v0.1.71AEC submittal tracker — sister to /rfi-tracker for product / material approvals- Shipped
New submittals projection table + extract-submittal Inngest function
When the auto-classifier flags a doc as submittal (the existing doc-type-hint pattern, also covers shop drawings + product data sheets + material samples + mock-ups + test reports), a new Inngest function pulls structured fields via Haiku — submittal number, subject, CSI MasterFormat spec section, project ref, submitting / reviewing parties, type, dates, full disposition string, and a derived status bucket (under-review / approved / rejected). Persisted to a new `submittals` table; migration 0036_aec_submittals.sql.
- Shipped
Disposition wording preserved verbatim
Architects + engineers use rich disposition vocabulary ("Approved as noted — rev. mullion finish to RAL 9006", "Revise and resubmit", "No exception taken"). The verbatim disposition string lives on the row alongside the derived status bucket — firms don't lose their canonical wording. Status bucket drives the queue; disposition shows on every row in the approved + rejected sections.
- Shipped
New /submittal-tracker page mirroring /rfi-tracker
Five status counters (Under review / Overdue / Due in 7d / Approved / Rejected), top-5-spec-sections-by-pending-count breakdown, and per-row detail (submittal #, subject, spec section, type, parties, required-by date). Overdue submittals highlight red. CSI MasterFormat spec section is the natural AEC anchor for "where are reviews piling up?"
- Shipped
AEC vertical pair complete
RFIs (yesterday) + Submittals (today) cover the two highest-volume AEC documents. Same proven pattern: doc-type-hint detection → Inngest extractor → projection table → daily-action surface with overdue highlighting. Change orders are next; the infrastructure is now reusable.
v0.1.70AEC RFI tracker — Phase 4 vertical work pulled forward at /rfi-tracker- Shipped
New rfis projection table + extract-rfi Inngest function
When the auto-classifier flags a doc as RFI / Request For Information (the existing doc-type-hint pattern), a new Inngest function pulls structured RFI fields via Haiku — RFI number, subject, project ref, requested by / requested of, location (spec section / drawing reference), issued + due dates. Same shape as the AP-invoice + AP-receipt extractors. Persisted to a new `rfis` table; migration 0035_aec_rfis.sql.
- Shipped
New /rfi-tracker page with morning-stand-up view
AEC vertical's daily-action surface mirroring /ap-review. Five status counters (Open / Overdue / Due in 7d / Answered / Rejected), top-5 projects by open count, and per-row detail (RFI #, subject, project, requested by → requested of, location, due date). Open RFIs past their due date highlight in red — the "what's blocking the project right now" view a superintendent looks at every morning.
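One way the five counters could be derived from a row's status and due date is sketched below. The row shape, the bucket names, and the choice to count overdue and due-soon RFIs separately from the plain "open" bucket are assumptions, not the tracker's confirmed logic.

```typescript
type RfiStatus = "open" | "answered" | "rejected";
type Bucket = "overdue" | "due-in-7d" | "open" | "answered" | "rejected";

interface RfiRow {
  status: RfiStatus;
  dueAt: Date | null; // null when the extractor found no due date
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assumption: only open RFIs can be overdue or due soon.
function bucketFor(row: RfiRow, now: Date): Bucket {
  if (row.status !== "open") return row.status;
  if (row.dueAt === null) return "open";
  const delta = row.dueAt.getTime() - now.getTime();
  if (delta < 0) return "overdue";
  if (delta <= 7 * DAY_MS) return "due-in-7d";
  return "open";
}

function countBuckets(rows: RfiRow[], now: Date): Record<Bucket, number> {
  const counts: Record<Bucket, number> = {
    overdue: 0,
    "due-in-7d": 0,
    open: 0,
    answered: 0,
    rejected: 0,
  };
  for (const row of rows) counts[bucketFor(row, now)] += 1;
  return counts;
}
```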
- Shipped
Phase 4 AEC vertical work pulled forward
Phase 4 (AEC module) was scheduled for months 12-18; this lands the most-asked AEC tracking artifact at month 0. RFIs are the highest-volume document on a typical commercial construction project (200-2000 per build); having a dedicated tracker is the first ask from any AEC pilot prospect. Sidebar + mobile nav surface "RFIs" between AP review and Compliance.
v0.1.69Access explorer — current-state view of who can see what- Shipped
New /access page (owner / admin only)
The missing third leg of the access-control surface. /members shows roles + invites (workspace onboarding). /audit shows mutation events (time-series). /access is the *current-state* view of who has what — queryable two ways: (1) by principal — pick a workspace member, see every grant attached to them; (2) by resource — paste a document or collection id, see every grant scoped to it.
- Shipped
Powers two auditor questions
"What does this user have access to?" — for offboarding, least-privilege review, post-incident audit. "Who can see this matter / this document?" — for partner review of who's been granted on a privileged matter, pre-deposition exhibit access, quarterly attestation. Both questions answered in one click without writing a SQL query against the permissions table.
- Shipped
Member picker shows grant counts at a glance
Side panel lists every workspace member with their role + grant count ("alice@firm.com — admin · 12 grants"). Click to drill into the per-principal view. The resource search box tries an exact match first, then falls back to a substring match against the resource pattern. Results are capped at 200 rows.
- Shipped
Owner / admin nav entry
Sidebar + mobile nav surface "Access" between Members and the existing API keys / Webhooks / Audit log entries. Owner / admin only — non-admin members never see the entry. The /access page itself rejects non-admin requests with the standard PermissionDenied component.
v0.1.68Inline Office preview — Word + Excel render inline on /doc/[id]- Shipped
Word documents render inline with mammoth
New /api/doc/[id]/render endpoint converts .docx bytes to formatted HTML via mammoth and serves it as a sandboxed iframe document. /doc/[id] now shows the contract, memo, or pleading inline with paragraphs / lists / tables / formatting preserved — no more "download to view" for the most-common Office format. Same permission gate as /preview + /download. Brand-aligned Cormorant Garamond + Plex Sans styling so the iframe blends with the page.
- Shipped
Excel workbooks render inline with SheetJS
Same endpoint also handles .xlsx, .xls, .ods — first 10 sheets get rendered as HTML tables (one per sheet, with the sheet name as a heading). Operators can scan a pricing model or matter-tracking workbook on /doc/[id] without opening Excel. Cell values + basic table layout preserved; formulas stripped (the displayed value is what shows).
- Shipped
Tight CSP sandbox
The rendered HTML loads in an iframe with sandbox="allow-same-origin" plus a strict CSP (default-src none, no script, no external resources, no fonts, only inline styles + data: images). A malicious .docx can't exfiltrate or execute anything inside the iframe.
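The policy described above could be expressed as the following response headers. This is a sketch: the constant and function names are hypothetical, and the exact directive list the render endpoint sends is an assumption based on the description (no scripts, no fonts, no external resources, inline styles and data: images only).

```typescript
// default-src 'none' blocks everything not explicitly re-enabled,
// including scripts, fonts, frames, and network requests.
const RENDER_CSP = [
  "default-src 'none'",
  "style-src 'unsafe-inline'", // inline styles only
  "img-src data:",             // embedded images only
].join("; ");

// Hypothetical helper returning headers for the rendered HTML document.
function renderHeaders(): Record<string, string> {
  return {
    "Content-Type": "text/html; charset=utf-8",
    "Content-Security-Policy": RENDER_CSP,
  };
}
```

Separately from the CSP, an iframe whose sandbox attribute omits allow-scripts does not run scripts at all, so the two controls overlap by design.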
- Shipped
25 MB cap, lazy-loaded libraries
Render endpoint caps documents at 25 MB to keep server-side parse time bounded. mammoth + xlsx are lazy-imported so they stay out of the Edge runtime + the dashboard bundle entirely; loaded only when a render request actually arrives.
v0.1.67Word-style redline compare — inline insertions + deletions for legal review- Shipped
Word-level inline redline view on /doc/[id]/compare
New default view on the compare page renders insertions in green and deletions in red strikethrough — the "Track Changes" reading legal + AEC reviewers want for catching adds + removes in proximity. Uses diff's diffWordsWithSpace so the rendered output preserves the original document's word boundaries while highlighting changes inline. Char-level + / − counters at the top show the magnitude of the diff at a glance.
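A sketch of turning word-level diff parts into inline redline markup with character counters. The `WordChange` interface mirrors the part shape (`value`, `added`, `removed`) that the diff package's word-diff functions return; the rendering function, the `<ins>`/`<del>` tags, and the counter names are assumptions for illustration.

```typescript
// Mirrors the { value, added?, removed? } parts produced by a word diff.
interface WordChange {
  value: string;
  added?: boolean;
  removed?: boolean;
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

interface Redline {
  html: string;
  addedChars: number;
  removedChars: number;
}

// Insertions become <ins>, deletions become <del>; unchanged text passes through.
function renderRedline(changes: WordChange[]): Redline {
  let html = "";
  let addedChars = 0;
  let removedChars = 0;
  for (const change of changes) {
    const safe = escapeHtml(change.value);
    if (change.added) {
      addedChars += change.value.length;
      html += `<ins>${safe}</ins>`;
    } else if (change.removed) {
      removedChars += change.value.length;
      html += `<del>${safe}</del>`;
    } else {
      html += safe;
    }
  }
  return { html, addedChars, removedChars };
}
```

Styling `ins` green and `del` red with strikethrough gives the Track Changes reading; the counters feed the + / − summary at the top.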
- Shipped
Mode switch — Redline vs Lines
Two-state pill above the diff lets reviewers switch between the new Word-style redline (the legal-review default) and the previous unified line-level view (still useful for code, structured data, and machine-parsed CSVs). Mode rides in the URL query string so a specific mode is shareable. Server-rendered switch — no client state, no flicker on toggle.
- Shipped
Litera-killer view, no third-party dependency
Incumbents (iManage Workspace, NetDocuments) integrate with Litera for redline compare — paid third-party software with per-seat licensing. Kodori's redline ships built-in, no extra contract. Quality matches Litera for prose-document review (contracts, RFIs, memos, pleadings) at the word-level granularity legal-review workflows actually need.
v0.1.66AP line-item match — per-line variance flagging on invoices and receipts- Shipped
ap_invoice_lines + ap_receipt_lines projections
New schema tables capture per-line breakdowns from AP invoices and receipts. Each row carries description, item code, quantity (numeric(18,6) for fractional units), unit price + total in cents. Migration 0034_ap_line_items.sql. The next level of three-way match — beyond document-total reconcile (D93) — for cases where invoice + receipt totals agree but specific lines diverge (a vendor billed twice for one item but waived another).
- Shipped
Extractors pull line items inline
extract-ap-invoice + extract-ap-receipt extended to ask Haiku for line items in the same call (no extra LLM round-trip). Each line: description / itemCode / quantity / unitPrice / total, all nullable for sparse documents. Capped at 50 lines per doc per the prompt; 100 hard cap on the schema. Persisted in a delete-then-insert step so re-extraction always reflects the latest pass.
- Shipped
"View line items" expander on /ap-review
New per-row expander loads invoice + receipt line items lazily via a server action. Renders side-by-side with item-code-first matching (then description-exact, then line-number fallback). Each pair shows ✓ matched (totals within $1), ! variant (totals diverge), or unpaired (one side has the line, the other doesn't). Header summary: "5 matched · 2 variant · 1 unpaired" — operators see the line-level posture before approving the document-level total.
- Shipped
Pairing algorithm: item code → exact description → line number
Two-pass pairing. Pass 1 matches by item code (the strongest signal vendors print). Pass 2 falls back to exact-text description match for any line that didn't match by code. Unpaired lines on both sides surface for operator review. $1 / 100-cent tolerance per line — looser than the document-level $5 / 1% because per-line rounding is more common.
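The two passes described above can be sketched as follows. The `Line` shape, the function names, and the treatment of a missing total as "variant" are assumptions; the line-number fallback mentioned in the heading is not implemented here.

```typescript
interface Line {
  itemCode: string | null;
  description: string | null;
  totalCents: number | null;
}
type PairStatus = "matched" | "variant" | "unpaired";
interface Pair {
  invoice: Line | null;
  receipt: Line | null;
  status: PairStatus;
}

const LINE_TOLERANCE_CENTS = 100; // $1 per line

function statusFor(inv: Line, rec: Line): PairStatus {
  // Assumption: a missing total cannot be confirmed, so it is flagged.
  if (inv.totalCents === null || rec.totalCents === null) return "variant";
  const diff = Math.abs(inv.totalCents - rec.totalCents);
  return diff <= LINE_TOLERANCE_CENTS ? "matched" : "variant";
}

function pairLines(invoice: Line[], receipt: Line[]): Pair[] {
  const pairs: Pair[] = [];
  const usedInvoice = new Set<number>();
  const usedReceipt = new Set<number>();

  const runPass = (keyOf: (line: Line) => string | null): void => {
    invoice.forEach((inv, i) => {
      const key = keyOf(inv);
      if (usedInvoice.has(i) || key === null) return;
      receipt.some((rec, j) => {
        if (usedReceipt.has(j) || keyOf(rec) !== key) return false;
        usedInvoice.add(i);
        usedReceipt.add(j);
        pairs.push({ invoice: inv, receipt: rec, status: statusFor(inv, rec) });
        return true; // stop at the first free receipt line with this key
      });
    });
  };

  runPass((line) => line.itemCode);    // pass 1: item code
  runPass((line) => line.description); // pass 2: exact description text

  invoice.forEach((inv, i) => {
    if (!usedInvoice.has(i)) pairs.push({ invoice: inv, receipt: null, status: "unpaired" });
  });
  receipt.forEach((rec, j) => {
    if (!usedReceipt.has(j)) pairs.push({ invoice: null, receipt: rec, status: "unpaired" });
  });
  return pairs;
}
```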
v0.1.65Workspace overview dashboard — five-second executive read at /overview- Shipped
Single-pane executive summary
New /overview page pulls every signal Kodori already tracks into one composite view: document totals + 24h/7d ingest velocity, active legal holds, retention review depth, audit chain tip + last verification, AP queue health (pending / variance / awaiting receipt), agent activity in last 24h, and admin-only tiles for anomalies + cap utilization + cost links. Each tile deep-links into the operational page for that area. Counts respect per-document permissions; you only see what you can read.
- Shipped
Differentiates from /dashboard
The existing /dashboard is the daily-action surface (recent docs to review, ⌘K agent shortcut, classifier suggestions to accept). /overview is the steady-state executive read — what the partner / GC / compliance officer sees when they want a five-second status check. Different audience, different layout. Both surface from the sidebar nav.
- Shipped
Color-coded urgency tones
Tiles tint when relevant signals fire — amber for "active legal holds present" or "AP awaiting receipt", red for "AP price variance" or "anomalies open" (admin), ochre highlight for "AP pending" (the operator's daily action). Steady-state tiles stay neutral. The page reads at a glance — no clinical scanning of bare numbers.
- Shipped
Live data, no batch snapshot
Recomputes every render — no daily aggregate to explain. Concurrent queries via Promise.all keep the page snappy (one round-trip per query in parallel). Per-tile counts are exact for the moment the page loaded. Sidebar nav entry for "Overview" sits between Dashboard and Search on desktop and mobile.
v0.1.64Compliance reporting dashboard — auditor-ready evidence reports at /compliance/reports- Shipped
Five pre-baked reports for the auditor's working papers
New /compliance/reports page (owner / admin only) with five reports your auditor, compliance officer, or security reviewer expects to see: Retention disposal log (every doc tombstoned with reason + actor), Legal hold log (every hold ever opened or released with subjects + dates), Audit-chain verification log (weekly cron + on-demand verifications, the artifact SOC 2 + 21 CFR Part 11 reviewers ask for as proof of "we verify even when no one's watching"), DSAR fulfillment log (per-tenant + per-user data exports), SOC 2 control evidence map (CC1-CC9 + Confidentiality controls mapped to Kodori implementations + evidence pointers).
- Shipped
One-click CSV export per report
Each report ships with a /compliance/reports/[slug]?format=csv route that returns RFC-4180 CSV with Content-Disposition: attachment. Quoted-everywhere format keeps Excel locale-imports unambiguous; capped at 500 rows inline (use /api/tenant/export for larger). Auditors paste straight into their working papers.
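A minimal sketch of the quoted-everywhere CSV format with the 500-row inline cap. The function names, the `Content-Type` value, and the download filename are assumptions; the quoting rules (double every embedded quote, CRLF line breaks) follow RFC 4180.

```typescript
type CsvCell = string | number | null;

const INLINE_ROW_CAP = 500;

// Quote every field and double any embedded quote character.
function quoteField(value: CsvCell): string {
  const text = value === null ? "" : String(value);
  return `"${text.replace(/"/g, '""')}"`;
}

function toCsv(header: string[], rows: CsvCell[][]): string {
  const all: CsvCell[][] = [header, ...rows.slice(0, INLINE_ROW_CAP)];
  // RFC 4180 uses CRLF between records.
  return all.map((row) => row.map(quoteField).join(",")).join("\r\n") + "\r\n";
}

// Hypothetical header builder; the filename pattern is an assumption.
function csvHeaders(slug: string): Record<string, string> {
  return {
    "Content-Type": "text/csv; charset=utf-8",
    "Content-Disposition": `attachment; filename="${slug}.csv"`,
  };
}
```

Quoting every field means commas, quotes, and line breaks inside a value never change the column count on import.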
- Shipped
Live data, no batch snapshot
Reports query the live audit log + projection tables at request time — no daily batch, no cached aggregate to explain to an auditor. The numbers in the report match what's in the system this second; the CSV preserves the same ordering for traceability. Differentiates Kodori vs incumbents (iManage, NetDocuments, FileHold) where auditors hand-build SQL queries against backup snapshots.
- Shipped
Cross-linked from /compliance + /audit
/compliance (the live operational dashboard) now carries a callout pointing at /compliance/reports. Each row in the audit-verification + retention-disposal reports deep-links into /audit?stream=… so the auditor can drill from the summary into the raw event log in one click.
v0.1.63Excel + PowerPoint task panes — Office add-in set complete (Outlook + Word + Excel + PowerPoint)- Shipped
Kodori task pane for Microsoft Excel
New manifest at /office/excel/manifest.xml installs a "Save to Kodori" button on Excel's Home ribbon. Same Save flow as Word: New document or New version of existing, with the workbook bytes captured via Office.js getFileAsync('compressed') and POSTed through the existing bulk-ingest + versioning APIs. Excel-specific MIME (`application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`) and source tag (`excel-add-in`).
- Shipped
Kodori task pane for Microsoft PowerPoint
New manifest at /office/powerpoint/manifest.xml installs the same Save button on PowerPoint's Home ribbon for deck save-back. Pricing decks, pitch decks, and matter-status presentations follow the same New-doc / New-version pattern Word uses.
- Shipped
Shared OfficeSaveTaskpaneClient component
Refactored Word's 800-line task-pane client into a reusable component at apps/web/components/office-save-taskpane.tsx parameterized by `host: "word" | "excel" | "powerpoint"`. Per-host config map sets MIME type, brand label, source tag, and example placeholders; the rest of the UX (read-bytes-via-slices, save-via-bulk-ingest-or-versioning, search-via-public-API, status state machine) is identical across all three hosts. One file to fix when an Office.js bug surfaces, three pages of thin wrappers that pass the right host prop. Same Kodori API key in roamingSettings works for all three (and Outlook).
- Shipped
/office landing surfaces all four manifest URLs
Public install page now shows Outlook + Word + Excel + PowerPoint manifest URLs side by side. Each auto-points at the deployment's own origin so self-hosted Kodori works without a build step. Hero updated to lead with the four-add-ins narrative.
v0.1.62AI document generation from templates — draft a new contract from your firm's NDA in 30 seconds- Shipped
Generate from template — Claude Opus drafts new docs based on existing ones
Any Kodori document with extracted text can be used as a structural template for new drafts. Click "Use as template →" on /doc/[id], type instructions ("Draft an NDA between Acme and BigCo, mutual, 1-year, NY law"), and Kodori asks Claude Opus to generate a new document that follows the template's structure but adapts to the new context. Result lands as a new Kodori record (markdown — opens cleanly in any editor, pastes into Word with formatting preserved). Bracketed placeholders ([DATE], [JURISDICTION]) flag where the human reviewer needs to fill in unspecified values.
- Shipped
Wow-factor differentiator no incumbent has
iManage, NetDocuments, FileHold all treat documents as files — open in Word, edit, save back. Kodori treats documents as data the agent can compose new work from. The same template that generates 50 NDAs a year for a litigation team is now reusable in seconds rather than minutes per doc. Generated drafts carry `metadata.generatedFromTemplate` linking back to the source so the audit chain captures the lineage.
- Shipped
Quota-aware: counts against agent questions, falls back on Opus cap
Generation runs through the existing agent provider — counts as one agent question for billing, uses Opus by default for drafting quality, silently falls back to Haiku when the workspace is over the Opus reasoning cap. Same enforcement pattern as the chat panel.
v0.1.61Word add-in — save drafts as new docs or new versions, from inside Word- Shipped
Kodori task pane for Microsoft Word
New Word add-in shipped at /office/word. The manifest installs a "Save to Kodori" button on Word's Home ribbon — open any document, click the button, and the task pane opens with two save modes: New document (the active doc lands as a brand-new Kodori record) or New version (search for an existing Kodori doc, pick it, save the open .docx as the next version). Same Office.js + roamingSettings + bearer-token API key flow as the Outlook add-in; one Kodori API key signs into both.
- Shipped
POST /api/v1/documents/{id}/versions
New REST endpoint accepts raw bytes + X-Kodori-Version-Label / X-Kodori-Version-Significant headers and appends a version to the existing document. Same gates as the desktop /doc/[id] re-upload (documents:write scope, tenant + read permission, check-out lock honored, duplicate hash returns unchanged=true). The endpoint composes with the SDK and the public MCP server too — every external integration that wants to "save a new version" of a Kodori doc lands here.
- Shipped
Document picker via existing /api/v1/search
When the user picks "New version" mode, the task pane reuses the public hybrid-search endpoint to find candidate documents — typing "Smith NDA" runs the same search the dashboard does and lists matches with mime-type chips. No new endpoint needed; one less surface to maintain.
- Shipped
Slice-by-slice document read via Office.js
Word's getFileAsync streams the active document as a compressed (OOXML) file in 4MB slices. The task pane collects them, reassembles in the browser, and POSTs the bytes. Typical 100KB-2MB legal drafts complete in a single round-trip; larger docs (50MB+ pleadings with embedded exhibits) hit the 50MB API cap. Read progress surfaces as "Reading… (slice 3/12)" so the operator sees what's happening on big docs.
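The reassembly step can be sketched without any Office.js calls, assuming each slice has already been converted to a `Uint8Array`. The helper names are hypothetical; the 50MB cap and the progress wording come from the entry.

```typescript
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024; // 50MB API cap

// Concatenate slices, in order, into one contiguous buffer.
function concatSlices(slices: Uint8Array[]): Uint8Array {
  const total = slices.reduce((sum, slice) => sum + slice.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const slice of slices) {
    out.set(slice, offset);
    offset += slice.byteLength;
  }
  return out;
}

// Progress label shown while slices are still being read.
function progressLabel(index: number, count: number): string {
  return `Reading… (slice ${index}/${count})`;
}

// Check the size before POSTing so oversized docs fail early in the pane.
function exceedsCap(bytes: Uint8Array): boolean {
  return bytes.byteLength > MAX_UPLOAD_BYTES;
}
```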
- Shipped
/office landing page now covers both add-ins
Public install page surfaces both Outlook and Word manifest URLs side by side — both auto-point at the deployment's own origin so self-hosted Kodori at dms.acme-corp.com gets a manifest pointing at their domain without a build step. Hero copy updated to lead with the two-add-ins narrative.
v0.1.60Inline PDF viewer with find-in-document — same UX across every browser- Shipped
PDF.js-powered inline viewer on /doc/[id]
PDFs now render through pdfjs-dist instead of the browser-native iframe — consistent rendering across Chrome / Safari / Firefox / Edge with Kodori-branded toolbar chrome. Page navigation (prev / next / type a number), zoom controls (+ / − / fit-width), and a real find bar replace the variable-quality browser viewer. Lazy-loaded — pdfjs is dynamic-imported the first time a PDF loads, so the dashboard bundle stays slim for users who never open a PDF.
- Shipped
Find-in-document search across every page
Cmd-F / Ctrl-F or the magnifier button opens a search bar that walks every page's extracted text content, surfaces hit count + per-hit page snippets ("p.7 …signed by both parties on the…"), and jumps to the matching page on click. 200-hit cap keeps the UI responsive on long documents; refine the query when you hit it. Daily-use UX gap from the iframe approach is closed.
- Shipped
Selection + copy still works
pdfjs renders text on a transparent overlay above the canvas, so selecting + copying text from the rendered PDF behaves exactly like a native viewer. The extracted text the search bar uses is the same text the user can copy.
v0.1.59Migration commit — connectors now move bytes, not just metadata- Shipped
commitConnectorJobAction with batch-by-batch ingest
Discovered scopes can now be committed — pull bytes from the source DMS, register each document via the existing CAS pipeline, fire extraction. New server action runs synchronously, capped at 100 docs per batch to fit the Vercel timeout budget; click Commit again on partial jobs to process the remainder. Per-doc cap enforcement (documents + storage quotas) runs inside the loop so a tenant near plan caps fails cleanly before reading bytes.
- Shipped
Credentials never persisted in Kodori
Connector credentials (iManage OAuth secrets, S3 access keys, NetDocs refresh tokens) are re-supplied at commit time and held in-memory only. Once each batch finishes the credentials object is garbage-collected — Kodori's DB never sees them. The cost is "operator types creds twice" (discovery + commit); the gain is one fewer attack surface for the source-of-truth keys-to-the-kingdom secrets.
- Shipped
Per-document migration metadata
Every committed document carries the full provenance — `migrationJobId`, `migrationSource`, `migrationPath` (source DMS folder hierarchy), `migrationCreatedAt`, `migrationCreatedBy`, plus the connector's discovered metadata block (iManage custom1..custom30, S3 sidecar JSON, etc.) merged in. Search "everything we ingested via the iManage migration" works as a hybrid-search query against this metadata.
- Shipped
Per-job commit progress + resumable batches
Job payload tracks per-document status (pending / done / failed) keyed by stable index. Re-entrant commit calls walk the pending set without re-attempting docs that already succeeded. Failed docs accumulate in the job's failureLog (capped at 100) with the error reason, surfaced on the recent-jobs list as `N failed`. Job status flows discovery-complete → committing → commit-partial → commit-complete (or stays at commit-partial if any docs failed).
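A sketch of the resumable batch loop, assuming a synchronous `ingest` callback that throws on failure. The type and function names are hypothetical, and the choice not to retry failed documents on later calls is an assumption; the 100-doc batch size, the failureLog cap, and the status names come from the entries above.

```typescript
type DocStatus = "pending" | "done" | "failed";
type JobStatus = "commit-partial" | "commit-complete";

interface JobState {
  statuses: DocStatus[]; // keyed by stable document index
  failureLog: { index: number; reason: string }[];
}

const BATCH_SIZE = 100;
const FAILURE_LOG_CAP = 100;

function pendingIndexes(job: JobState): number[] {
  const out: number[] = [];
  job.statuses.forEach((status, index) => {
    if (status === "pending") out.push(index);
  });
  return out;
}

// Process one batch; call again to continue with the remaining pending docs.
function commitBatch(job: JobState, ingest: (index: number) => void): JobStatus {
  for (const index of pendingIndexes(job).slice(0, BATCH_SIZE)) {
    try {
      ingest(index);
      job.statuses[index] = "done";
    } catch (err) {
      job.statuses[index] = "failed"; // assumption: not retried automatically
      if (job.failureLog.length < FAILURE_LOG_CAP) {
        const reason = err instanceof Error ? err.message : String(err);
        job.failureLog.push({ index, reason });
      }
    }
  }
  return job.statuses.every((status) => status === "done")
    ? "commit-complete"
    : "commit-partial";
}
```

Because each call starts from the pending set, re-running after a timeout never re-ingests a document that already succeeded.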
- Shipped
"Commit" button on the connectors page
Each discovery-complete or commit-partial job in the recent-jobs list now carries a Commit button showing pending count. Click opens an inline credentials form (JSON shape — same as discovery), runs the batch, reports per-batch + cumulative results inline. Kodori's dashboard refreshes after each batch so the row reflects the new applied/failed counts.
v0.1.58Outlook task pane v1.1 — thread tracker, per-filing sensitivity, collection picker- Shipped
Per-conversation thread tracker
When you open a message in a thread you've already filed from, the task pane shows "3 already filed from this thread" with subjects + deep-links to /doc/[id] for the most recent three. Persists in the browser's localStorage keyed on Outlook's stable conversationId, so the count survives Outlook restarts. 100-thread × 30-day rolling window keeps storage bounded; older entries prune automatically. Closes the "did I already file this email?" question that pilot users hit on long-running matters.
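The rolling window can be sketched as a pure pruning function applied before the tracker is written back to localStorage. The entry shape and function names are assumptions; the 100-thread and 30-day limits and the "most recent three" display come from the entry.

```typescript
interface FiledDoc {
  docId: string;
  subject: string;
}

interface ThreadEntry {
  conversationId: string; // Outlook's stable conversation id
  lastFiledAt: number;    // epoch milliseconds
  filed: FiledDoc[];      // oldest first (assumption)
}

const MAX_THREADS = 100;
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000;

// Drop entries older than 30 days, then keep the 100 most recent threads.
function pruneThreads(entries: ThreadEntry[], now: number): ThreadEntry[] {
  return entries
    .filter((entry) => now - entry.lastFiledAt <= MAX_AGE_MS)
    .sort((a, b) => b.lastFiledAt - a.lastFiledAt)
    .slice(0, MAX_THREADS);
}

// The pane lists the most recent three filings for the open thread.
function recentFilings(entry: ThreadEntry): FiledDoc[] {
  return entry.filed.slice(-3).reverse();
}
```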
- Shipped
Per-filing sensitivity override
New "Sensitivity (this filing)" dropdown in the task pane that overrides the workspace default for one message only. Resets to the default each time you open a new message, so setting "regulated" once for one privileged email doesn't stick on every subsequent file. The DLP scanner still runs on ingest and can auto-escalate further; we never silently lower the tier on filed records.
- Shipped
Collection picker in the task pane
Added a Collection dropdown that lets you assign filed messages directly to a Kodori Collection (matter / project / cabinet / etc.) at filing time. Fetched live via /api/v1/collections using your bearer token, so the picker reflects every collection your role can see. Sets X-Kodori-Collection-Id on the bulk-ingest API call, so the body + every attachment join the collection in one shot — replaces the post-filing "open dashboard, drag into collection" step. Empty-list state links to /collections so a paralegal who hasn't built the matter yet sees the path.
v0.1.57Migration connectors — pull from iManage / S3 / NetDocs / FileHold into Kodori- Shipped
Migration-connector abstraction in @kumokodo/migration
New workspace package defining the MigrationConnector interface — probe / discover / download — plus a registry. Each connector self-describes status (ready / beta / planned), credentials schema (Zod), and availability. The UI renders the picker straight from the registry; planned connectors lock the credentials shape now so secret storage is forward-compatible. Connector source ids match the values long pre-allocated on `migration_jobs.source` (imanage, netdocuments, filehold, s3-bucket).
- Shipped
S3-bucket connector — universal incumbent-DMS migration path
Walks any AWS S3 / Cloudflare R2 / MinIO / Backblaze (S3-compatible) bucket. Subdirectory paths become Kodori Collection paths; sidecar `<filename>.kodori.json` files attach per-document metadata (matter id, custom fields) without inventing a manifest format. The 60% case where a customer's IT team can run a bulk export but doesn't want to wire a new API integration. Ships ready (`status: 'ready'`).
- Shipped
iManage connector (beta)
Real REST integration against iManage Work Cloud at `https://<customerId>.imanagework.com/api/v2/`. OAuth2 client-credentials auth (with optional user impersonation), paginated document discovery, custom-field metadata pass-through (custom1..custom30), download via the document content endpoint. The code is complete but will not be field-tested until the first customer migration project, so it is listed as `status: 'beta'`. Operators with iManage developer credentials can run it today.
- Shipped
NetDocuments + FileHold skeletons (planned)
Both connectors have locked credentials schemas + the runtime path that throws "not-implemented" with a clear roadmap message. NetDocuments lands Q3 2026 (when the first NetDocs prospect signs); FileHold lands Q4 2026 (most FileHold customers can use the S3-bucket connector with their admin export today). The UI surfaces both with a "Coming Q3 / Q4 2026" banner instead of letting an operator fail mid-migration.
- Shipped
/migrate/connectors UI with probe-then-discover
Owner / admin page listing every registered connector. Per-connector card with credentials form, optional scope (workspace id / folder path / max documents), and a Probe / Start Discovery button pair. Probe validates credentials read-only; discovery walks the source up to 1000 docs synchronously and persists results to `migration_jobs`. Recent jobs list at the bottom of the page shows every connector + CSV job side by side. Linked from /migrate as the entry point for "migrating from an incumbent DMS"; the existing CSV path remains the post-ingest metadata-mapping flow.
v0.1.56AP three-way match — receipts close the loop on PO ↔ receipt ↔ invoice- Shipped
Receipts as a first-class projection
New `ap_receipts` table with the same shape as `ap_invoices` (vendor, PO ref, total, currency, received-at). When the auto-classifier flags a doc as receipt / packing slip / goods-received note / delivery note, a new Inngest function extracts the structured fields via Haiku and writes the projection. Migration `0033_ap_three_way_match.sql`.
- Shipped
Three-way match in the AP-invoice pipeline
Extracting an invoice now reconciles against any receipt for the same PO number, persisting `matched_receipt_document_id`, `match_status` (matched / price-variance / no-receipt / no-po / pending), and signed `variance_cents` on the invoice row. Tolerance is the greater of $5 absolute or 1% of the invoice total: anything inside that lands matched, anything outside flags variance. The match status is recomputed every time either side updates.
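The tolerance rule can be sketched in a few lines. This is a minimal illustration, not the pipeline's code: the `MatchInput` shape and `threeWayMatch` name are hypothetical, the sign convention (invoice minus receipt) is an assumption, and a variance exactly at the tolerance is treated as matched. The conditions under which `pending` is assigned are not described, so the sketch never returns it.

```typescript
type MatchStatus = "matched" | "price-variance" | "no-receipt" | "no-po" | "pending";

// Hypothetical input shape for illustration only.
interface MatchInput {
  poNumber: string | null;
  invoiceTotalCents: number;
  receiptTotalCents: number | null; // null when no receipt exists for the PO
}

interface MatchResult {
  matchStatus: MatchStatus;
  varianceCents: number | null;
}

const ABSOLUTE_TOLERANCE_CENTS = 500; // $5
const RELATIVE_TOLERANCE = 0.01;      // 1% of the invoice total

function threeWayMatch(input: MatchInput): MatchResult {
  if (input.poNumber === null) {
    return { matchStatus: "no-po", varianceCents: null };
  }
  if (input.receiptTotalCents === null) {
    return { matchStatus: "no-receipt", varianceCents: null };
  }
  // Assumed sign convention: positive means the invoice exceeds the receipt.
  const varianceCents = input.invoiceTotalCents - input.receiptTotalCents;
  // Whichever tolerance is larger wins.
  const toleranceCents = Math.max(
    ABSOLUTE_TOLERANCE_CENTS,
    Math.abs(input.invoiceTotalCents) * RELATIVE_TOLERANCE,
  );
  const matchStatus: MatchStatus =
    Math.abs(varianceCents) <= toleranceCents ? "matched" : "price-variance";
  return { matchStatus, varianceCents };
}
```

On a $1,000 invoice the 1% rule dominates (tolerance $10), while on a $100 invoice the $5 floor dominates, so an $8 gap matches in the first case and a $6 gap flags variance in the second.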
- Shipped
Late-arriving receipts retroactively reconcile
Receipts often land after the invoice (the AP clerk files the invoice immediately, the receiver scans the packing slip later). The receipt extractor now sweeps every existing invoice for the same PO and updates their match status in place — no operator "kick" required, no stale `no-receipt` rows. Capped at 50 invoices per receipt to keep the Inngest step bounded.
- Shipped
Three-way match status in the /ap-review UI
Each invoice row now renders a match-status badge (3-way matched / variance / awaiting receipt / 2-way only) alongside the approval-status badge, plus a per-row line showing the linked receipt doc when matched, the variance amount when it disagrees, and "no receipt yet" when the PO matches but no receipt exists. Header stats grew Variance and Awaiting-receipt counters with red / amber tinting when non-zero — an AP clerk sees "3 invoices have variance" before scrolling.
- Shipped
Closes the Phase-1 AP exit-criterion
The Phase-1 scope listed AP-invoice end-to-end as the first complete workflow, with three-way match noted as roadmap. With receipts as a first-class projection plus reconcile-on-extract, the AP workflow is now feature-complete for the document-level match case (line-item match remains future work for customers needing per-line variance tracking).
v0.1.55Outlook add-in foundation — file email + attachments from the inbox ribbon- Shipped
Kodori task pane for Microsoft Outlook
New Outlook add-in shipped at /office. The manifest XML installs a "File to Kodori" button in the message-read ribbon — open any email, click the button, and the task pane loads inside Outlook. Paste a Kodori API key once (stored in Outlook's roamingSettings, which sync across the same mailbox on every device) and the message body plus selected attachments file as Kodori documents through the existing POST /api/v1/documents endpoint. Each document carries the email's subject, sender, recipients, and received timestamp in metadata, then runs through the same DLP-scan + auto-classify pipeline as desktop uploads. Replaces the legacy "drag-attachment-to-DMS" workflow that incumbents (iManage, NetDocuments, FileHold) all do via per-vendor desktop integrations and clunky Outlook ribbons.
- Shipped
Self-hosting-aware manifest XML
The manifest is generated dynamically per-request from the deployment's own origin — a customer self-hosting Kodori at dms.acme-corp.com gets a manifest pointing at their domain without a build-time configuration step. Manifest Id is a stable v5-style UUID derived from the origin host so Outlook treats it as the same add-in across reloads. Cache headers keep manifest changes propagating within an hour. Sideload by URL or by downloading the file — same outcome.
- Shipped
Per-attachment status + cloud-link refusal
Task pane renders a checkbox per attachment with size and type. File-type attachments (the downloadable kind) ship as separate Kodori documents; cloud-link attachments (OneDrive / SharePoint shared links) are flagged unfilable and skipped with a clear status pill — Outlook's API can't pull their bytes. Filing runs serially with per-tile progress (queued / uploading / done / failed / skipped) and a deep-link to /doc/[id] on every successful row.
- Shipped
Install + sideload landing at /office
Public landing page with the manifest URL, four-step install instructions for both the web and desktop Outlook flows, "how filing works" section, and an FAQ covering admin centralization, mobile support, conversation threading, and the 50 MB cap. Linked from /api-keys so anyone minting a documents:write key sees the add-in install path immediately.
v0.1.54Mobile-first capture — photograph a document, file it, walk away- Shipped
Capture surface at /capture
New mobile-first page that opens the phone's rear camera directly via the HTML5 capture="environment" attribute — no app install, no driver, no scan-to-email round-trip. Take one shot or many; each tile uploads independently to R2 the moment it lands, then registers the document and kicks extraction. By the time the user finishes capturing, the earlier shots are already filed. Replaces the legacy DMS workflow of "scan to a network printer, email yourself, find the file later, drag it into iManage." Targets the courthouse paralegal, the AEC superintendent at a job site, and the partner reviewing exhibits in a deposition room.
- Shipped
Installable PWA with a Capture shortcut
apps/web/app/manifest.ts wires the Web App Manifest so iOS Safari and Android Chrome offer "Add to Home Screen." Once installed, Kodori launches as a standalone window — no address bar, full-height capture viewport. The manifest declares /capture as a long-press shortcut, so power users (paralegals filing dozens of shots a day) deep-link straight into capture mode without touching the dashboard. Theme color and background match the brand ochre + paper.
- Shipped
Viewport + theme-color metadata wired
Root layout exports a Next 15 viewport object with brand-matched themeColor for both light and dark schemes. Android's status-bar tint and iOS's installed-tile chrome now read as part of the Kodori surface instead of foreign white/system chrome. user-scalable stays on — capture preview thumbnails benefit from pinch-zoom and WCAG 1.4.4 requires it.
- Improved
Sidebar + mobile nav surface Capture next to Upload
New Capture entry in the (app) sidebar — desktop sidebar and the horizontal mobile rail — sits between Upload and Legal holds. Two ingest paths surfaced as peers.
v0.1.53Reversibility hardening — every state-changing mutation now revertable from /audit- Shipped
Revert flow extended from 4 event types to 12
Every state-changing mutation on the audit log now has a one-click revert button: document tombstoning, sensitivity changes, rename, metadata patches, retention class assignments, check-out, legal-hold per-subject membership, annotations, retention class archive — alongside the original collection-membership and permission grants. Each routes to its inverse MCP tool using the original event's payload to recover the previous-state values.
- Shipped
unarchiveRetentionClass MCP tool
New first-class tool that clears archivedAt on a retention class and emits retention.class-unarchived. Used by the /audit revert flow when the original event was retention.class-archived; also callable directly by the agent ("undo the archive of AP-7 we just did").
- Improved
Half-revertable events refuse with operator guidance, never mis-apply
document.metadata-set with field=dlp-finding-decision or field=aec returns a clear error pointing the operator at the right surface (DLP findings panel; setAecMetadata directly), instead of a best-effort guess that could mis-revert the previous-state blob. Same pattern for hold-creation events (refuses with "use /legal-holds release flow with a reason") and retention deferrals (refuses with "defer again on /retention/review with the prior date").
- Improved
Information events stay intentionally non-revertable
document.read, document.dlp-flagged, anomaly.detected, audit.verification.completed, tenant.* — these are signals or external state, not state changes the workspace can undo. The revert button doesn't appear on these rows; putting one there would mislead.
v0.1.52Google Document AI extractor — Phase-1 IDP fallback shipped- Shipped
Google Document AI extractor in the IDP cascade
New extractor at packages/workflow/src/extractors/google-docai.ts slots between the Office adapters and the Claude vision fallback in the registry. When configured (GOOGLE_DOCAI_PROJECT_ID + GOOGLE_DOCAI_PROCESSOR_ID + credentials), PDFs and images route through Google Document OCR for higher-quality output than Claude vision on dense scans + handwriting, at lower per-page cost. Self-reports supports=false until env is wired, so a partial deployment is a no-op. Lazy SDK import means deployments that never extract a PDF don't pay the gRPC cold-start cost. 20 MB sync cap; larger files surface as unsupported until batchProcess + GCS staging lands. Closes the last documented Phase-1 IDP gap.
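One way to picture the cascade is an ordered registry where each extractor reports whether it can take a file. The sketch below is an assumption-laden illustration: the `Extractor` interface, `pickExtractor`, and the `hasCredentials` flag are hypothetical names, the 20 MB cap is taken as 20 MiB, and over-cap files simply fall through to the next entry in this sketch.

```typescript
// Hypothetical interface: the real registry's types are not shown in this changelog.
interface Extractor {
  id: string;
  supports(mimeType: string, sizeBytes: number): boolean;
}

const DOCAI_SYNC_CAP_BYTES = 20 * 1024 * 1024; // assumed interpretation of "20 MB"

function googleDocAiExtractor(
  env: Record<string, string | undefined>,
  hasCredentials: boolean, // stand-in for however credentials are detected
): Extractor {
  const configured =
    Boolean(env["GOOGLE_DOCAI_PROJECT_ID"]) &&
    Boolean(env["GOOGLE_DOCAI_PROCESSOR_ID"]) &&
    hasCredentials;
  return {
    id: "google-docai",
    // Reports false until the env is wired, so a partial deployment is a no-op.
    supports: (mimeType, sizeBytes) =>
      configured &&
      (mimeType === "application/pdf" || mimeType.startsWith("image/")) &&
      sizeBytes <= DOCAI_SYNC_CAP_BYTES,
  };
}

const claudeVisionFallback: Extractor = {
  id: "claude-vision",
  supports: (mimeType) =>
    mimeType === "application/pdf" || mimeType.startsWith("image/"),
};

// The first extractor in registry order that supports the file wins.
function pickExtractor(
  registry: Extractor[],
  mimeType: string,
  sizeBytes: number,
): Extractor | null {
  for (const extractor of registry) {
    if (extractor.supports(mimeType, sizeBytes)) return extractor;
  }
  return null;
}
```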
- Improved
Three-tier IDP cascade now real, not just documented
Phase-1 always specified "Azure DocIntel primary, Google DocAI fallback, OSS fallback." Azure has been a stub since Phase 0 (waiting on resource provisioning); Google DocAI was vapor. Now: Azure stub (flips on when provisioned), Google DocAI (real, ship-ready), Claude vision (LLM-fallback), Office adapters + builtin-text (free OSS paths). Customers can pick the cloud OCR vendor matching their compliance posture: EU customers preferring Google's eu region, US-Federal customers needing Azure Gov, etc.
v0.1.51Official TypeScript SDK + weekly audit-chain verification cron- Shipped
Official TypeScript SDK at @kumokodo/kodori-sdk
Wraps the public REST API with typed methods (kodori.search.run, kodori.documents.list / get / rename / setSensitivity / setMetadata / tombstone / restore, kodori.collections.list / create / addMember / removeMember). Throws a typed KodoriApiError on refusals with structured code + details. Plus MCP helpers under the /mcp subpath: mcpEndpointUrl, bearerAuthHeader, claudeDesktopConfig (returns the JSON snippet for claude_desktop_config.json), cursorMcpConfig. ESM-only, Node 18+ / Bun / Deno / Cloudflare Workers. Pairs with Anthropic's @modelcontextprotocol/sdk for the full MCP-client pattern; we don't ship a parallel MCP client.
- Shipped
Audit-chain verification weekly cron
New Inngest function runs every Sunday 02:00 UTC, walks each tenant's chain via prev_hash links, and emits audit.verification.completed onto a per-tenant audit-verification/<tenantId> stream so the audit log carries a continuous "we proved this" trail. On failure, broadcasts a structured email to every member of the affected tenant with the first-mismatch detail. The on-demand /audit verifier was a strong claim; the cron is the load-bearing artifact for SOC 2 — auditors asking "do you verify even when no one's watching?" get a yes with weekly timestamped events. New event type audit.verification.completed.
- Improved
Verifier logic shared across surfaces
Chain-walking algorithm extracted from apps/web/app/actions/audit-verify.ts to packages/events/src/verify.ts. The on-demand /audit button, the MCP verifyAuditChain tool, and the new weekly cron all run identical code from the shared helper. Single source of truth for the chain-walking algorithm — change which fields the chain hashes in one place.
v0.1.50MCP positioning sweep — homepage, comparison pages, /api-keys, /security/controls- Shipped
Homepage leads with "First DMS with a public MCP server"
Feature 12 on the homepage upgraded to lead with the MCP claim; REST API + signed webhooks now Feature 13. TrustStrip gained "MCP-native" alongside Event-sourced / Permission-trimmed / Reversible / Audit-logged. The biggest unique-in-category claim Kodori makes — surfaced where buyers actually see it.
- Shipped
Comparison pages — new "Public MCP server" row
/compare/imanage, /compare/netdocuments, /compare/filehold each gained a dedicated row showing that incumbents have no MCP surface. ndMAX (NetDocuments) and Insight (iManage) are in-product chat assistants — third-party AI clients can't connect. Kodori lights up Claude Desktop, Cursor, ChatGPT desktop, and Kodokyo with one API key.
- Shipped
/api-keys in-app banner pointing at the MCP endpoint
Anyone reading /api-keys to wire an integration now sees an ochre callout: "Same key, more tools — your API key also authenticates against POST /api/mcp." Drives discovery to /help/mcp-server. Closes the gap where developers landed on /api-keys, minted a REST key, and never knew the same credential opened the MCP catalog.
- Improved
/help/api-keys-and-rest-api references MCP + Slack
The canonical API help article gained a section on the public MCP endpoint pointing at /help/mcp-server, plus a pointer to /help/webhook-format-slack. Summary line + keywords updated for "mcp" / "mcp server" so the help-knowledge agent surfaces the right article on integration questions.
- Improved
OpenAPI 3.1 manifest mentions MCP
The /api/openapi.json description now explicitly tells consumers their bearer key works against /api/mcp too, with a link to /help/mcp-server. Buyers reviewing the API spec in Postman / Stoplight see the cross-reference rather than discovering MCP separately.
- Improved
/security/controls — CC6.6 now references the MCP surface
The "logical access security" control row was already covering API-key scope filtering and TLS; expanded to call out that the public MCP endpoint accepts the same scoped keys, so external AI clients inherit identical authentication and scope-filtering as REST. Important for prospects auditing whether the AI surface is a privilege-escalation path. Source pointer at apps/web/app/api/mcp/route.ts.
v0.1.49Public MCP server, agent governance writes, Slack webhook format, Kodokyo via MCP- Shipped
Public Model Context Protocol server at /api/mcp
JSON-RPC 2.0 over Streamable HTTP per the 2025-11-25 MCP spec. Any conformant client (Claude Desktop, Cursor, ChatGPT desktop, Kodokyo, custom integrations) connects with a Kodori API key and calls the same 60+ tool catalog the internal agent uses. Tools filtered by API key scope (search:read / documents:write / collections:write / documents:delete). Every call audit-logged with actorKind=agent + actorId=user-who-issued-key. Stateless transport for serverless deployment. CLAUDE.md's biggest unfilled Phase-0 promise — closed.
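Scope filtering of the tool catalog might look roughly like the following. The four scope strings come from the entry above; the `ToolDef` shape, the `toolsForKey` helper, the `searchDocuments` tool name, and the mapping of each tool to a scope are assumptions made for illustration.

```typescript
type ApiScope =
  | "search:read"
  | "documents:write"
  | "collections:write"
  | "documents:delete";

// Hypothetical catalog entry: one required scope per tool.
interface ToolDef {
  name: string;
  requiredScope: ApiScope;
}

// Return only the tools that the presented key is allowed to see and call.
function toolsForKey(catalog: ToolDef[], keyScopes: ApiScope[]): ToolDef[] {
  const granted = new Set<ApiScope>(keyScopes);
  return catalog.filter((tool) => granted.has(tool.requiredScope));
}

// Illustrative catalog; the scope assignments here are assumed, not documented.
const sampleCatalog: ToolDef[] = [
  { name: "searchDocuments", requiredScope: "search:read" },
  { name: "setDocumentRetentionClass", requiredScope: "documents:write" },
  { name: "addDocumentToCollection", requiredScope: "collections:write" },
  { name: "tombstoneDocument", requiredScope: "documents:delete" },
];
```

Filtering at listing time means a read-only key never learns that destructive tools exist, which is one reasonable way to keep the AI surface from widening what a key can do.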
- Shipped
Agent governance writes — 5 admin-shaped tools
acknowledgeAnomaly, dismissAnomaly, unpauseAnomalyAgent, decideDlpFinding, archiveRetentionClass. The introspection batch let the agent ANSWER governance questions; this batch lets it ACT. Each requires an explicit user-stated reason / note captured on the audit log — same pattern as destructive document mutations. Owner / admin role gate inline in each tool.
- Shipped
Webhook → Slack Block Kit
Pick "Slack" as the format on a webhook subscription and Kodori renders each event as Block Kit at the destination URL. Per-event humanizing — anomaly.detected becomes "Anomaly detected (high): high-volume-regulated-read"; legal-hold.applied becomes "Legal hold applied: Smith v. Acme — 24-cv-1234". Pair with the event-type filter for an "incidents only" channel or a fuller compliance feed. New format column on webhook_subscriptions (migration 0032).
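To make the per-event humanizing concrete, here is a small sketch that turns an event into a Slack incoming-webhook payload with one Block Kit section. The `WebhookEvent` shape and both function names are hypothetical; only the `text` / `blocks` / `section` / `mrkdwn` structure is standard Block Kit.

```typescript
// Hypothetical event shape for illustration.
interface WebhookEvent {
  type: string;      // e.g. "anomaly.detected"
  detail: string;    // e.g. the signal name or the hold title
  severity?: string; // only some events carry one
}

// "legal-hold.applied" -> "Legal hold applied"
function humanizeEventType(type: string): string {
  const words = type.split(/[.\-]/).join(" ");
  return words.charAt(0).toUpperCase() + words.slice(1);
}

function toSlackPayload(ev: WebhookEvent) {
  const label = humanizeEventType(ev.type);
  const title = ev.severity
    ? `${label} (${ev.severity}): ${ev.detail}`
    : `${label}: ${ev.detail}`;
  return {
    // Plain-text fallback used by notifications.
    text: title,
    blocks: [
      { type: "section", text: { type: "mrkdwn", text: `*${title}*` } },
    ],
  };
}
```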
- Shipped
Kodokyo integration via MCP
Kodokyo (kodokyo.ai, the sister project-OS product) ships an MCP-compatible agent. Drop a Kodori API key into Kodokyo's project secrets and add a workspace-level MCP server pointing at https://kodori.ai/api/mcp; Kodokyo's agent now has access to the entire Kodori tool catalog inside Kodokyo conversations. Project managers ask "find every NDA from this client and add it to the active matter" without leaving Kodokyo. No custom integration code on either side — the MCP protocol itself is the glue. The load-bearing claim of MCP-as-platform-protocol made concrete.
v0.1.4813 new agent tools — answer "where do I stand", "what happened", "any anomalies"- Fixed
Audit-chain verifier walks by prev_hash links, not timestamp
The first verifier release ordered events by createdAt and reported phantom mismatches whenever multiple events were appended in one transaction (PostgreSQL's now() returns the transaction-start timestamp, so events from extraction pipelines / Inngest fan-outs all share one timestamp). Fixed by walking via prev_hash links: build a successor map, find the genesis, walk forward. Order-independent. Reports richer break categories: no-genesis / multiple-genesis / fork / orphan / hash-mismatch.
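The link-walking approach (build a successor map, find the genesis, walk forward) can be sketched as below. This is an illustration under stated assumptions, not the shipped verifier: the `ChainEvent` shape and `walkChain` name are hypothetical, and the `hash-mismatch` category is left out because it requires recomputing each event's hash.

```typescript
// Hypothetical event shape: only the fields the walk needs.
interface ChainEvent {
  id: string;
  hash: string;
  prevHash: string | null; // null marks the genesis event
}

type ChainBreak = "no-genesis" | "multiple-genesis" | "fork" | "orphan";

type WalkResult =
  | { ok: true; order: string[] }
  | { ok: false; reason: ChainBreak; eventId: string | null };

function walkChain(events: ChainEvent[]): WalkResult {
  const [genesis, extraGenesis] = events.filter((e) => e.prevHash === null);
  if (genesis === undefined) return { ok: false, reason: "no-genesis", eventId: null };
  if (extraGenesis !== undefined) {
    return { ok: false, reason: "multiple-genesis", eventId: extraGenesis.id };
  }

  // Successor map: predecessor hash -> the event that points at it.
  const successorOf = new Map<string, ChainEvent>();
  for (const e of events) {
    if (e.prevHash === null) continue;
    if (successorOf.has(e.prevHash)) return { ok: false, reason: "fork", eventId: e.id };
    successorOf.set(e.prevHash, e);
  }

  // Walk forward from the genesis; input order and timestamps are never consulted.
  const order: string[] = [];
  const visited = new Set<string>();
  let current: ChainEvent | undefined = genesis;
  while (current !== undefined && !visited.has(current.id)) {
    visited.add(current.id);
    order.push(current.id);
    current = successorOf.get(current.hash);
  }

  // Anything the walk never reached is not connected to the chain.
  const orphan = events.find((e) => !visited.has(e.id));
  if (orphan !== undefined) return { ok: false, reason: "orphan", eventId: orphan.id };
  return { ok: true, order };
}
```

Because the walk follows hashes only, events that share one transaction timestamp come out in their true append order regardless of how the query returned them.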
- Shipped
tenantUsageSummary agent tool
Answers "how much disk space have my files used", "am I close to my plan caps", "what plan are we on", "how many seats are left". Returns plan / seats / storage / agent-quota with caps + an explicit warnings array for any cap at ≥80%. Numbers match the UI exactly because the provider calls the same getTenantUsage / snapshotCapWarnings helpers the dashboard uses.
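The warnings array can be thought of as a simple threshold filter over each cap. The sketch below assumes a hypothetical `CapUsage` shape and `capWarnings` helper; it is not the `snapshotCapWarnings` implementation, only an illustration of the 80% rule.

```typescript
// Hypothetical shape for one plan cap.
interface CapUsage {
  cap: string;   // e.g. "seats", "storage"
  used: number;
  limit: number; // 0 or less is treated here as "no cap"
}

const WARN_RATIO = 0.8; // warn at 80% of the cap or above

function capWarnings(usages: CapUsage[]): string[] {
  return usages
    .filter((u) => u.limit > 0 && u.used / u.limit >= WARN_RATIO)
    .map((u) => `${u.cap}: ${u.used} of ${u.limit} used`);
}
```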
- Shipped
12 introspection tools added to the agent catalog
listMembers (members + roles + sign-in providers), recentActivity (audit events with actor labels + counts by type), listLegalHolds (active / released / all), listRetentionClasses (definitions + assigned doc counts), listRetentionReviewQueue (docs past their retention awaiting disposal), listDocumentVersions, listDocumentEvents, getDocumentExtraction (status + extractor + page count + error), listAnomalies (signals + open count), listDlpFindings (perm-trimmed; pre-redacted previews only), listSavedSearches + runSavedSearch (per-user saved queries), listApiKeys (no plaintext), listWebhooks (24h delivery health), getTenantSettings (ingest email + plan + KMS posture), verifyAuditChain (the chain verifier as MCP). Closes the gap pilot users hit immediately when the agent had to apologize for missing tools.
- Improved
/members route hint stops apologizing
The agent's /members route hint used to say "the agent does not currently have a listMembers tool, so propose what to do next rather than enumerate." Replaced with a positive cue pointing at listMembers + recentActivity + tenantUsageSummary. Removing apologetic copy is as load-bearing as adding tools — if the prompt still says you don't have X, the agent might believe that even when it does.
v0.1.47Live audit-chain verifier, sub-processors page, per-user DSAR export, one-click unsubscribe- Shipped
Live audit-chain integrity verifier on /audit
New "Verify chain integrity" button (owner / admin only) walks every event in the tenant's chain and re-runs SHA-256 over each predecessor's canonical JSON to confirm prev_hash matches. Pass returns the count walked + latest event timestamp ("verified through 2026-04-26T18:14:02Z"). Fail returns a rich first-mismatch report (event id, type, stream, version, expected vs actual hash) for ops triage. The single biggest claim Kodori makes about tamper-evidence is now demonstrable in one click — a sales asset for SOC 2 / 21 CFR Part 11 / e-discovery prospects.
- Shipped
/security/subprocessors marketing page
Every third-party service that may process Kodori customer data (11 vendors today: Vercel, Neon, Cloudflare, Anthropic, OpenAI, Resend, Stripe, Inngest, WorkOS, Google, Microsoft) with vendor / purpose / what-they-actually-see / region / compliance-reports for each. Bottom section documents the 30-day-written-notice change policy, vendor-selection criteria (SOC 2 Type II minimum), and data-residency choices. Pairs with /security and /security/controls as the three documents a prospect's security review needs.
- Shipped
Per-user DSAR export at /api/me/export
Available to every member regardless of role. Streams a single ZIP with documents you authored that you can still read (permission-trimmed), every audit event you performed, your saved agent conversations, and your profile record. Closes the GDPR loop — Article 20 owner-driven portability already shipped; this covers Article 15 (Right of Access) and user-driven portability. Surfaced as a download button on /settings/account.
- Shipped
One-click onboarding-email unsubscribe (RFC 8058)
Drip emails now carry a real unsubscribe link backed by HMAC-signed tokens, plus List-Unsubscribe + List-Unsubscribe-Post headers so Gmail / Yahoo / Outlook surface a native unsubscribe button. Drip-only scope — invites, billing, and security alerts always send. New users.onboarding_unsubscribed_at column gates the cron so unsubscribed users don't get re-evaluated daily.
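For reference, the two headers that RFC 8058 one-click unsubscribe relies on can be built as follows. The `unsubscribeHeaders` helper and the example URL are hypothetical, and the HMAC token signing is out of scope for this sketch; the token is assumed to already be embedded in the URL.

```typescript
// Builds the header pair mail clients look for to show a native unsubscribe button.
function unsubscribeHeaders(unsubscribeUrl: string): Record<string, string> {
  // RFC 8058 requires an HTTPS URI for the one-click POST target.
  if (!unsubscribeUrl.startsWith("https://")) {
    throw new Error("one-click unsubscribe requires an https URL");
  }
  return {
    "List-Unsubscribe": `<${unsubscribeUrl}>`,
    "List-Unsubscribe-Post": "List-Unsubscribe=One-Click",
  };
}
```

The receiving mail client issues a POST with the body `List-Unsubscribe=One-Click` to that URL, so the endpoint must act on a POST alone, without requiring a follow-up click or sign-in.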
v0.1.46Per-tenant export, SOC 2 controls page, per-conversation prompts, unified search- Shipped
Per-tenant export — owner-only "everything we have on you" ZIP
GET /api/tenant/export streams a single archive with every readable document plus structured exports of collections, retention classes, legal holds, members, audit log, and your own agent conversations. Manifest flags any cap that tripped (1000 docs / 5 GB / 100k audit events). Built for GDPR Article 20 portability requests, pre-migration backup, and SOC 2 evidence handoff. Surfaced as a download button on /settings/account.
- Shipped
SOC 2 Type I controls mapping at /security/controls
Every AICPA Trust Services Criterion (CC1–CC9 plus the Confidentiality additional category) annotated with the concrete Kodori implementation and a pointer to where evidence lives in the running product. 30 controls in total, each tagged Live / Phase-1 / Phase-3. Designed as the first thing a prospect's security review hands their auditor.
- Shipped
Per-conversation system prompt overrides
Pin a custom preamble to one thread — "drafting agent" / "research agent" / firm-style — that travels with every turn. New "Prompt" button in the agent drawer header opens a dialog with a 4000-char textarea and three template chips. Closes the long-deferred "let me give the agent firm-style instructions per project" ask without touching the global system prompt.
- Shipped
Workspace-wide unified search
/search now has a "Search in" pills row toggling Documents / Conversations / Audit. Documents stays the default; toggling Conversations and Audit fans out three searches in parallel and renders three labelled sections. Each non-doc hit deep-links into its surface (drawer history for conversations, /audit for events). Source param preserved in the URL so a specific combo is shareable.
v0.1.45Collection export as ZIP — matter / project handoff bundle- Shipped
"Download as ZIP" on every /collections/[id] page
One-click streaming download of every pinned document in a matter / project / cabinet, plus a manifest.csv + manifest.json. Permission-trimmed via canReadDocument so a viewer's zip is filtered to what they could already open. Caps: 200 documents and 1 GB total bytes per call. Built for the typical mid-pilot lifecycle: ingest one matter, hand off the bundle when it closes. /help/export-collection-as-zip has the full spec.
v0.1.44Conversation export + invite-accept notification + doc sweep- Shipped
Export an agent conversation as Markdown
New "Export" button in the agent drawer header (visible whenever a conversation is loaded) downloads the active thread as a .md file: title, ISO timestamps, per-turn message body, and tool-call summaries. Useful for sharing a thread with a teammate, archiving a research session, or pasting into a doc as a record of decisions reached with the agent.
- Shipped
Invite-accept notification email
When an invitee clicks Accept, the inviter gets an email: "Sam (sam@kumokodo.ai) joined Acme Workspace as contributor" with a one-click link to /members. Closes the "did they accept?" loop without forcing the inviter to refresh. Fire-and-forget from applyInviteAction; a failed notification never fails the invite acceptance itself.
- Shipped
Doc sweep — scope §15, decisions D74–D75, /features, help articles
Caught the canonical docs up to current main. New stack-decisions D74 (member lifecycle: move-to-personal-tenant + JWT-staleness check) and D75 (onboarding emails: Inngest events + edge-safe dispatcher). New /help articles for managing-members and onboarding-emails. /features marketing page expanded with the recent admin features.
v0.1.43Onboarding emails (welcome + drip), /agent redirect, audit row grouping- Shipped
Welcome email on first sign-in
A "Welcome to Kodori" email goes out the moment a brand-new user lands in their workspace, with three concrete things to try (⌘K agent, drag-drop upload, sample data). Dispatched via an Inngest event from the JWT callback so middleware (Edge) stays light; the actual Resend send happens in a serverless function with full library access.
- Shipped
Drip schedule — day 3, 7, 14
Daily Inngest cron at 09:00 UTC sends three follow-up tips spaced across the first two weeks: "Try the agent" (day 3), "Load sample data" (day 7), and "Compliance side of Kodori" (day 14). Per-(user, kind) idempotency via the new onboarding_email_log table — Resend message id captured for ops correlation. Per-run cap of 50 sends per kind so a post-outage backlog doesn't burst.
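The daily evaluation can be pictured as a pure function from sign-up date and send history to the tips that are due. In this sketch the `DripKind` identifiers and the `dueDrips` helper are hypothetical, the send log is modelled as a plain array, and the per-run cap of 50 is left out.

```typescript
// Hypothetical kind identifiers; the real onboarding_email_log values are not shown.
type DripKind = "try-agent" | "sample-data" | "compliance";

const DRIP_SCHEDULE: { kind: DripKind; day: number }[] = [
  { kind: "try-agent", day: 3 },
  { kind: "sample-data", day: 7 },
  { kind: "compliance", day: 14 },
];

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the tips a user should receive on this run.
// alreadySent provides the per-(user, kind) idempotency: a kind is never due twice.
function dueDrips(signedUpAt: Date, now: Date, alreadySent: DripKind[]): DripKind[] {
  const daysSince = Math.floor((now.getTime() - signedUpAt.getTime()) / DAY_MS);
  return DRIP_SCHEDULE
    .filter((step) => daysSince >= step.day && alreadySent.indexOf(step.kind) === -1)
    .map((step) => step.kind);
}
```

Using "at least N days" rather than "exactly N days" means a missed cron run is caught up on the next one, which is why a per-run send cap matters after an outage.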
- Shipped
/agent route → /dashboard?agent=open
The dedicated full-page /agent route was useful while the drawer was still gaining capability; now that the drawer has persistent conversations, citation chips, full-width expand, history search, and the usage strip, /agent is just a less-good copy. Bookmarks redirect to /dashboard with the drawer auto-opened (and the query param cleaned up so a refresh doesn't re-open if the user closed it).
- Shipped
Audit log row grouping
Adjacent same-type-same-actor events within a 60-second window collapse into a single row with a "× N" badge. A bulk action that emits 50 collection.member-added events shows as one expandable row instead of 50 separate ones — much easier to scan. Click to drill in. Group key is (type, actorId, streamPrefix) so cross-document sequences still group cleanly.
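A minimal sketch of the grouping rule follows. The `AuditRow` shape and function names are hypothetical, the stream prefix is assumed to be the part of the stream name before the first `/`, and the 60-second window is measured from the first row of each group so a group never spans more than a minute.

```typescript
// Hypothetical row shape for illustration.
interface AuditRow {
  type: string;
  actorId: string;
  stream: string; // assumed form: "<prefix>/<id>"
  atMs: number;   // event time in epoch milliseconds
}

interface AuditRowGroup {
  key: string;
  rows: AuditRow[];
}

const GROUP_WINDOW_MS = 60000;

// Group key is (type, actorId, streamPrefix).
function groupKey(row: AuditRow): string {
  const slash = row.stream.indexOf("/");
  const streamPrefix = slash === -1 ? row.stream : row.stream.slice(0, slash);
  return [row.type, row.actorId, streamPrefix].join("|");
}

// Collapses adjacent rows with the same key that fall inside the window.
function groupAdjacentRows(rows: AuditRow[]): AuditRowGroup[] {
  const groups: AuditRowGroup[] = [];
  let current: AuditRowGroup | null = null;
  for (const row of rows) {
    const key = groupKey(row);
    const first = current === null ? undefined : current.rows[0];
    if (
      current !== null &&
      first !== undefined &&
      current.key === key &&
      Math.abs(row.atMs - first.atMs) <= GROUP_WINDOW_MS
    ) {
      current.rows.push(row);
    } else {
      current = { key, rows: [row] };
      groups.push(current);
    }
  }
  return groups;
}
```

The "× N" badge is then just `group.rows.length` for any group with more than one row.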
v0.1.42Leave workspace + Transfer ownership- Shipped
Self-initiated leave
Members can leave a workspace from /members ("Leave" link next to your row) or from /settings/account ("Leave this workspace" panel). Same plumbing as admin-initiated remove — moves the user into a fresh personal tenant where they're the owner of their solo workspace, signs them out, sends them through /sign-in?moved=1. Refuses if you're the last owner.
- Shipped
Transfer ownership
Owner-only "Make owner" link on every non-owner row promotes the target to owner without demoting yourself, so you can hand off keys cleanly: promote the new owner, then leave or demote yourself. Pairs with the last-owner refusal — a sole owner can transfer first, then leave.
v0.1.41Member management — change role, remove member, JWT-staleness check- Shipped
Per-member role change + remove on /members
Each member row now has a role dropdown (owner / admin / contributor / viewer / auditor) and a Remove button visible to owners and admins. Refuses to demote or remove the last owner — every workspace must always have at least one. Refuses self-demotion from owner unless another owner exists.
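The last-owner rule reduces to one invariant: after any change, at least one owner must remain. The sketch below uses hypothetical names (`Member`, `checkRoleChange`, `checkRemoval`) and illustrative refusal messages to show that check in isolation.

```typescript
type Role = "owner" | "admin" | "contributor" | "viewer" | "auditor";

// Hypothetical member shape.
interface Member {
  id: string;
  role: Role;
}

type GuardResult = { ok: true } | { ok: false; reason: string };

function ownerCount(members: Member[]): number {
  return members.filter((m) => m.role === "owner").length;
}

function checkRoleChange(members: Member[], targetId: string, newRole: Role): GuardResult {
  const target = members.find((m) => m.id === targetId);
  if (target === undefined) return { ok: false, reason: "unknown member" };
  const demotingOwner = target.role === "owner" && newRole !== "owner";
  if (demotingOwner && ownerCount(members) <= 1) {
    return { ok: false, reason: "cannot demote the last owner" };
  }
  return { ok: true };
}

function checkRemoval(members: Member[], targetId: string): GuardResult {
  const target = members.find((m) => m.id === targetId);
  if (target === undefined) return { ok: false, reason: "unknown member" };
  if (target.role === "owner" && ownerCount(members) <= 1) {
    return { ok: false, reason: "cannot remove the last owner" };
  }
  return { ok: true };
}
```

Self-demotion needs no special case in this sketch: an owner demoting themselves is refused exactly when they are the only owner.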
- Shipped
Member removal moves the user to a personal tenant
Rather than soft-deleting (which would orphan their externalId), a removed member is moved into a fresh personal tenant where they're the owner. They keep their account, can sign in normally, and get a clean solo workspace; they no longer see any of the original tenant's data. Pairs with the JWT-staleness check.
- Shipped
JWT-staleness detection (forces re-sign-in after tenant change)
The (app) layout now compares the session's tenantId against the user's actual tenantId in DB on every authenticated page load. A mismatch (member moved by an admin, invite accepted, etc.) redirects to /sign-in?moved=1 with a friendly banner so the next sign-in carries the correct tenantId in the JWT.
- Shipped
Pending-invites banner on /dashboard
Surfaces invites pending for the signed-in user's email in OTHER tenants — so a user who already signed in normally (without clicking the invite email link) can still discover and accept the invite without digging up the original message.
v0.1.40Email diagnostics, audit log type chips, onboarding tips- Shipped
Self-diagnostic for email config on /members
Admins now see an "Email diagnostics" panel on /members showing API key status, from-address, invite-link base URL, plus a one-click "Send test" button. Routes through the same Resend client as the real invite, so a green response proves the production path. Failure copy distinguishes "no API key" from "domain probably not verified" with a direct link to the Resend domains dashboard. Closes the "invite said sent but never arrived" feedback.
- Improved
Audit log type filter — every event type is filterable
Recent additions (DLP findings, anomaly signals, billing transitions, check-in/out) were emitting events but missing from the /audit filter chips. Added groups for "Check-in / check-out", "DLP", "Anomaly detection", and renamed "Tenancy" → "Tenancy & billing" to cover plan-changed + payment-failed.
- Shipped
"What to try first" tips on the dashboard
Once a workspace has documents (sample-loaded or real), a 4-card panel surfaces the routes a new admin most needs to find — Ask Kodori (⌘K), search, legal-holds + retention, billing. Dismissible per-browser; once dismissed, stays gone.
v0.1.39Conversation search + CSV migration- Shipped
Full-text search across past conversations
New search input above the History rail in the agent drawer. Type any keyword (or quoted phrase, or -exclusion) and Kodori runs Postgres FTS over every message you've authored or received in this workspace, returning ranked matches with a snippet showing the matched terms in context. Empty the input to fall back to the recent-first list. Backed by a GIN expression index on agent_messages.content (migration 0026) so it stays fast as the corpus grows.
- Shipped
CSV bulk metadata import (incumbent-DMS migration tooling)
New /migrate admin page accepts a CSV mapping documents (by filename) to sensitivity, collection, retention class, and arbitrary metadata fields. Two-step preview-then-commit flow: parse + match against existing docs to show per-row outcomes (matched / unmatched / warning), then commit applies each matched row through the same single-doc tools the UI uses. 10k row / 10 MB caps. Built for the typical incumbent-DMS cutover: drag the bytes onto /upload first, export your metadata as CSV, then run this. /help/csv-migration documents the column shape.
v0.1.38Bulk operations on /search results- Shipped
Multi-select + bulk action bar on /search
Each result row gets a checkbox; selecting one or more pops a sticky action bar at the top with selected count, "Add to collection" picker, "Apply retention class" picker (admin-only), and "Trash…" with a required reason. Filing 50 documents into a matter is 50 selections + a click instead of 50 trips to /doc/[id].
- Shipped
Server actions loop through existing single-doc tools
bulkAddToCollectionAction / bulkApplyRetentionAction / bulkTombstoneAction iterate the selected ids and call the existing addDocumentToCollection / setDocumentRetentionClass / tombstoneDocument MCP tools — same permission gates, same audit-log shape, just iterated. Failures on one doc (e.g. legal hold blocking tombstone) don't roll back the others; the result banner reports "30 succeeded, 2 failed" with the per-doc reason. Cap of 100 docs per call to keep audit-log fan-out bounded.
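The iterate-without-rollback pattern can be sketched as follows. This is a simplified, synchronous illustration: `runBulk`, `BulkResult`, and `summarizeBulk` are hypothetical names, and the real server actions are asynchronous and call the MCP tools rather than a plain callback.

```typescript
interface BulkResult {
  succeeded: string[];
  failed: { id: string; reason: string }[];
}

const MAX_BULK_DOCS = 100; // cap per call, to keep audit-log fan-out bounded

// applyOne stands in for a single-doc tool; it throws when that doc is refused.
function runBulk(ids: string[], applyOne: (id: string) => void): BulkResult {
  if (ids.length > MAX_BULK_DOCS) {
    throw new Error(`at most ${MAX_BULK_DOCS} documents per call`);
  }
  const result: BulkResult = { succeeded: [], failed: [] };
  for (const id of ids) {
    try {
      applyOne(id);
      result.succeeded.push(id);
    } catch (err) {
      // One refusal is recorded and the loop continues; nothing is rolled back.
      const reason = err instanceof Error ? err.message : String(err);
      result.failed.push({ id, reason });
    }
  }
  return result;
}

function summarizeBulk(result: BulkResult): string {
  return `${result.succeeded.length} succeeded, ${result.failed.length} failed`;
}
```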
- Fixed
Vercel build error from sync export in a "use server" file
The recent agent-conversations changeset added a sync deriveConversationTitle helper inside a "use server" module — Next.js requires every export to be async (each becomes an RPC endpoint). Moved to apps/web/lib/agent-conversation-title.ts. Local typecheck doesn't catch this, only `next build` does.
v0.1.37Annual prepay — Monthly / Annual toggle on /pricing and /settings/billing- Shipped
Monthly / Annual toggle on the pricing page
The annual price IDs were already wired in env but the UI only ever sent users to monthly checkout. Now /pricing has a Monthly / Annual segmented control above the tier grid; flipping to Annual swaps the displayed prices ($25/$65 instead of $30/$80), updates the cadence label to "billed annually," and encodes the choice into the Subscribe CTA so it carries through sign-in into checkout.
- Shipped
Auto-checkout when arriving from /pricing
A visitor who picks a plan + period on /pricing and clicks Subscribe lands on /settings/billing with the choice in the URL. The page reads ?plan and ?period, fires Stripe Checkout once on mount, and bounces straight to the hosted form — saves a click and avoids the "wait, didn't I already pick this?" friction.
- Improved
/pricing copy updated to reflect load-bearing caps
The "Why per-seat caps?" section was still saying caps were advisory. That copy predates D70 and is now wrong. Updated to "Caps are load-bearing" with the correct fallback semantics (Opus auto-falls-back to Haiku, everything else hard-refuses).
v0.1.36Free is permanent — dropped the artificial 14-day countdown- Improved
"Free trial" → "Free, forever"
The Free tier is genuinely affordable to run forever (cost analysis bounded each user at ~$1/mo), so the 14-day countdown banner was dishonest framing for a tier that doesn't actually expire. Renamed across /pricing, /settings/billing, and the help knowledge base. New Free workspaces no longer get a `trialEndsAt` stamp at signup; the banner only fires for genuine Stripe-managed paid trials (`subscription_status=trialing`). The conversion driver — the 1-seat cap — is unchanged.
- Shipped
PDF extraction cap enforced at the Inngest entry
Closes a small leak in the per-action cap pass: a Free tenant could upload 100 PDFs at once and trigger 100 × Claude vision API calls before any check fired. The extract Inngest handler now calls requireQuota("pdf.extract") before fetch-blob runs, marks over-cap docs as failed with a clear errorMessage, and emits a content-extraction-failed event. The bulk-extract sweep on /dashboard re-runs them once the next billing cycle resets or the tenant upgrades.
v0.1.35Citation chips — agent answers show document names, not UUIDs- Shipped
Document references in agent responses render as named chips
Previously the agent's answers showed raw UUIDs ("I read `34f5a72b-...`") which were clickable but unreadable. Now any /doc/<uuid> link in a streamed response becomes a permission-trimmed chip showing the document's display name ("DOC Smith v Acme NDA"). Cache is module-scoped so 20 chips for the same conversation = 1 server-action call. UUIDs the user can't read fall back to a truncated label without leaking metadata.
v0.1.34Workspace billing-state banner + 14-day trial countdown- Shipped
Top-of-shell banner surfaces past_due, canceled, and trial-ending states
Admins now see a workspace-wide banner above every authenticated page when something needs attention: a red bar for past-due payments ("update payment method") and pending cancellations ("access ends in N days"), an amber bar in the final 3 days of a free trial ("trial ends in N days"). Non-admins don't see the banner — they get the cap-exceeded surfaces inline where they actually try to act.
- Shipped
14-day trial countdown starts on tenant create
New workspaces now get trialEndsAt = createdAt + 14 days, so the trial countdown is meaningful from the very first sign-in instead of waiting on a Stripe trial. The Stripe webhook overwrites the stamp with the longer Stripe-side trial if a paid subscription activates with one configured.
v0.1.33Agent UX polish — persistent conversations, history rail, usage footer- Shipped
Conversations persist across refreshes
Two new tables (agent_conversations + agent_messages) keep every chat between you and the agent. The drawer now resumes the same thread after a reload and the API route saves the user message before streaming, so an interrupted stream still leaves the question on the thread. Persistence is per-user — the agent is a personal assistant, not a team channel.
- Shipped
History rail in the expanded drawer
Open the agent and click Expand: a 240px rail on the left lists your recent conversations newest-first. Click any row to switch threads, hover to surface a delete affordance, click "+ New chat" to start fresh. The narrow side-panel layout stays single-column.
- Shipped
"+ New" button + footer usage strip
+ New in the header starts a fresh conversation without leaving the drawer. A new footer line ("Questions 12 / 600 · Opus 3 / 120 · Team →") shows your monthly cap utilization with the same amber-at-80% / red-at-100% bands as /settings/billing, plus a click-through to manage the plan.
v0.1.32Plan caps are now load-bearing — refuse, fallback, or warn- Shipped
Per-action cap enforcement at four hot paths
Central requireQuota helper gates the agent chat route (refuses with a 402 + upgrade message when over the monthly question cap), the upload register actions (refuses over storage or document-count cap), and the invite create + accept flow (refuses when adding a seat would exceed the cap). The chat panel renders the cap reason as an "Upgrade plan →" link; the upload dropzone surfaces the same message inline. /help/billing-and-plans has the full list.
- Shipped
Opus reasoning auto-falls-back to Haiku at cap
When a Team user hits 40 Opus queries / month but still has Haiku budget, the agent silently downgrades to model: 'fast' instead of refusing. You keep getting answers; the dashboard banner tells you what's happening. The 18× price difference between Opus and Haiku makes graceful-degrade strictly better than block-entirely.
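As a rough sketch of the refuse-or-fallback decision, assuming a per-user usage snapshot is available at request time: the `Usage` and `Decision` shapes and the `"opus"` model label are hypothetical; only `'fast'`, the 402 refusal on the question cap, and the silent downgrade at the Opus cap come from these entries.

```typescript
// Hypothetical usage snapshot; field names are illustrative only.
interface Usage {
  questionsUsed: number;
  questionsCap: number;
  opusUsed: number;
  opusCap: number;
}

type Decision =
  | { kind: "allow"; model: "opus" | "fast"; downgraded: boolean }
  | { kind: "refuse"; status: 402; reason: string };

function decideModel(requested: "opus" | "fast", usage: Usage): Decision {
  // Over the monthly question cap: refuse outright.
  if (usage.questionsUsed >= usage.questionsCap) {
    return { kind: "refuse", status: 402, reason: "Monthly question cap reached" };
  }
  // Opus budget exhausted but questions remain: degrade to the fast model.
  if (requested === "opus" && usage.opusUsed >= usage.opusCap) {
    return { kind: "allow", model: "fast", downgraded: true };
  }
  return { kind: "allow", model: requested, downgraded: false };
}
```

The `downgraded` flag is what a dashboard banner could key off, so the user learns about the fallback without the request failing.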
- Shipped
Cap warning banner on the dashboard at ≥80%
Admins see an amber banner on /dashboard when any cap is at 80–99% ("You're at 87% of your monthly agent questions… upgrade to Business before you hit 100%"), and a red banner when at-cap ("New requests are refused until you upgrade or the next billing cycle resets"). Owner-tier banners link straight to /settings/billing; non-owner admins link to /pricing.
v0.1.31Stripe billing — paid tiers with usage caps- Shipped
Four pricing tiers backed by Stripe
Free trial / Team ($30/seat/mo) / Business ($80/seat/mo) / Enterprise. Caps grounded in actual per-token cost data — Team's 200 questions/seat/mo lands at ~67% gross margin worst-case; Business unlocks unlimited Opus reasoning + BYO-key encryption + SAML for the firms that need it. /pricing rewritten with the real tiers + a "Why per-seat caps?" honesty section explaining the cost math.
- Shipped
Self-serve checkout + Customer Portal
Owner-only "Subscribe" button on /settings/billing mints a Stripe Checkout session; "Manage subscription" launches the Customer Portal where customers handle plan switches, payment methods, invoices, and cancellation. Webhook syncs subscription state back into the tenants table and emits tenant.plan-changed / tenant.payment-failed audit events.
- Shipped
Per-tenant usage strip on /settings/billing
Live usage bars against the active plan's caps — seats, storage, documents, agent questions this month, Opus reasoning this month, PDF extractions. Bars colour amber at 70% and red at 90% so admins see the upgrade signal before it starts hurting. Caps are advisory in this build; per-action enforcement (block when over cap) is the immediate follow-up.
- Roadmap
Per-action cap enforcement
Caps surface today; enforcing them at the agent + extraction call sites lands next, with soft warnings on the dashboard banner and an in-flight upgrade prompt when a request would exceed the active tier's budget.
v0.1.30Collection-level access grants — share a matter in one click- Shipped
Share a whole collection with a teammate, not a single doc at a time
/collections/[id] gets a "Collection access" section (owner / admin only). One grant covers every document pinned to the collection — and any pinned later. A 200-doc matter is one click instead of 200. Three new MCP tools (grantCollectionPermission, revokeCollectionPermission, listCollectionPermissions) so the agent can do the same thing from natural language.
- Shipped
Doc-detail Access panel shows inherited grants
The "Access" section on every document page now lists three categories: implicit (creator + role-based admins), explicit (per-doc grants), and inherited (via any collection the doc is pinned to that the principal has been granted on). Each inherited row links back to the collection so it's clear where the grant lives.
- Improved
Deny-wins now applies at both document and collection levels
A per-doc deny still overrides every allow on that doc, including a collection-level grant. So you can share a 50-doc Matter with paralegal Bob but lock down the one privilege-protected doc — Bob still sees the other 49.
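A minimal sketch of deny-wins evaluation across the two levels, under simplifying assumptions: the `Grant` shape and `canRead` name are hypothetical, and implicit access (creator, role-based admins) is left out.

```typescript
type Effect = "allow" | "deny";

// Hypothetical grant row; the real schema is not shown in this changelog.
interface Grant {
  principalId: string;
  effect: Effect;
}

// Any deny at either level wins. Otherwise at least one allow is required.
function canRead(
  principalId: string,
  docGrants: Grant[],
  collectionGrants: Grant[],
): boolean {
  const relevant = [...docGrants, ...collectionGrants].filter(
    (g) => g.principalId === principalId,
  );
  if (relevant.some((g) => g.effect === "deny")) return false;
  return relevant.some((g) => g.effect === "allow");
}
```

In the paralegal example, Bob's collection-level allow covers every pinned document, and a single per-doc deny on the privileged one removes just that document from his view.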
v0.1.29Invite emails via Resend, friendlier permission gates, in-place agent expand- Shipped
Invites now arrive by email — Resend integration
Creating an invite from /members emails the recipient a branded link automatically; the URL stays visible in the pending list as a manual fallback. Each pending row gets a "Resend email" button next to "Revoke" — re-sending uses the same token, so links you already shared keep working. Graceful degrade: if email infra isn't configured the invite still creates and the manual copy-link workflow continues unchanged. /help/invite-teammates.
- Improved
Friendlier permission gates on admin-only pages
Non-admin users hitting /anomalies, /costs, or /encryption directly now see a "Permission required" page that names their current role and points them at /members to ask an owner / admin to promote — replaces the earlier 404. Belt-and-suspenders: those nav links are also hidden in the sidebar and mobile nav for non-admins, so the surface stays clean either way.
- Improved
Ask Kodori — in-place agent expand
The dashboard "Ask the agent" button is now "Ask Kodori" and opens the side drawer in place via a window event instead of routing to /agent. The drawer header has a new Expand toggle that fills the content area between the workspace sidebar and the right edge — replaces the dedicated full-page /agent UX with an in-context expansion that keeps your nav visible. State persists across navigations.
v0.1.28Multi-user onboarding, cost dashboard, BYO-key foundation, AEC metadata, eval harness- Shipped
Multi-user onboarding (invite acceptance moves user across tenants)
Closes the Phase-1 leftover that made invites useful only inside a single tenant. Globally-unique partial index on users.external_id; upsert finds users by external_id first; invite acceptance now updates tenant_id and bounces through sign-in. /help/multi-user-onboarding.
- Shipped
Cost dashboard with per-tenant spend telemetry
New cost_events table (microcent resolution) + cost-tracker module wraps Anthropic / OpenAI / R2 / PDF-extract spend. Agent chat instrumented at the streamAgent boundary. /costs page (owner / admin only) shows 30-day total, spend by kind, 14-day trend, and top 20 events. /help/cost-dashboard.
- Shipped
Audit log: date range + actor filter + CSV export
/audit gets calendar-picker date filters, actor substring search across user emails / actor IDs / actor kinds, and an Export CSV ↓ link composing the same filters into /api/audit/export. RFC 4180, 50k-row cap with explicit truncation comment. /help/audit-csv-export.
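For readers consuming the export, RFC 4180 quoting can be illustrated with a short sketch. The helper names are hypothetical and this is not the code behind /api/audit/export; it only shows the quoting rule (quote a field that contains a comma, double quote, CR, or LF, and double any embedded quotes).

```typescript
// Quote a field only when it contains a comma, quote, CR, or LF.
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Join fields with commas and records with CRLF, as RFC 4180 specifies.
function toCsv(rows: string[][]): string {
  return rows.map((row) => row.map(escapeCsvField).join(",")).join("\r\n") + "\r\n";
}
```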
- Shipped
AEC metadata schema + setAecMetadata MCP tool
Typed project / phase / discipline / specSection / drawingNumber / revision / sheet / currentSet on every document, stored under metadata.aec. Setting currentSet=true auto-demotes prior currents in a single jsonb_set update — exactly one current revision per drawing per project. /help/aec-metadata.
- Shipped
BYO-key foundation: tenant_kms_keys + KMS provider interface
Schema + abstraction for customer-managed envelope encryption. /encryption admin page registers AWS KMS / Azure Key Vault / GCP KMS keys; rotation is additive (registering a new key auto-retires the prior). Default kodori-managed provider gives every tenant envelope encryption out of the box; cloud KMS SDK integration lights up per-deployment. /help/byo-key-encryption.
- Shipped
Per-tool agent eval harness
New @kumokodo/evals package with fixture-driven tests for the deterministic subsystems (predicate DSL, DLP scanner) plus a runFixture(suite, fixture, ctx) harness for tool-handler tests. 16/16 pure-function fixtures pass; tool-handler fixtures land alongside the test-tenant helper.
- Shipped
Brand mark + brand asset folder + favicon + OG refresh
Strata icon + Kōdori wordmark with ochre macron deployed site-wide. /brand/ folder ships nine SVG variants. Favicon adapts to dark browser tabs. OG image refreshed with Cormorant Garamond fetched from Google Fonts. pnpm logo:export rasterizes to favicon / OG / app-icon / signature PNGs.
- Roadmap
Phase 3 nearing complete
Tamper-evident audit ✓, DLP-on-ingest ✓, Anomaly detection ✓, BYO-key foundation ✓, audit improvements ✓. Remaining Phase-3 items: Office add-in foundations (needs Microsoft App Studio scaffolding outside this repo), policy engine (Cedar / OPA), and SOC 2 Type I audit engagement (process, not code).
v0.1.27Anomaly detection with agent step-up- Shipped
15-minute anomaly sweep cron + 5 detector kinds
Inngest cron scans every tenant's last hour of audit-log activity for high-volume regulated reads, cross-sensitivity bursts, off-hours bursts (vs the actor's own 7-day baseline), held-document read spikes, and agent volume spikes. Signals dedupe within a 24-hour window keyed on (tenantId, actorId, kind). /help/anomaly-detection.
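The 24-hour dedupe keyed on (tenantId, actorId, kind) might look something like this in-memory sketch. `SignalDeduper` and the `Signal` shape are hypothetical, and measuring the window from the last accepted signal is an assumption; the entry does not say how the window is anchored or where the state is stored.

```typescript
// Hypothetical signal shape; only the three key fields come from the entry.
interface Signal {
  tenantId: string;
  actorId: string;
  kind: string;
  detectedAtMs: number;
}

const DEDUPE_WINDOW_MS = 24 * 60 * 60 * 1000;

class SignalDeduper {
  private readonly lastAccepted = new Map<string, number>();

  // Returns true if the signal is new within the window and should be recorded.
  accept(signal: Signal): boolean {
    const key = JSON.stringify([signal.tenantId, signal.actorId, signal.kind]);
    const previous = this.lastAccepted.get(key);
    if (previous !== undefined && signal.detectedAtMs - previous < DEDUPE_WINDOW_MS) {
      return false; // duplicate inside the 24-hour window
    }
    this.lastAccepted.set(key, signal.detectedAtMs);
    return true;
  }
}
```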
- Shipped
High-severity AGENT signals auto-pause via deny rule
A high-severity signal whose actorKind=agent automatically writes a deny rule on /permissions for that agent principal. The principal is paused — every tool call refuses until an owner / admin lifts the rule via the unpause flow on /anomalies (requires a written rationale, captured on the audit log). Users never auto-pause; the false-positive risk demands human judgment.
- Shipped
/anomalies queue + /compliance banner
Owner / admin only. Three sections: auto-paused (active deny rules pinned to the top), open (awaiting decision), decided (the last 50 acknowledged or dismissed). Each row carries the evidence the detector captured plus acknowledge / dismiss / unpause actions with audit events. /compliance gets a red banner whenever an agent is auto-paused, an amber banner whenever any open signal exists.
- Shipped
Manual "Scan now" button
Triggers the same detector pass the cron runs, scoped to the caller's tenant. Useful for triage — investigate an actor on /audit, run the sweep manually, see counts update without waiting 15 minutes.
- Shipped
Five new audit event types
anomaly.detected (per signal at first detection), anomaly.acknowledged, anomaly.dismissed, anomaly.auto-paused, anomaly.unpaused. All on a dedicated stream so the trail of "who decided this was real, when, and what they did" is queryable end-to-end.
- Roadmap
Phase 3 progress: tamper-evident, DLP, anomaly all live
Three of the four foundational Phase 3 deliverables are now live (tamper-evident audit chain, DLP-on-ingest, anomaly detection + step-up). BYO-key via KMS and Office add-in foundations are next.
v0.1.26DLP on ingest + retention auto-apply rules- Shipped
Pattern-based DLP scanning on every upload
Every document's extracted text is pattern-scanned for US SSNs, Luhn-validated credit-card numbers, ABA-validated routing numbers, MRN identifiers, AWS access keys, GitHub tokens, PEM private-key blocks, JWTs, and generic API secrets. The matched value is never stored — only a pre-redacted preview ("XXX-XX-1234"). Findings stream into a new dlp_findings table tied to the document. /help/dlp-scanning.
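To make "Luhn-validated" and "pre-redacted preview" concrete, here is a small sketch. The Luhn checksum is the standard algorithm; `redactSsn` is a hypothetical helper that only mirrors the "XXX-XX-1234" preview format quoted above, not the scanner's actual code.

```typescript
// Standard Luhn checksum: double every second digit from the right,
// subtract 9 from any result above 9, and require the total to be a multiple of 10.
function passesLuhn(digits: string): boolean {
  if (!/^\d+$/.test(digits)) return false;
  let sum = 0;
  let shouldDouble = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (shouldDouble) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    shouldDouble = !shouldDouble;
  }
  return sum % 10 === 0;
}

// Keep only the last four digits so the matched value itself is never stored.
function redactSsn(ssn: string): string {
  return `XXX-XX-${ssn.slice(-4)}`;
}
```

The checksum is what separates a likely card number from an arbitrary 16-digit string, which is why such a match can be treated as high-confidence.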
- Shipped
High-confidence findings auto-escalate to "regulated"
A Luhn-validated credit card, multiple SSNs, a validated routing number, an AWS access key, or a private-key block bumps the document's sensitivity to regulated the moment the scan lands — before the document is searchable. The escalation emits a document.sensitivity-changed event with actorKind=system and an explicit reason so the audit trail captures exactly why the upgrade happened. The doc never appears at a lower tier than its content warrants.
- Shipped
DLP review surface on the document detail page
Medium-confidence findings (single SSN, single MRN, JWT, generic secret) appear on a new <DlpFindingsPanel> with confirm / dismiss buttons. Each decision emits a metadata-set event. Tenant-wide DLP overview shipped on /compliance with status counts + an active-findings-by-type chart.
- Shipped
Retention auto-apply rules
Map a docType pattern to a retention class — Kodori auto-suggests the right class for every NDA, invoice, 1099, etc. Manage rules at the bottom of /retention. Acceptance is still human; the rule never auto-mutates retention, just adds the proposal alongside sensitivity / collection / keywords on the suggestion panel. /help/retention-auto-apply.
- Shipped
New retention-class suggestion kind
metadata_suggestions now supports a "retention-class" kind. The suggestion panel renders it inline; accepting delegates to setDocumentRetentionClass so the audit log handles it identically to a manual change.
- Roadmap
Phase 3 progress: SOC 2 Type I + DLP foundations now in place
DLP-on-ingest and the auto-escalation pipeline are foundational Phase 3 deliverables. Anomaly detection (suspicious agent / user volume), BYO-key via KMS, and the Office add-in foundations are next on the Phase 3 roadmap.
v0.1.25Phase 1 + Phase 2 closeout: 9 slices in one batch- Shipped
Rule-driven Collections
Describe a Collection in declarative terms ("every regulated PDF", "all NDAs from 2024") and matching documents auto-include at read time. A rule is 1–8 typed predicates (sensitivity, MIME family, name contains, metadata keyword, date bounds) combined with AND. Inline editor on /collections/[id] plus a setCollectionRule MCP tool the agent uses. See /help/rule-driven-collections.
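One way to picture the predicate DSL is as a discriminated union evaluated with AND semantics. Everything below is a hypothetical sketch: the field names, the `Doc` shape, and the ISO-date string comparison are assumptions, and only the five predicate kinds and the 1–8 limit come from the entry.

```typescript
// Hypothetical document shape; createdAt is an ISO date such as "2024-03-15".
interface Doc {
  name: string;
  sensitivity: string;
  mimeFamily: string;
  keywords: string[];
  createdAt: string;
}

type Predicate =
  | { kind: "sensitivity"; equals: string }
  | { kind: "mimeFamily"; equals: string }
  | { kind: "nameContains"; text: string }
  | { kind: "keyword"; value: string }
  | { kind: "dateBounds"; from?: string; to?: string };

function matchesPredicate(doc: Doc, p: Predicate): boolean {
  switch (p.kind) {
    case "sensitivity":
      return doc.sensitivity === p.equals;
    case "mimeFamily":
      return doc.mimeFamily === p.equals;
    case "nameContains":
      return doc.name.toLowerCase().includes(p.text.toLowerCase());
    case "keyword":
      return doc.keywords.includes(p.value);
    case "dateBounds":
      // ISO dates in the same format compare correctly as strings.
      return (
        (p.from === undefined || doc.createdAt >= p.from) &&
        (p.to === undefined || doc.createdAt <= p.to)
      );
  }
}

// A document belongs to the collection only if every predicate matches.
function matchesRule(doc: Doc, rule: Predicate[]): boolean {
  if (rule.length < 1 || rule.length > 8) {
    throw new Error("A rule needs 1 to 8 predicates");
  }
  return rule.every((p) => matchesPredicate(doc, p));
}
```

Under this sketch, "all NDAs from 2024" would be a keyword predicate plus a date-bounds predicate, and "every regulated PDF" a sensitivity predicate plus a MIME-family predicate.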
- Shipped
Annotations layer: notes alongside documents
Per-document commentary (notes today; highlights / tags ride the same schema). Permission-trimmed to the doc's readers; audit log records the existence-fact via annotation.added / annotation.removed events. The agent can author notes from natural language. /help/annotations.
- Shipped
Semantic legal-hold preview
New "Find subjects by description" panel on /legal-holds/[id] runs hybrid search, marks already-bound docs, and bulk-binds the unbound subset in one click. Removes the UUID copy-paste tax that incumbent DMSes make a paralegal pay during litigation week. /help/legal-hold.
- Shipped
Agent activity page (/agent-activity)
Discoverable view of every action the AI agent has taken in the workspace over the last 30 days — top-line stats, top-tools bar chart, recent timeline. Maps to the Phase-2 "agent logs as first-class artifact" deliverable. /help/agent-activity-page.
- Shipped
Saved searches: new-since-last-viewed badges
Each saved-search chip on /search carries a numeric badge counting documents matching its query that landed since you last opened it. Click and the badge clears. Per-user, FTS-counted, no email-digest infra needed. /help/saved-search-new-badge.
- Shipped
Bulk ingest API
POST /api/v1/documents with raw bytes as the request body and metadata in X-Kodori-* headers. 50 MB cap, content-addressed dedup is free, full extraction + classification pipeline runs identically to a UI upload. Built for watch folders, ETL jobs, scanner integrations. /help/bulk-ingest-api.
- Shipped
AP invoice review workflow (Phase 1 exit)
Upload an invoice → auto-classify flags it → Haiku extracts vendor / total / PO / currency → /ap-review queue presents it for approval → approve or reject with a reason → ap-invoice.approved/rejected events fan out via webhooks. PO match by display-name lookup links the invoice to a PO doc when one exists. /help/ap-invoice-workflow.
- Shipped
Pre-trained doc-type hints (top-20 US categories)
Deterministic pattern matcher recognizes NDA, MSA, engagement letter, subpoena, court filing, invoice, PO, receipt, expense report, W-9 / W-2 / 1099, tax return, RFI, submittal, change order, meeting minutes, inspection report. Runs alongside the LLM auto-classifier; unambiguous documents get tagged consistently with no model call. /help/doc-type-hints.
- Shipped
Check-in / check-out (soft edit lock)
Claim an exclusive edit window on /doc/[id] before working on a document. While held, other workspace members can't upload a competing new version. Uploading clears the lock atomically; admins can force-release. /help/check-in-check-out.
- Roadmap
Mobile + Slack / Workspace / M365 integrations explicitly deferred
Phase 2's native mobile and first-party Slack / Google Workspace / Microsoft 365 deliverables are deferred to Phase 4 (mobile, paired with AEC field workflow) and Phase 3 (read-only integrations alongside the Office add-in foundation). Webhooks + the public REST API give every tenant the integration substrate today.
- Roadmap
Phase 3: SOC 2 Type I + Office add-ins on deck
Next phase enters SOC 2 Type I audit engagement, formal HIPAA technical-safeguards review, BYO-key via KMS, DLP engine, and the Outlook / Word / Excel / PowerPoint add-in foundations.
v0.1.16Webhooks: outbound HTTP delivery with HMAC signing- Shipped
/webhooks page for issuance, pause/resume, revoke, and a recent-deliveries log
Owners + admins create subscriptions by pointing at an HTTPS URL and optionally listing event-type filters. Plaintext signing secret shown exactly once; sha256 stored at rest. Last 50 deliveries surface with response codes and failure snippets so you can debug a misbehaving endpoint without leaving the app.
- Shipped
webhook-fanout + webhook-deliver Inngest functions
After every successful events.append, an event/appended Inngest event fires; the fanout function matches active subscriptions (tenant + event-type filter) and enqueues per-subscription deliveries; the deliver function POSTs the signed payload, retries up to 4 times via Inngest step-retry config, and records each attempt in webhook_deliveries.
- Shipped
HMAC-SHA256 signing with replay protection
Every delivery carries X-Kodori-Signature: sha256=<hex> computed over <X-Kodori-Timestamp>.<body>. Receivers verify by recomputing the HMAC and rejecting timestamp drift > 5 minutes. /help/webhook-signature-verification has a Node.js example.
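A receiver-side check could look roughly like the sketch below, which uses Node's built-in `crypto` module. Several details are assumptions not stated in this entry: that `X-Kodori-Timestamp` is in Unix seconds, that the HMAC key is the plaintext signing secret shown at issuance, and the function names themselves. The help article remains the authoritative reference.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_DRIFT_SECONDS = 5 * 60; // reject deliveries more than 5 minutes off

// Computes "sha256=<hex>" over "<timestamp>.<body>".
function signPayload(secret: string, timestamp: string, body: string): string {
  const hex = createHmac("sha256", secret)
    .update(`${timestamp}.${body}`)
    .digest("hex");
  return `sha256=${hex}`;
}

// Assumption: the timestamp header carries Unix seconds.
function verifyDelivery(
  secret: string,
  signatureHeader: string,
  timestampHeader: string,
  rawBody: string,
  nowSeconds: number,
): boolean {
  const ts = Number(timestampHeader);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > MAX_DRIFT_SECONDS) {
    return false; // stale or malformed timestamp: possible replay
  }
  const expected = Buffer.from(signPayload(secret, timestampHeader, rawBody));
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The HMAC must be computed over the raw request body exactly as received; re-serializing parsed JSON can change the bytes and break verification.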
- Improved
Event-store wrapper at apps/web/lib/event-store.ts
Wraps the plain @kumokodo/events factory to fire event/appended after every successful append. Bulk-swapped 23 createEventStore call sites in apps/web so every audit append flows through. The domain package stays clean (no Inngest dependency).
v0.1.15Public REST API write scope (12 endpoints, opt-in scopes)- Shipped
Mutation endpoints across documents + collections
POST /api/v1/documents/{id}/{rename,sensitivity,metadata,restore}, DELETE /api/v1/documents/{id} (tombstone), POST /api/v1/collections, POST /api/v1/collections/{id}/members, DELETE /api/v1/collections/{id}/members/{documentId}. setDocumentSensitivity refuses to lower on held docs; tombstone refuses on active hold. Same MCP-tool gates the agent uses — no API-key bypass.
- Shipped
Opt-in scope checkboxes at /api-keys
search:read is the always-granted baseline. Add documents:write, documents:delete, or collections:write at issuance — each off by default. Per-key scope chips render under each active key.
- Improved
Shared authorizeApiRequest helper in lib/api-auth.ts
Bearer-parse + verify + scope-check in one call. Read endpoints (search, list, document read, /me) refactored to use the same helper so route files stay ~30 lines each.
- Improved
OpenAPI 3.1 manifest documents every new route
Full request + response schemas for the 8 new mutation endpoints, including the held-doc deny-wins error path.
v0.1.14OpenGraph images at the edge- Fixed
Replace dead /og-default.png reference
The Organization JSON-LD on the root layout pointed at /og-default.png, which 404'd — every social share of a kodori.ai link rendered bare.
- Shipped
apps/web/app/opengraph-image.tsx generates a brand-matched 1200×630 image at the edge
Cream paper background, ink wordmark, ochre tagline highlight, feature pills along the bottom. Auto-injected as og:image for every marketing page. Per-page overrides land by dropping route-scoped opengraph-image.tsx files when narrative diverges.
v0.1.13Vertical pages, security page, in-app onboarding- Shipped
/security page with technical-controls table and certifications roadmap
Hash-chained audit, deny-wins ACL, SSO-only auth, content-addressable encrypted storage — every control documented with status, plus an honest map of which certifications hold today vs ship in Phase 3 / 5.
- Shipped
/for-accounting and /for-manufacturing-qms landing pages
Vertical-specific pain quotes, feature subsets, and FAQs for CPAs and ISO-9001 / 13485 / IATF-16949 manufacturers.
- Shipped
/compare/netdocuments and /compare/filehold
Honest 13–14 dimension comparisons that call out where each incumbent wins.
- Shipped
Sample-data fixture for empty workspaces
New tenants click "Load sample data" on the dashboard and Kodori seeds 6 documents, 2 collections, 1 legal hold, and 1 retention class so every screen has something to demonstrate.
v0.1.12Agent ↔ UI parity + first vertical pages- Shipped
Six MCP mutation tools so the agent reaches UI parity
renameDocument, setDocumentSensitivity (refuses on held docs), setDocumentMetadata, setVersionLabel, setVersionSignificance, restoreDocument. The agent can now move, rename, retag, soft-delete, and restore documents in plain English — every consequential action behind a "what's your reason?" prompt.
- Shipped
/for-law-firms and /for-construction landing pages
Pain quotes from real review sites, not fabricated testimonials.
- Shipped
/compare/imanage with honest cell-by-cell highlighting
iManage wins on Office add-ins, ethical-walls maturity, current SOC 2 posture; Kodori wins on architecture, search, audit chain, agent autonomy, pricing.
v0.1.11Marketing pass: hero, features, help, SEO- Shipped
/features page enumerating 50+ capabilities across 8 pillars
SoftwareApplication JSON-LD with featureList populated for rich-result eligibility.
- Shipped
/help knowledge base + agent retrieval
Typed help-articles array drives both the human /help surface and the agent's helpKnowledge MCP tool — so anything users can read in /help, the agent can answer with a citable URL.
- Improved
Hero copy reworked
From "for people who read what they sign" to "AI document management your auditor can defend." Homepage feature cells expanded from 6 to 12.
- Shipped
SEO scaffolding
Root metadata + Organization / WebSite / SearchAction JSON-LD; sitemap.ts + robots.ts; per-page metadata on every marketing surface.
v0.1.10Saved searches + sensitivity badges + per-doc history- Shipped
Saved searches per user
Inline "Save this search" button on /search; saved entries appear as one-click chips with delete affordance.
- Shipped
Sensitivity badges across list surfaces
Color-graded chips (public → regulated) on dashboard, search, collection-detail, doc-detail. hybridSearchTool output gained additive sensitivity + mimeType fields.
- Shipped
/compliance overview page
Single-pane governance summary: live record count, active holds, retention queue depth, sensitivity histogram, audit chain tip.
- Shipped
Per-document history timeline
Every event recorded against a document, newest first, payload-aware summaries.
v0.1.9Public REST API + OpenAPI manifest- Shipped
Read-scope REST API at /api/v1
GET /me, POST /search, GET /documents, GET /documents/{id}. Bearer-token auth; permission-trimmed at the index. /api-keys page for issuance + revocation with one-shot secret reveal.
- Shipped
OpenAPI 3.1 manifest at /api/openapi.json
Drop into Postman, Insomnia, or Stoplight Studio for typed request building.
v0.1.8Retention review queue, search filters, vitest- Shipped
Retention review queue (/retention/review)
Records whose retention term has elapsed surface here with two human-confirmed actions: defer N years with a reason, or dispose with a reason. Held docs surface but disposal is gated.
- Shipped
Search filters: sensitivity + MIME family
Dropdowns on /search to narrow to regulated docs only, or PDFs only, etc. Post-RRF filtering keeps semantic-only matches alive.
- Shipped
Vitest suite: 26 tests across packages/workflow + apps/web
Chunker fallthrough, illustrator-ai magic-byte detection, revertable-event predicate, API-key bearer parser. pnpm test runs the suite via Turbo.
v0.1.7Legal hold + tombstone + retention classes + cross-user ACL- Shipped
Legal hold MVP
legal_holds + legal_hold_subjects tables, four MCP tools, /legal-holds list + detail, doc-detail badge + bind/release. Subjects preserved as audit evidence on release.
- Shipped
tombstoneDocument MCP tool with held-doc deny-wins
Soft-delete refuses on any active legal hold. Bytes stay in storage; audit trail stays intact during retention window.
- Shipped
Retention classes + setDocumentRetentionClass + deferRetention
Define class, assign per doc, defer with reason, dispose with reason. /retention page + dropdown on doc detail.
- Improved
Cross-user ACL hardening across server-component reads
Every doc-touching dashboard / collection-detail query now composes canReadDocument(ctx) into its WHERE clause.
v0.1.6Auto-classify, versioning, mobile sweep, illustrator-ai- Shipped
Auto-classification on upload
After extraction succeeds, Claude Haiku proposes sensitivity, collection, keywords, and doc-type. Suggestions show on the doc page; accepting writes the durable mutation and emits a human-decision event.
- Shipped
Versioning UX: upload v2, named versions, text-diff compare
Upload a new version of an existing doc without duplicating the record. Optional inline-editable label per version. Server-rendered text diff between any two versions; older versions re-extract on demand.
- Shipped
illustrator-ai extractor
Sniffs %PDF- magic in .ai uploads and routes Illustrator-9+ files through the PDF extractor.
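The magic-byte sniff can be sketched in a few lines. `looksLikePdf` and `routeAiUpload` are hypothetical names, and checking only at offset 0 is a simplifying assumption; what happens to .ai files without the PDF header is not described in this entry.

```typescript
// ASCII bytes for "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

function looksLikePdf(bytes: Uint8Array): boolean {
  return (
    bytes.length >= PDF_MAGIC.length &&
    PDF_MAGIC.every((b, i) => bytes[i] === b)
  );
}

// Route PDF-compatible .ai uploads to the PDF extractor; anything else is left alone here.
function routeAiUpload(filename: string, bytes: Uint8Array): "pdf" | "unhandled" {
  const isAi = filename.toLowerCase().endsWith(".ai");
  return isAi && looksLikePdf(bytes) ? "pdf" : "unhandled";
}
```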
- Improved
Mobile-responsive across the app
Wide tables wrap in overflow-x-auto; page padding scales by breakpoint; agent drawer goes full-width on phones.
v0.1.5Hybrid search, semantic, Office extraction, email ingest- Shipped
Hybrid search (FTS + pgvector + RRF)
document_chunks table with 1536-dim embeddings + HNSW cosine index. OpenAI text-embedding-3-small. Reciprocal Rank Fusion combines keyword + semantic ranked lists.
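Reciprocal Rank Fusion itself is compact enough to show inline. This is a generic sketch of the technique, not the search code in this repo: the function name is hypothetical, and `k = 60` is the constant commonly used in the RRF literature, assumed here because the entry does not state one.

```typescript
// Each ranking is a list of document ids, best match first.
// A document's fused score is the sum over rankings of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because only ranks are combined, the keyword and semantic lists never need their raw scores normalized against each other, and a document that appears in just one list still receives a score.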
- Shipped
Office extraction (.docx / .xlsx / .pptx)
Pure-JS adapters via mammoth, SheetJS, jszip + fast-xml-parser. No native Office, no Pandoc.
- Shipped
Email ingress via Cloudflare Email Workers
Per-tenant ingest address. Worker forwards raw MIME to /api/email/inbound (HMAC-signed); the route parses with mailparser and creates one document per attachment plus one for the message body.
- Shipped
Reversibility UI
/audit page renders revertable events with a one-click revert button. Each revert appends a fresh forward event so the chain is intact.
v0.1.0Phase 0 foundation- Shipped
Monorepo, Auth, schema, MCP, agent runtime
Turborepo + pnpm with 7 packages. Next.js 15.5 on Vercel + Neon Postgres + Cloudflare R2 for content-addressable storage. Auth.js v5 with Google OAuth + JIT user/tenant sync. Internal MCP tool catalog. Vercel AI SDK + Anthropic Claude with prompt caching.
- Shipped
Initial document lifecycle: upload, search, view
Pre-signed PUT for direct-to-R2 upload. Postgres FTS over names + extracted text. Marketing route group with homepage / pricing / about + app route group with dashboard / search / agent / collections / upload / doc detail.
roadmapWhat's next- Roadmap
Webhooks (outbound delivery for document events)
Subscribe a URL to a per-tenant event filter; POST signed deliveries. Lands in Phase 2.
- Roadmap
Write-scope public REST API
POST / PATCH / DELETE endpoints for documents, collections, holds. Behind a richer apiKey.scopes model.
- Roadmap
Office add-ins (Outlook / Word / Excel / PowerPoint)
Native add-in foundations land in Phase 3, shared across all verticals.
- Roadmap
Migration connectors
FileHold, SharePoint Online, OnBase, Documentum, NetDocuments, iManage, Box, Dropbox, Drive. Phase 5 — design partners influence sequencing.
- Roadmap
SOC 2 Type II + 21 CFR Part 11 conformance claim
Type I in Phase 3, Type II in Phase 5. 21 CFR Part 11 conformance claim with the substrate already in place.
Questions, requests, or feedback? hello@kumokodo.ai. Or read the canonical feature catalog and the help knowledge base.