Cost dashboard

Per-tenant spend on Anthropic Claude, OpenAI embeddings, R2 storage, and PDF extraction — with microcent resolution so per-token pricing is captured exactly.

Updated 2026-04-30

Open /costs (owner / admin only) to see your tenant's spend over the last 30 days. Three top-line stats — total dollars, total cost events, top spend kind — followed by a spend-by-kind bar chart, a 14-day daily trend, and the 20 individual highest-cost events for triage.

Pricing is recorded at event time. If a vendor changes pricing, existing events keep their original cost; only future events use the new rate. If a customer disputes a number, the audit log's tool-call entry pairs 1:1 with the cost row that drove it.

What's tracked today: - agent.input-tokens / output-tokens / cache-read / cache-write — Anthropic Claude Opus 4.6 + Haiku 4.5 spend, instrumented at the streamAgent boundary - (coming soon) embedding.tokens — OpenAI text-embedding-3-small for the semantic-search pipeline - (coming soon) classify.tokens — Haiku for auto-classification on upload - (coming soon) storage.put / storage.get — R2 byte counts (informational; R2 storage is metered per-GB-month, not per-byte) - (coming soon) extract.pdf — Claude PDF extraction calls

Microcent (1/1,000,000 of a US cent) resolution captures vendor pricing without floating-point drift across millions of events. The dashboard formats to USD; the underlying rows preserve precision for accountants.

**How the dashboard scales (D281)**

The page used to scan every cost_events row in the 30-day window on each load — fine at SMB volumes, problematic at 100M-doc-tenants where cost_events accumulates millions of rows per day. Today's read path is two-stage: aggregations (totals, by-kind, trend) come from a pre-aggregated cost_events_daily_rollup table for completed UTC days and from raw cost_events for today only. A daily 06:00 UTC Inngest cron (cost-events-rollup-daily) fills the rollup with INSERT ... ON CONFLICT DO UPDATE so re-runs are idempotent. Top-cost detail rows continue to read raw events — the rollup intentionally loses per-event identity to keep storage cheap. Page-loads stay sub-second regardless of underlying volume.