chore(d1): add AnalyticsEventFact retention to cron (prod was near 10 GB cap) by whimet · Pull Request #285 · ZenUml/conf-app

whimet · 2026-06-27T09:07:16Z

Why

conf-zenuml-prod D1 hit 9.89 GB / 10 GB (98.9%) — the 10 GB cap is a hard limit, not a billing threshold; writes fail at the wall.

AnalyticsEventFact was ~99.9% of the DB (9.67M rows, only 49 days). ~88% were dead page_viewed telemetry that stopped being written ~Jun 22. Root cause: nothing pruned this table — the daily cron-aggregate only purged UserBehaviorEvent (now empty), and purgeAnalyticsFactRetention() was wired only to a manual API endpoint (default 90d → deletes 0 at 49 days).

What this PR does

cron-aggregate — adds bounded, batched age-based retention for AnalyticsEventFact (default 45 days, tunable via ANALYTICS_FACT_RETENTION_DAYS). Bounds growth going forward. Batched via id-subquery (D1's SQLite has no DELETE ... LIMIT) and capped at 40×50k rows/run so the cron stays within limits and any backlog drains over a few nights.
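The batching described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the column names `id` and `createdAt`, the ISO-8601 text timestamp format, the helper names `resolveRetentionDays` and `purgeOldFacts`, and the early-exit on a short batch are all assumptions made for the example. The `D1Like` interface only mirrors the `prepare().bind().run()` call shape and the `meta.changes` field of D1's result so the sketch compiles on its own.

```typescript
// Minimal structural types so the sketch compiles without Cloudflare's type package.
interface RunResult { meta: { changes: number } }
interface Statement {
  bind(...values: unknown[]): Statement;
  run(): Promise<RunResult>;
}
interface D1Like { prepare(sql: string): Statement }

const DEFAULT_RETENTION_DAYS = 45; // default stated in the PR description
const BATCH_SIZE = 50_000;         // rows per DELETE statement
const MAX_BATCHES = 40;            // per-run cap: at most 40 x 50k = 2,000,000 rows

// Hypothetical helper: read ANALYTICS_FACT_RETENTION_DAYS, falling back to the
// default when the variable is missing or not a positive integer.
function resolveRetentionDays(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_RETENTION_DAYS;
}

// Assumed schema: `id` is the primary key, `createdAt` is ISO-8601 text.
// The inner SELECT carries the LIMIT because DELETE itself cannot.
const DELETE_BATCH_SQL = `
  DELETE FROM AnalyticsEventFact
  WHERE id IN (
    SELECT id FROM AnalyticsEventFact
    WHERE createdAt < ?1
    LIMIT ?2
  )`;

// Deletes rows older than the retention window in bounded batches and
// returns the number of rows removed in this run.
async function purgeOldFacts(
  db: D1Like,
  retentionDays: number,
  now: Date = new Date(),
  batchSize: number = BATCH_SIZE,
  maxBatches: number = MAX_BATCHES,
): Promise<number> {
  const cutoff = new Date(now.getTime() - retentionDays * 86_400_000).toISOString();
  let deleted = 0;
  for (let i = 0; i < maxBatches; i++) {
    const { meta } = await db.prepare(DELETE_BATCH_SQL).bind(cutoff, batchSize).run();
    deleted += meta.changes;
    if (meta.changes < batchSize) break; // short batch: nothing older remains
  }
  return deleted;
}
```

Under these assumptions a run stops as soon as a batch comes back short, so a normal night issues only one or two statements, while a large backlog is spread across several runs by the batch cap.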

Backlog already drained (2026-06-27)

The 8.5M dead page_viewed rows were removed out-of-band against prod before this merge:

| | before | after |
|---|---|---|
| `page_viewed` rows | 8,542,688 | 0 |
| `AnalyticsEventFact` total | 9,668,759 | 1,126,304 |
| reported D1 size | 9.89 GB (98.9%) | 3.49 GB (34.9%) |

D1 reclaimed the space immediately — no VACUUM/support ticket needed. Storage is now under the 5 GB free tier ($0). The one-off drain script has been removed; the remaining 1.13M rows are page_updated (~34k/day), which this PR's cron retention keeps bounded.

Rollout

Merge → pnpm --filter cron-aggregate deploy:prod. That's it — the nightly cron then keeps AnalyticsEventFact within 45 days.

Ops note: wrangler ... --remote can fail with /memberships Authentication error [code: 10000]; export CLOUDFLARE_ACCOUNT_ID=8d5fc7ce04adc5096f52485cce7d7b3d to bypass.

🤖 Generated with Claude Code

…uml-prod near 10 GB cap)

conf-zenuml-prod D1 is at 9.89 GB / 10 GB (98.9%). AnalyticsEventFact is ~99.9% of it (9.67M rows, 49 days), and nothing pruned it: the daily cron only purged the now-empty UserBehaviorEvent, and purgeAnalyticsFactRetention was wired only to a manual API endpoint. ~88% of rows are dead `page_viewed` telemetry that stopped being written ~Jun 22.

- cron-aggregate: add bounded, batched age-based retention for AnalyticsEventFact (default 45d, tunable via ANALYTICS_FACT_RETENTION_DAYS), so the table is bounded going forward. Batched via id-subquery (D1 SQLite has no DELETE ... LIMIT) and capped per run to stay within cron limits.
- scripts/purge-analytics-fact-backlog.sh: one-time batched drain for the dead `page_viewed` backlog (younger than the retention window, so age-based purge won't clear it). Supports --dry-run/--yes, prod|stg.

Note: deletes free SQLite pages to the freelist (halts growth, write headroom returns) but the reported/billed D1 size may not drop until D1 compacts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

whimet · 2026-06-27T09:47:24Z

Backlog drain executed against prod ✅

Ran scripts/purge-analytics-fact-backlog.sh --env prod --event page_viewed --batch 100000 against conf-zenuml-prod.

| | before | after |
|---|---|---|
| `page_viewed` rows | 8,542,688 | 0 |
| `AnalyticsEventFact` total | 9,668,759 | 1,126,304 |
| reported D1 size | 9.89 GB (98.9%) | 3.49 GB (34.9%) |

86 batches, all 8,542,688 rows deleted cleanly (exit 0).

Correction to the caveat above: D1 did reclaim space — the reported size dropped immediately, no Cloudflare support ticket needed. Storage is now under the 5 GB free tier, so storage cost is $0. Will update the script's inline note to match.

Rollout gotcha: wrangler ... --remote kept failing with /memberships Authentication error [code: 10000] (transient OAuth account resolution). Export CLOUDFLARE_ACCOUNT_ID=8d5fc7ce04adc5096f52485cce7d7b3d to bypass it.

The remaining 1.13M rows are page_updated (still ingesting ~34k/day); the cron retention in this PR will keep AnalyticsEventFact bounded to 45 days going forward.
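As a rough sanity check on what "bounded to 45 days" means in row counts, the figures quoted in this thread can be plugged into a small calculation. It assumes ingestion stays near the ~34k rows/day quoted above, which is an estimate rather than a guarantee; the variable names are illustrative only.

```typescript
// Figures taken from this PR thread; the ingest rate is approximate.
const rowsPerDay = 34_000;    // ~page_updated rows written per day
const retentionDays = 45;     // default retention window
const batchSize = 50_000;     // rows per DELETE batch
const maxBatchesPerRun = 40;  // per-run cap

// Steady state: roughly one retention window's worth of rows stays in the table.
const steadyStateRows = rowsPerDay * retentionDays;

// The most a single nightly run can delete.
const perRunDeleteCap = batchSize * maxBatchesPerRun;

// Once the window is full, about one day's worth of rows ages out per night.
const nightlyExpiryFitsInCap = rowsPerDay <= perRunDeleteCap;

console.log({ steadyStateRows, perRunDeleteCap, nightlyExpiryFitsInCap });
```

At that rate the table would settle at about 1.53M rows, and the rows aging out each night fit comfortably within a single 50k batch.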


The backlog has been drained from conf-zenuml-prod (8.5M page_viewed rows; 9.89->3.49 GB). The nightly cron retention added in this PR handles ongoing bounding, so the one-shot script is no longer needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

whimet temporarily deployed to staging-lite June 27, 2026 09:07 — with GitHub Actions Inactive

docs(d1): record observed compaction (9.89->3.49 GB) after prod backl…

1cb5575

…og drain D1 reclaimed space immediately after the page_viewed delete — correct the earlier 'size may not drop' caveat in the script and cron comments. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

whimet had a problem deploying to staging-lite June 27, 2026 09:47 — with GitHub Actions Error

whimet had a problem deploying to staging-lite June 27, 2026 09:52 — with GitHub Actions Error

whimet temporarily deployed to staging-lite June 27, 2026 09:54 — with GitHub Actions Inactive

whimet changed the title ~~chore(d1): AnalyticsEventFact retention + backlog drain (prod near 10 GB cap)~~ chore(d1): add AnalyticsEventFact retention to cron (prod was near 10 GB cap) Jun 27, 2026

whimet merged commit 4c39c38 into main Jun 27, 2026
18 checks passed

whimet deleted the chore/analytics-fact-retention branch June 27, 2026 10:33

whimet merged 3 commits into main from chore/analytics-fact-retention
