fix(perf): drop old assets query performance#35591
Conversation
|
Claude finished @wezell's task in 3m 46s —— View job PR ReviewThe NULL-handling fix is correct and the new regression test exercises it well. A few things worth a second look before merging. Findings1. Semantic change to 2. Outer query lost 3. Outer query performance claim is unverified. 4. PR description doesn't match the final implementation. 5. Safety-break can mask real failures. 6. Minor — test docstring references SQL that isn't in the final diff. 7. Pre-existing, but visible while you're in here. Style nits (non-blocking)
What's clean
|
Address PR #35591 review feedback: - DropOldContentVersionsJobHelper: replace NOT IN with NOT EXISTS so the old-versions query handles nullable contentlet_version_info.live_inode correctly. NOT IN against a NULL returns UNKNOWN for every row, which silently skipped cleanup for never-published drafts (the dominant case for EFS hardlink bloat). Working/live inodes are still strictly excluded from deletion. - DropOldContentVersionsJob: add no-progress safety break to the outer while(true) loop so a batch that deletes zero versions stops the run for that language instead of spinning the Quartz worker indefinitely. Particularly important under StatefulJob, where a stuck run blocks future scheduled executions. - DropOldContentVersionsJobTest: add regression test covering the NULL live_inode case end-to-end via the helper, asserting the working inode is never returned and the expected old versions still are. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deleteContentVersion is already @CloseDBIfOpened, so each deletion gets its own transaction boundary. Closing per contentlet on top of that churns connection state for tenants with many short-lived sessions. Move the closeSessionSilently() to a finally block around the batch loop so it just sweeps anything left after the batch — same memory benefit, fewer pool round-trips. Address PR #35591 review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A single batch (default 500 identifiers) can produce many thousands of version deletions on tenants with heavy file-asset histories — the exact EFS bloat case this job targets. Closing once per batch let too much state accumulate in one Hibernate session. Move closeSessionSilently() back inside the inner for-loop, wrapped in a finally so a delete failure mid-loop still cleans up. Per-contentlet (~500 closes per batch) is the right middle ground vs per-row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes to DropOldContentVersionsJobTest: 1. deleteOldContentlets — update expected count from 100 → 101. The previous expectation (100 versions remaining out of 105) only passed because the original NOT IN ... UNION ALL query was failing to exclude the working/live inode from deletion candidates: the subquery's NULL handling made NOT IN return UNKNOWN for every row (the bug PR #35591 fixes), so all 105 versions passed the filter and OFFSET 100 returned 5 deletions = 100 remaining. With the corrected NOT EXISTS query, the working/live inode is now properly excluded (1 row), so 104 candidates pass the filter, OFFSET 100 returns 4, and 105 - 4 = 101 versions remain. The "delete 5 leave 100" result was a side-effect of the bug, not the intended behavior. 2. findContentVersionsGreaterThan_protectsWorkingInodeWhenLiveInodeIsNull — replace title "Draft v1 (never published)" with "Draft Never Published 1". The webPageContent type's title field has a regex pattern that rejects parentheses, causing the previous title to fail validation immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
At steady state — a contentlet with working/live + GREATER_THAN old versions (GREATER_THAN + 1 total rows) — the outer FIND_CONTENTS_WITH_VERSIONS_GREATER_THAN matched on > GREATER_THAN total rows, but the inner FIND_CONTENT_VERSIONS_GREATER_THAN query correctly excludes the working/live inode and then OFFSETs by GREATER_THAN, returning 0 candidates. The no-progress safety break in DropOldContentVersionsJob.execute() then warned on every healthy run. Mirror the inner NOT EXISTS exclusion in the outer query so the count reflects actual deletion candidates. The outer now only returns identifiers for which the inner will produce real work. Address PR #35591 review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ref: #35590
This PR fixes: #35590