Skip to content

fix: keep the folder id when renaming it (#35399)#35576

Open
gortiz-dotcms wants to merge 6 commits intomainfrom
issue-35399-folder-rename-keep-id
Open

fix: keep the folder id when renaming it (#35399)#35576
gortiz-dotcms wants to merge 6 commits intomainfrom
issue-35399-folder-rename-keep-id

Conversation

@gortiz-dotcms
Copy link
Copy Markdown
Member

@gortiz-dotcms gortiz-dotcms commented May 5, 2026

Summary

Fixes #35399

Before this change, renaming a folder deleted the old identifier row and inserted a new one with a different UUID. On push-publish receivers this appeared as a brand-new folder at the new path while the original folder (old UUID) stayed orphaned, requiring manual cleanup.

What changed

Stable identifier on rename (FolderFactoryImpl)

The rename operation now updates identifier.asset_name in place instead of replacing the row. The folder UUID is unchanged after a rename. full_path_lc is a generated column so it updates automatically when asset_name or parent_path change.

Child path updates (updateChildPaths)

parent_path for all identifiers in the renamed subtree — including folder identifiers — is updated in bulk, sorted by ascending path depth to satisfy the identifier_parent_path_trigger ordering requirement.

Push-publish detection

  • version_ts is bumped for every contentlet in the renamed subtree so push-publish picks them up.
  • mod_date is bumped for every child sub-folder via a LIKE-based subquery so DependencyModDateUtil.excludeByModDate includes them in the bundle.

Receiver reconciliation (FolderHandler)

Added a fallback lookup: if the receiver cannot find the folder by path or by inode, it searches by the identifier's existing path. This handles independently-provisioned environments where the same identifier UUID exists at a different path after a previous rename that was not yet pushed.

Cache eviction

The root folder's FolderCache entry is now evicted before the UPDATE identifier, using the pre-rename Identifier object. This removes a race where a concurrent thread could clear the identifier cache between the UPDATE and save(), causing save() to evict the wrong (new-path) key.

Tests updated

FolderAPITest.renameFolder and FolderAPITest.renameFolder_updatesChildrenAndSubChildrenPaths were rewritten to assert stable-UUID behaviour: identifier UUIDs unchanged after rename, asset_name / parent_path updated in place, sub-folder mod_date bumped, and old paths absent from the DB.

🤖 Generated with Claude Code

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 5, 2026

Claude finished @gortiz-dotcms's task in 4m 32s —— View job


dotCMS Backend Review

  • Load dotCMS conventions
  • Get the Java diff (656 lines)
  • Spawn 4 specialized sub-agents (Security, Database, Java Standards, REST API)
  • Consolidate findings
  • Post review comment

Result: 3 findings posted to the backend review comment.

  • 🟠 HighFolderFactoryImpl.java:961-988: bumpVersionTsForSubtree scans the identifier table twice per rename (explicit SELECT + correlated subquery in UPDATE). This regression undoes a fix from the previous review iteration.
  • 🟡 MediumFolderFactoryImpl.java:813-850: The renamed root folder's own IdentifierCache entry is evicted at L850 (after the UPDATE/save()), not at L813-819 with the rest of the pre-UPDATE eviction. clearIdentifierCacheForSubtree(oldPath, ...) covers descendants only, so the root has a stale-cache window.
  • 🟡 MediumFolderFactoryImpl.java:1013-1021: bumpModDateForSubFolders runs a third parent_path LIKE scan on identifier; subFolderSnapshot already has every sub-folder inode in scope and could drive the UPDATE by inode instead.

Security, Java Standards, and REST API reviewers returned no findings.
· Branch: issue-35399-folder-rename-keep-id

@github-actions github-actions Bot added the Area : Backend PR changes Java/Maven backend code label May 5, 2026
@semgrep-code-dotcms-test
Copy link
Copy Markdown

Semgrep found 1 CUSTOM_INJECTION-2 finding:

  • dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 5, 2026

🔍 dotCMS Backend Review

[🟠 High] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java:961-988

bumpVersionTsForSubtree runs the same parent_path LIKE ? ESCAPE '\\' scan on identifier twice per rename: once explicitly at L961-968 to collect the ids needed to call IdentifierCache.removeContentletVersionInfoToCache(...) (L991-997), and again as a correlated sub-select inside the UPDATE contentlet_version_info ... WHERE identifier IN (SELECT i.id FROM identifier i WHERE i.parent_path LIKE ? ...) at L979-988. This is the same regression the previous review iteration flagged — the earlier shape passed an IN (?, ?, …) list built from the already-collected ids, which avoided the second scan. With the current shape every rename of a wide subtree pays for the parent_path scan twice and holds locks across both, on the hottest path in the rename. The sibling bumpModDateForSubFolders (L1013-1021) uses the subquery shape because it has no preceding SELECT, so the pattern is correct there — but here the SELECT is unavoidable for cache eviction, so reusing ids is the natural fix.

final List<Map<String, Object>> affected = new DotConnect()
    .setSQL("SELECT i.id FROM identifier i WHERE i.parent_path LIKE ? ESCAPE '\\' ...")
    .addParam(likeParam).addParam(hostId).loadObjectResults();
...
new DotConnect().executeUpdate(
    "UPDATE contentlet_version_info SET version_ts = ?"
        + " WHERE variant_id = ?"
        + "   AND identifier IN ( SELECT i.id FROM identifier i WHERE i.parent_path LIKE ? ... )",
    new Date(), VariantAPI.DEFAULT_VARIANT.name(), likeParam, hostId);

💡 Build IN (?, ?, …) placeholders from ids and pass [new Date(), DEFAULT_VARIANT, ids...]. If unbounded IN on huge subtrees is a concern, chunk in batches of ~500 — still cheaper than two full scans. Fix this →


[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java:813-850

clearIdentifierCacheForSubtree(oldPath, hostId) at L813 evicts identifiers whose parent_path LIKE 'oldPath%', but the renamed folder's own identifier has parent_path = parentPath (its grand-parent path), so it is not covered. folderCache.removeFolder(folder, ident) at L818 evicts FolderCache only. The renamed root's IdentifierCache entry is not evicted until L850, after the in-place UPDATE identifier SET asset_name = ? at L822, after save(folder) at L828, and after updateChildPaths/bumpVersionTsForSubtree/bumpModDateForSubFolders (L832-837). Anything in this window that goes through IdentifierAPI.find()IdentifierCache.getIdentifier(folder.getIdentifier()) either reads the stale pre-UPDATE asset_name (cache hit) or repopulates the cache from a concurrent transaction's pre-commit view. The comment at L815-817 explicitly states the goal is to evict before mutating the DB to avoid that race — but the root identifier is the one entry that comment doesn't actually cover.

clearIdentifierCacheForSubtree(oldPath, hostId);          // descendants only — root not covered
evictContentletCacheForSubtree(oldPath, hostId);
folderCache.removeFolder(folder, ident);                   // FolderCache only
new DotConnect().executeUpdate("UPDATE identifier SET asset_name = ? WHERE id = ?", ...);
folder.setName(newName); folder.setModDate(new Date()); save(folder);
...
CacheLocator.getIdentifierCache().removeFromCacheByIdentifier(folder.getIdentifier()); // L850 — too late

💡 Move the IdentifierCache.removeFromCacheByIdentifier(folder.getIdentifier()) call up to L819 (right after folderCache.removeFolder), pre-UPDATE, and drop the duplicate at L850. Eviction-before-write matches the L815-817 comment. Fix this →


[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java:1013-1021

bumpModDateForSubFolders runs a third parent_path LIKE ? ESCAPE '\\' scan on identifier (after the two scans in bumpVersionTsForSubtree and the eviction queries earlier in the rename). Every sub-folder inode is already in scope inside renameFolder via the subFolderSnapshot captured at L810, so this UPDATE folder could be driven by inode (its primary key) instead of by re-scanning identifier and joining back. Lower priority than the High finding above because the join targets a smaller working set, but the snapshot is the natural reuse point.

final int updated = new DotConnect().executeUpdate(
    "UPDATE folder SET mod_date = ?"
        + " WHERE identifier IN ( SELECT id FROM identifier"
        + "   WHERE parent_path LIKE ? ESCAPE '\\' AND host_inode = ? AND asset_type = 'folder' )",
    new Date(), likeParam, hostId);

💡 Pass subFolderSnapshot (or the inode list extracted from it) into bumpModDateForSubFolders and switch to UPDATE folder SET mod_date = ? WHERE inode IN (?, ?, …). Same chunking caveat as the High finding if subtree size can be unbounded.


Next steps

  • 🟠 Fix locally and push — the duplicate scan undoes the previous review iteration
  • 🟡 You can ask me to handle mechanical fixes inline: @claude move the identifier cache eviction to before the UPDATE in FolderFactoryImpl.renameFolder
  • Every new push triggers a fresh review automatically

@semgrep-code-dotcms-test
Copy link
Copy Markdown

Semgrep found 6 CUSTOM_INJECTION-2 findings:

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

- Replace unbounded IN clause in bumpVersionTsForSubtree with correlated
  subquery, consistent with bumpModDateForSubFolders
- Sanitize path variables in Logger.debug calls to prevent log injection
- Add missing @OverRide to FolderAPIImpl.renameFolder

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@semgrep-code-dotcms-test
Copy link
Copy Markdown

Semgrep found 7 CUSTOM_INJECTION-2 findings:

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

@gortiz-dotcms gortiz-dotcms marked this pull request as ready for review May 6, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Folder identifiers not updated on rename, leaving stale folder on receiver after push publish

1 participant