Skip to content

⚡ Bolt: [performance improvement] optimize pathbuf allocations in tarjan SCC algorithm#188

Open
bashandbone wants to merge 1 commit intomainfrom
bolt/optimize-tarjan-pathbuf-allocs-3281374725584587497
Open

⚡ Bolt: [performance improvement] optimize pathbuf allocations in tarjan SCC algorithm#188
bashandbone wants to merge 1 commit intomainfrom
bolt/optimize-tarjan-pathbuf-allocs-3281374725584587497

Conversation

@bashandbone
Copy link
Copy Markdown
Contributor

@bashandbone bashandbone commented May 1, 2026

💡 What: Optimized to_path_buf() allocations in the tarjan_dfs graph traversal function for strongly connected components inside thread-flow's invalidation system.
🎯 Why: In TarjanState during graph traversal for finding cycles, calling .to_path_buf() on borrowed &Path values repeatedly within loops to query HashSets and HashMaps led to unnecessary O(E) heap allocations.
📊 Impact: Reduces memory allocation churn significantly during large invalidation graph cycle detections by changing lookup complexity from allocating to pure read.
🔬 Measurement: Verify by running cargo test -p thread-flow --test invalidation_tests ensuring no regressions, and note the avoidance of O(Edges) string allocations. Also clippy no longer emits explicit lifetime warnings in check_var.rs.


PR created automatically by Jules for task 3281374725584587497 started by @bashandbone

Summary by Sourcery

Optimize path-based lookups in the Tarjan SCC invalidation traversal to avoid redundant allocations and clean up related rule-engine lifetimes.

Enhancements:

  • Use borrowed path references instead of creating new PathBuf allocations for TarjanState map and set lookups in the invalidation graph traversal.
  • Simplify function signatures in the rule-engine variable checking helpers by removing unnecessary explicit lifetimes and using shared references directly.
  • Document the performance lesson about avoiding redundant to_path_buf and clone calls in Tarjan-style graph algorithms in the Bolt notes.

Documentation:

  • Extend the Bolt performance notes with guidance on avoiding redundant to_path_buf calls and unnecessary allocations in Tarjan SCC and graph traversals.

Eliminates O(E) unnecessary heap allocations during graph traversal
by utilizing borrowed `&Path` references for `HashMap` / `HashSet`
lookups (`get`, `get_mut`, `contains`) instead of explicitly creating
owned `PathBuf` via `v.to_path_buf()`.

Also resolves a few clippy explicit lifetime warnings in `check_var.rs`.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 1, 2026 17:48
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented May 1, 2026

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Optimizes Tarjan SCC traversal by eliminating repeated PathBuf allocations in map/set lookups and simplifies some rule-engine APIs by removing unnecessary lifetimes, plus updates the Bolt playbook documentation with the new performance lesson.

Class diagram for optimized TarjanState SCC path lookups

classDiagram
    class InvalidationDetector {
        +tarjan_dfs(v: &Path, state: &mut TarjanState)
    }

    class TarjanState {
        +indices: HashMap~PathBuf, usize~
        +lowlinks: HashMap~PathBuf, usize~
        +on_stack: HashSet~PathBuf~
        +stack: Vec~PathBuf~
        +get_index(v: &Path) usize
        +get_lowlink(v: &Path) usize
        +get_lowlink_mut(v: &Path) &mut usize
        +is_on_stack(v: &Path) bool
    }

    InvalidationDetector --> TarjanState : uses

    class Path {
    }

    class PathBuf {
    }

    TarjanState ..> Path : borrowed_lookup_keys
    TarjanState ..> PathBuf : owned_storage_keys
Loading

Class diagram for simplified check_var helper APIs

classDiagram
    class RuleEngineCheckVarModule {
        +get_vars_from_rules(rule: &Rule, utils: &RuleRegistration) RapidSetString
        +check_var_in_constraints(vars: RapidSetString, constraints: &RapidMapMetaVarRule) RResultRapidSetString
        +check_var_in_transform(vars: RapidSetString, transform: &OptionTransform) RResultRapidSetString
    }

    class Rule {
        +defined_vars() RapidSetString
        +used_vars() RapidSetString
    }

    class RuleRegistration {
    }

    class RapidSetString {
    }

    class RapidMapMetaVarRule {
    }

    class OptionTransform {
    }

    class Transform {
    }

    class RResultRapidSetString {
    }

    RuleEngineCheckVarModule --> Rule : reads
    RuleEngineCheckVarModule --> RuleRegistration : reads
    RuleEngineCheckVarModule --> RapidSetString : moves_ownership
    RuleEngineCheckVarModule --> RapidMapMetaVarRule : borrows_constraints
    RuleEngineCheckVarModule --> OptionTransform : borrows_transform
    OptionTransform --> Transform : wraps
    Rule --> RapidSetString : returns_sets
    RuleEngineCheckVarModule --> RResultRapidSetString : returns
Loading

Flow diagram for optimized PathBuf allocation in Tarjan SCC lookups

flowchart TD
    A["Start tarjan_dfs with v: &Path"] --> B["Lookup v in indices using &Path"]
    B --> C{"Index for v exists?"}
    C -->|"Yes"| D["Lookup v in lowlinks using &Path"]
    C -->|"No"| E["Allocate PathBuf from v and insert into indices and lowlinks"]
    E --> F["Push owned PathBuf onto stack and mark on_stack"]
    D --> G["Process dependencies using borrowed dep: &Path"]
    G --> H["Use dep directly in indices, lowlinks, on_stack lookups"]
    H --> I{"SCC root detected?"}
    I -->|"Yes"| J["Pop PathBuf nodes from stack to form SCC"]
    I -->|"No"| K["Continue DFS traversal"]
    J --> L["End traversal for v"]
    K --> L
Loading

File-Level Changes

Change Details Files
Avoid repeated PathBuf allocations in Tarjan SCC state lookups during invalidation graph traversal.
  • Replace HashMap/HashSet lookups that used v.to_path_buf() with lookups that use the borrowed &Path directly for indices and lowlinks.
  • Ensure lowlink and index retrieval and mutation operate on borrowed keys to avoid per-edge heap allocations while preserving algorithm semantics.
crates/flow/src/incremental/invalidation.rs
Simplify rule-engine function signatures by removing unnecessary explicit lifetimes from constraint and transform helpers.
  • Drop explicit lifetime parameter from check_var_in_constraints and accept a shared reference to RapidMap instead.
  • Drop explicit lifetime parameter from check_var_in_transform and accept a shared reference to Option instead, keeping behavior unchanged.
crates/rule-engine/src/check_var.rs
Extend Bolt performance notes with guidance on avoiding redundant PathBuf and clone allocations in graph algorithms.
  • Document the cost of repeated to_path_buf calls in TarjanState map/set operations and the benefit of using borrowed keys instead.
  • Add a general recommendation to avoid allocations when performing membership checks in RapidSet/RapidMap and only allocate on insertion/ownership transfer.
.jules/bolt.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimizes parts of the invalidation cycle-detection path (Tarjan SCC) by removing unnecessary PathBuf allocations during map/set lookups, and cleans up an unrelated clippy-driven lifetime signature issue.

Changes:

  • Use borrowed &Path keys directly for RapidMap<PathBuf, _> lookups inside tarjan_dfs, eliminating repeated to_path_buf() heap allocations on hot paths.
  • Remove redundant explicit lifetimes from helper functions in check_var.rs to satisfy clippy and simplify signatures.
  • Document the performance learning in .jules/bolt.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
crates/rule-engine/src/check_var.rs Simplifies helper signatures by removing unnecessary lifetime parameters.
crates/flow/src/incremental/invalidation.rs Avoids to_path_buf() allocations in Tarjan SCC lookups by using borrowed &Path keys.
.jules/bolt.md Adds a note describing the performance rationale and recommended practice.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants