proposal for regression testing #490

Open
HansVRP wants to merge 13 commits into main from hv_regresion_benchmark



HansVRP commented Apr 30, 2026

Idea for starting to include regression benchmarks.

@JanssenBrm I would also need input on how best to expose the results so that we can keep a log in the service catalogue.


algorithm-services-catalogue Bot commented Apr 30, 2026

🔍 Catalogue's Preview Site Deployed

Your changes have been deployed to the preview site:

🔗 Preview URL: https://esa-apex.github.io/apex-algorithms-catalogue-web/pr-preview/pr-490/

This preview will be updated automatically when you push new changes to your PR.

VictorVerhaert self-assigned this May 4, 2026
HansVRP force-pushed the hv_regresion_benchmark branch 2 times, most recently from 44710b2 to 16c9681, May 7, 2026 07:58
HansVRP force-pushed the hv_regresion_benchmark branch from 42a8863 to be6e8ed, May 13, 2026 07:17

HansVRP commented May 13, 2026

@JanssenBrm @VictorVerhaert ready to check. I have opted for a more adaptive benchmark that looks at the average and the standard deviation. The more successful runs there are, the more decisive the benchmark becomes.
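
A minimal sketch of what such an adaptive check could look like, assuming a mean-plus-`k`-standard-deviations threshold as described in the comment above. The function names and the schedule for `k` are hypothetical illustrations, not taken from the PR, and the diff excerpt further down uses the median and a scaled MAD instead.

```python
from statistics import mean, stdev


def adaptive_k(n_runs: int) -> float:
    """Hypothetical schedule: wide tolerance with few runs, tighter with more."""
    # 6.0 at n=2, shrinking linearly to 3.0 at n>=10 (illustrative values only).
    n = max(2, min(n_runs, 10))
    return 6.0 - 3.0 * (n - 2) / 8


def is_regression(history: list[float], new_value: float) -> bool:
    """Flag new_value if it exceeds mean + k * std of the successful runs."""
    if len(history) < 2:
        # Not enough data to estimate a spread, so never fail.
        return False
    threshold = mean(history) + adaptive_k(len(history)) * stdev(history)
    return new_value > threshold
```

With only a handful of runs the multiplier stays large, so the check rarely fails; as the history grows, the tolerance narrows and the check carries more weight.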

HansVRP requested a review from JeroenVerstraelen May 13, 2026 09:35

HansVRP commented Jun 2, 2026

@JanssenBrm @JeroenVerstraelen @VictorVerhaert all feedback is welcome


VictorVerhaert left a comment


Two small optional comments aimed at preventing false failures. One more question: could you try running it with GitHub Actions to see how it behaves in practice?
Otherwise the PR looks clean.

scaled_mad = 1.4826 * _median([abs(v - median) for v in values])

k = _adaptive_k(min(n, 10))
threshold = median + k * scaled_mad

Consider adding an absolute buffer for the cost metric here. I wouldn't let a benchmark fail if it suddenly costs 9 instead of 8.
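
One possible reading of this suggestion, as a rough sketch: keep the median plus `k` times scaled MAD formula from the excerpt above, but never let the margin drop below a fixed amount. The function name `regression_threshold` and the parameter `abs_buffer` are hypothetical, and using `max` rather than adding the buffer on top is just one option.

```python
from statistics import median


def regression_threshold(values, k, abs_buffer=0.0):
    """Median + k * scaled MAD, with a minimum absolute margin (assumed design)."""
    med = median(values)
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    scaled_mad = 1.4826 * median([abs(v - med) for v in values])
    # The margin is never smaller than abs_buffer, so a perfectly stable
    # history (MAD of 0) does not turn every tiny increase into a failure.
    return med + max(k * scaled_mad, abs_buffer)


history = [8, 8, 8, 8, 8]
strict = regression_threshold(history, k=3.0)
buffered = regression_threshold(history, k=3.0, abs_buffer=2.0)
```

For a history that always cost 8, the MAD is 0, so without a buffer a cost of 9 already exceeds the threshold. With `abs_buffer=2.0` the threshold becomes 10 and a cost of 9 passes.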

)


def load_scenario_history(

Consider adding a date cutoff field here, which should equal the `updated` field in the record. That way, when a benchmark is updated, its performance test history is reset.
This is not required and could overcomplicate things, but without it we might get a lot of false failures.
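
A small sketch of how such a cutoff could work, assuming each history entry carries an ISO 8601 `timestamp` and the record's `updated` value is also an ISO 8601 string. The helper name `filter_history` and the entry layout are hypothetical; the real signature of `load_scenario_history` is not shown in this excerpt.

```python
from datetime import datetime


def filter_history(runs, updated):
    """Keep only runs recorded at or after the record's `updated` date."""
    cutoff = datetime.fromisoformat(updated)
    # All timestamps here carry an explicit UTC offset; mixing naive and
    # timezone-aware datetimes would raise a TypeError on comparison.
    return [r for r in runs if datetime.fromisoformat(r["timestamp"]) >= cutoff]


runs = [
    {"timestamp": "2026-05-01T00:00:00+00:00", "cost": 8},
    {"timestamp": "2026-05-10T00:00:00+00:00", "cost": 12},
]
recent = filter_history(runs, updated="2026-05-07T00:00:00+00:00")
```

Runs older than the update are dropped, so a deliberate change to the benchmark starts from a fresh baseline instead of being compared against outdated measurements.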
