Add scib-rapids to scverse ecosystem packages #344
Add meta.yaml for scib-rapids package with details
|
Hi @maarten-devries, Thanks for submitting
It looks to me like the tests are disabled on push/PR and are only run manually, because the package requires a GPU to run them successfully. Would it be possible to use a GitHub-hosted or self-hosted GPU runner for the tests? Best, |
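A common complement to a GPU runner is to keep GPU-only tests in the default suite and skip them when no CUDA device is visible, so that push/PR runs on CPU-only machines still pass. The following is a minimal sketch of that pattern, not the package's actual test setup: the helper `gpu_available`, the decorator name `requires_gpu`, and the test class are hypothetical, and it assumes CuPy is the GPU array library.

```python
import unittest


def gpu_available() -> bool:
    """Return True if CuPy is importable and sees at least one CUDA device."""
    try:
        import cupy  # optional dependency; absent on CPU-only machines

        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Covers a missing CuPy install as well as a missing driver or device.
        return False


# Hypothetical decorator: skips the test unless a GPU was detected at import time.
requires_gpu = unittest.skipUnless(gpu_available(), "requires a CUDA-capable GPU")


class TestGpuMetrics(unittest.TestCase):
    @requires_gpu
    def test_sum_on_device(self):
        import cupy

        x = cupy.arange(4)  # 0 + 1 + 2 + 3
        self.assertEqual(int(x.sum()), 6)
```

On a CPU-only runner the test is reported as skipped rather than failed, so the workflow can stay enabled on push/PR while a GPU runner executes the same suite in full.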
|
We've had good experiences with https://cirun.io/ for custom GPU runners. The problem is that you need to host them somewhere (in our case AWS) and pay for them. Running GPU tests would of course be ideal, but with jax, couldn't they at least be run on CPU? If there's enough interest, we could consider moving the repo into the scverse org and then using our GPU runners. Or maybe there's interest in upstreaming this into scib-metrics? Ping @ori-kron-wis |
|
Thanks for this package, looking forward to testing it. I have a few questions (I also posted them in the package repo): Is there some performance benchmark information you can share, i.e., comparing this implementation vs. the scib-metrics (JAX-based) implementation on the same HW and data (CPU and GPU)? Is there a specific version of jax/RAPIDS at which one outperforms the other, or does it hold for all versions? Is the benefit coming from the NN part? I would like to see a tutorial on that as well. Is there any aim to also provide spatial transcriptomics metrics here (it seems RAPIDS can do a good job there using spatialdata)? Re: testing on GPUs, can you set up a self-hosted local CUDA server to run them (this is what we do in scvi-tools)? |
|
Thanks for the reviews everyone! @mikkelnrasmussen Good point on the CI — I did indeed disable the tests on push/PR because they require a GPU. I can look into setting up a GPU runner (self-hosted or via cirun.io) to get those running automatically. @grst To clarify — scib-rapids does not use JAX. That's actually the whole raison d'être of the package :). @ori-kron-wis I will do some timing benchmarks and equivalency tests between scib-metrics and scib-rapids and provide them here as well. To set expectations: the goal is not necessarily that everything is faster (though that could happen as a side effect) — it's mostly to avoid the heavy JAX install dependency. This could be especially nice for people who already use rapids-singlecell anyway and don't want to pull in JAX on top of that. |
Thanks for clarifying. In retrospect, I don't know why I even had the idea that it was JAX-based. Maybe because I looked up scib-metrics in between. |
Benchmark: scib-rapids vs scib-metrics
As requested by @mikkelnrasmussen, here are head-to-head GPU benchmark results comparing scib-rapids (CuPy/cuML) against scib-metrics (JAX) on real scRNA-seq data.
Setup:
Equivalency
All deterministic metrics match to <0.03% relative difference. Two metrics show small expected diffs explained below.
† kbet_per_label (~0.5-0.9% diff):
‡ kmeans NMI/ARI (~1-4% diff): Both packages use
Timing (seconds)
Key takeaways:
|
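The equivalency criterion quoted in the benchmark comment (metrics matching to within a small relative difference) and the timing comparison can be sketched as follows. This is an illustrative harness only, not the code behind the reported numbers: the function names, the metric names `metric_a` and `metric_b`, and all values are hypothetical.

```python
import time


def relative_diff_percent(reference: float, candidate: float) -> float:
    """Relative difference of candidate with respect to reference, in percent."""
    if reference == 0.0:
        return 0.0 if candidate == 0.0 else float("inf")
    return abs(candidate - reference) / abs(reference) * 100.0


def compare_metrics(reference: dict, candidate: dict, tol_percent: float = 0.03) -> dict:
    """Map each metric name to (relative diff in percent, within tolerance?)."""
    report = {}
    for name, ref_value in reference.items():
        diff = relative_diff_percent(ref_value, candidate[name])
        report[name] = (diff, diff < tol_percent)
    return report


def time_call(fn, *args, repeats: int = 3) -> float:
    """Best wall-clock time in seconds over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical values for illustration; these are not benchmark results.
reference = {"metric_a": 0.5000, "metric_b": 0.8000}
candidate = {"metric_a": 0.5001, "metric_b": 0.8100}
report = compare_metrics(reference, candidate)
elapsed = time_call(sum, [1, 2, 3])
```

With these made-up inputs, `metric_a` differs by about 0.02% and falls within the 0.03% tolerance, while `metric_b` differs by 1.25% and would be flagged for explanation. When timing GPU code, the device generally needs to be synchronized before the clock is stopped, a step this CPU-only sketch omits.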
|
Thanks @maarten-devries - did you compare the jax[cuda] or jax[cpu] and which version? It wasn't stated. |
|
A couple of things:
|
|
Hi all, sorry for the late reply. I have some time again and will address your questions asap. |
|
Thanks everyone for the patience. I've cleaned up scib-benchmark with a proper README (methodology, raw metric values, timing tables for n=1k and n=20k, and notes on the small numerical diffs). Install commands there use
@ori-kron-wis: yes, it's
@ilan-gold: all four points addressed in scib-rapids:
GPU CI runner is still on my list (cirun.io or self-hosted) — tackling next. |
Name of the tool: scib-rapids
Short description: GPU-accelerated single-cell integration benchmarking metrics using RAPIDS (cuML, CuPy) as a drop-in replacement for the JAX-based metrics in scib-metrics.
How does the package use scverse data structures: scib-rapids takes AnnData objects as input throughout its API, reading embeddings, neighbors graphs, and cluster labels from the standard AnnData slots (e.g. obsm, obsp, obs), consistent with the scverse ecosystem conventions.
Mandatory
pip install scib-rapids)
Recommended