FastSegmentator

Fast inference pipeline for nnU-Net and TotalSegmentator, the most popular frameworks for medical image segmentation. This project provides a clean, minimal inference module with only the necessary components, plus an end-to-end GPU fast-path in which every stage runs on the GPU: resampling (cucim plus a GPU cubic B-spline that matches scipy order=3 to ~1e-13), normalization, the sliding-window forward pass, logits→label conversion, cropping, and connected-component postprocessing. Across 24 parity-validated modes it reproduces official TotalSegmentator output at ≥0.999 DSC on the headline modes (≥0.995 on 21 of 24) while running 2.3–9.6× faster. The remaining runtime is forward-pass-bound; the rest is fixed import and model-load overhead, which amortizes over a batch.

Requirements

Python: 3.10
CUDA: 12.4
uv for environment management

Installation

Clone this repo. The vendored nnunetv2 lives in src/, and TotalSegmentator is expected as a sibling checkout (see [tool.uv.sources] in pyproject.toml):

git clone https://github.com/JunMa11/FastSegmentator.git
git clone https://github.com/wasserth/TotalSegmentator.git   # sibling of FastSegmentator
cd FastSegmentator

Create the environment and install everything (including the FastSegmentator command) in one step:

uv sync

uv sync builds the editable nnunetv2 package from src/, installs the pinned CUDA 12.1 torch wheels and cupy/cucim, and registers the FastSegmentator console script in .venv/.

Activate the environment so the FastSegmentator command is on your PATH:

source .venv/bin/activate

Data and Model Weights

Download the dataset and model weights from the Google Drive link.

Place the dataset in FastSegmentator/nnUNet_data/
Place the model weights in FastSegmentator/model_weights/

TotalSegmentator weights default to ~/.totalsegmentator/nnunet/results (override with --weights_dir).
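Before launching a run, it can help to confirm that the layout described above is in place. The following is a minimal sketch and not part of FastSegmentator: the helper name `check_layout` is hypothetical, and it relies only on the folder names and the default weights path stated in this section.

```python
from pathlib import Path

# Default TotalSegmentator weights location, as stated in this README.
DEFAULT_WEIGHTS_DIR = Path.home() / ".totalsegmentator" / "nnunet" / "results"


def check_layout(repo_root, weights_dir=DEFAULT_WEIGHTS_DIR):
    """Return a list of problems; an empty list means the layout looks fine.

    Hypothetical helper for illustration only.
    """
    repo_root = Path(repo_root)
    problems = []
    # Dataset and nnU-Net model weights are expected inside the repo checkout.
    for name in ("nnUNet_data", "model_weights"):
        if not (repo_root / name).is_dir():
            problems.append(f"missing folder: {repo_root / name}")
    # TotalSegmentator weights live outside the repo unless overridden.
    if not Path(weights_dir).is_dir():
        problems.append(f"TotalSegmentator weights not found at: {weights_dir}")
    return problems
```

Calling `check_layout(".")` from the FastSegmentator checkout returns an empty list when both folders and the default weights directory exist. Pass `weights_dir=` if you plan to use `--weights_dir`.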

Running Inference

With the environment activated, the FastSegmentator command dispatches to one of two backends:

FastSegmentator <command> [options]

Without activating, you can equivalently run uv run FastSegmentator ... or .venv/bin/FastSegmentator ....

`totalseg` — TotalSegmentator modes (config-driven)

FastSegmentator totalseg \
    -i <path_to_input_images> \
    -o <path_to_output_segmentations> \
    --task total

| Flag | Default | Description |
| --- | --- | --- |
| `-i`, `--input_path` | (required) | Folder of `*.nii.gz` input images |
| `-o`, `--output_path` | (required) | Folder to write multilabel output NIfTIs |
| `--task` | `total` | Mode (e.g. `total`, `total_mr`, `body_mr`, …) |
| `--weights_dir` | `~/.totalsegmentator/nnunet/results` | TotalSegmentator weights path |
| `--device` | `cuda` | Device (`cuda` or `cpu`) |

Run FastSegmentator totalseg --help for the full list of --task modes.
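To run several modes over the same input folder, the command can be assembled programmatically. This is a minimal sketch that uses only the flags documented above. The function name `totalseg_command` and the example paths are hypothetical, and the mode names are taken from the parity table on the assumption that they are valid `--task` values (check `--help` to confirm).

```python
import shlex


def totalseg_command(input_dir, output_dir, task="total", device="cuda", weights_dir=None):
    """Build the argument list for one `FastSegmentator totalseg` call."""
    cmd = [
        "FastSegmentator", "totalseg",
        "-i", str(input_dir),
        "-o", str(output_dir),
        "--task", task,
        "--device", device,
    ]
    # Only pass --weights_dir when overriding the default location.
    if weights_dir is not None:
        cmd += ["--weights_dir", str(weights_dir)]
    return cmd


# One command per mode, each writing to its own output folder (hypothetical paths).
for task in ("total", "lung_vessels", "liver_segments"):
    cmd = totalseg_command("./images", f"./seg/{task}", task=task)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.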

`nnunet` — generic nnU-Net model folder

FastSegmentator nnunet \
    -i <path_to_input_images> \
    -o <path_to_output_segmentations> \
    --model_path <path_to_model_weights>

| Flag | Default | Description |
| --- | --- | --- |
| `-i`, `--input_path` | (required) | Path to the input image folder |
| `-o`, `--output_path` | (required) | Path to save output segmentations |
| `--model_path` | (required) | Path to the trained model directory |
| `--fold` | `all` | Fold to use for inference |
| `--checkpoint` | `checkpoint_final.pth` | Checkpoint filename |
| `--use_softmax` | `False` | Apply softmax to output probabilities |
| `--device` | `cuda` | Device (`cuda` or `cpu`) |

Trainers. By design, the nnunet branch resolves only the standard nnU-Net trainers (nnUNetTrainer, nnUNetTrainerNoMirroring, nnUNetTrainerTopkLoss). To use a model trained with a custom trainer, point the nnUNet_extTrainer environment variable at the directory containing your trainer class so it can be resolved at checkpoint load:
export nnUNet_extTrainer=/path/to/your/trainers
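
When inference is launched from a script, the same variable can be set for the child process only, leaving the parent shell untouched. A minimal sketch, assuming nothing beyond the `nnUNet_extTrainer` variable and the `nnunet` flags documented above. The helper names `nnunet_env` and `run_nnunet` are hypothetical.

```python
import os
import subprocess


def nnunet_env(ext_trainer_dir, base_env=None):
    """Return a copy of the environment with nnUNet_extTrainer set."""
    env = dict(os.environ if base_env is None else base_env)
    env["nnUNet_extTrainer"] = str(ext_trainer_dir)
    return env


def run_nnunet(input_dir, output_dir, model_path, ext_trainer_dir):
    """Run `FastSegmentator nnunet` with the custom-trainer directory visible to it."""
    cmd = [
        "FastSegmentator", "nnunet",
        "-i", str(input_dir),
        "-o", str(output_dir),
        "--model_path", str(model_path),
    ]
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(cmd, env=nnunet_env(ext_trainer_dir), check=True)
```

Copying `os.environ` before modifying it keeps `PATH` and the CUDA-related variables intact for the child process.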

Example

FastSegmentator nnunet \
    -i ./nnUNet_data/Dataset701_AbdomenCT/imagesVal \
    -o ./seg \
    --model_path ./model_weights/701/nnUNetTrainerMICCAI_repvgg__nnUNetPlans__3d_fullres

Parity with official TotalSegmentator

The fast-path is validated to match official TotalSegmentator on the same input (parity, not vs. ground truth) across 24 modes. Overview + interactive figures: report/index.html; full per-mode report: report/validation_report.html.
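
Parity here means agreement between two label maps, measured as DSC. As a minimal sketch of that metric (not the project's validation code), per-label Dice between two outputs can be computed as below. `dice_per_label` is a hypothetical helper, label 0 is assumed to be background, and how the report aggregates per-label scores into one number per mode is not specified here.

```python
import numpy as np


def dice_per_label(ours, official):
    """Dice similarity coefficient (DSC) for every non-background label.

    Only labels present in at least one of the two volumes are scored.
    """
    ours = np.asarray(ours)
    official = np.asarray(official)
    if ours.shape != official.shape:
        raise ValueError("label maps must have the same shape")
    scores = {}
    for label in np.union1d(ours, official):
        if label == 0:  # assumed background
            continue
        a = ours == label
        b = official == label
        overlap = np.count_nonzero(a & b)
        # The denominator is non-zero because the label occurs in a or b.
        scores[int(label)] = float(2.0 * overlap / (np.count_nonzero(a) + np.count_nonzero(b)))
    return scores


# Toy 2x3 label maps: label 1 agrees fully, label 2 differs by one voxel.
ours = [[0, 1, 1], [2, 2, 0]]
official = [[0, 1, 1], [2, 0, 0]]
print(dice_per_label(ours, official))  # label 1 -> 1.0, label 2 -> 2*1/(2+1)
```

For real volumes, the two arrays would come from the output NIfTIs of each pipeline, loaded with a NIfTI reader of your choice.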

Of the 24 validated modes, 21 reach ≥0.995 DSC (every previously failing pathology mode is now ≥0.999), and 3 thin/sparse modes carry small, characterized caveats. Speedups range from 2.3× to 9.6×:

| Mode | Task ID(s) | DSC vs official | Speedup |
| --- | --- | --- | --- |
| `total` | 291–295 | 1.0000 | 9.6× |
| `liver_lesions` | 591 | 1.0000 | 4.9× |
| `liver_lesions_mr`¹ | 589 | 1.0000 | 6.2× |
| `liver_segments_mr` | 576 | 1.0000 | 6.7× |
| `trunk_cavities` | 343 | 1.0000 | 4.1× |
| `lung_vessels` | 117 | 0.9999 | 2.8× |
| `teeth` | 113 | 0.9999 | 3.9× |
| `total_mr` | 850,851 | 0.9999 | 9.5× |
| `lung_nodules` | 913 | 0.9999 | 7.5× |
| `vertebrae_mr` | 756 | 0.9999 | 4.6× |
| `lung_vessels_LEGACY` | 258 | 0.9999 | 4.2× |
| `craniofacial_structures` | 115 | 0.9998 | 4.0× |
| `body` | 299 | 0.9996 | 2.3× |
| `pleural_pericard_effusion` | 315 | 0.9990 | 9.3× |
| `head_muscles` | 777 | 0.9989 | 5.1× |
| `head_glands_cavities` | 775 | 0.9987 | 5.0× |
| `liver_segments` | 570 | 0.9984 | 5.2× |
| `abdominal_muscles` | 952 | 0.9981 | 4.6× |
| `headneck_bones_vessels` | 776 | 0.9968 | 4.5× |
| `oculomotor_muscles` | 351 | 0.9960 | 4.8× |
| `body_mr` | 597 | 0.9956 | 4.9× |
| `headneck_muscles` | 778,779 | 0.9945 | 4.9× |
| `kidney_cysts` | 789 | 0.9919 | 4.9× |
| `liver_vessels` | 8 | 0.9880 | 5.9× |

¹ liver_lesions_mr: an 86-voxel lesion on the crop boundary is nondeterministic on both pipelines (official itself flips 86/52 voxels across runs); our deterministic output matches official's same-draw result at DSC 1.0.
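
The summary figures can be re-derived from the table. The snippet below is only an arithmetic check on the values copied from the table above. It counts modes at or above 0.995 DSC and reports the speedup range.

```python
# (mode, DSC vs official, speedup) copied from the table above.
RESULTS = [
    ("total", 1.0000, 9.6), ("liver_lesions", 1.0000, 4.9),
    ("liver_lesions_mr", 1.0000, 6.2), ("liver_segments_mr", 1.0000, 6.7),
    ("trunk_cavities", 1.0000, 4.1), ("lung_vessels", 0.9999, 2.8),
    ("teeth", 0.9999, 3.9), ("total_mr", 0.9999, 9.5),
    ("lung_nodules", 0.9999, 7.5), ("vertebrae_mr", 0.9999, 4.6),
    ("lung_vessels_LEGACY", 0.9999, 4.2), ("craniofacial_structures", 0.9998, 4.0),
    ("body", 0.9996, 2.3), ("pleural_pericard_effusion", 0.9990, 9.3),
    ("head_muscles", 0.9989, 5.1), ("head_glands_cavities", 0.9987, 5.0),
    ("liver_segments", 0.9984, 5.2), ("abdominal_muscles", 0.9981, 4.6),
    ("headneck_bones_vessels", 0.9968, 4.5), ("oculomotor_muscles", 0.9960, 4.8),
    ("body_mr", 0.9956, 4.9), ("headneck_muscles", 0.9945, 4.9),
    ("kidney_cysts", 0.9919, 4.9), ("liver_vessels", 0.9880, 5.9),
]

at_or_above = [mode for mode, dsc, _ in RESULTS if dsc >= 0.995]
below = [mode for mode, dsc, _ in RESULTS if dsc < 0.995]
speedups = [speedup for _, _, speedup in RESULTS]

print(len(RESULTS), len(at_or_above))  # 24 21
print(below)                           # ['headneck_muscles', 'kidney_cysts', 'liver_vessels']
print(min(speedups), max(speedups))    # 2.3 9.6
```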

Three fixes brought the harder modes to parity (each isolated by bisecting against official's per-function intermediates):

GPU cubic-B-spline input resample: replaced order-1 trilinear (F.interpolate) with a separable order-3 cubic B-spline matching nnU-Net's skimage.transform.resize(order=3) to ~1e-13. (pleural_pericard_effusion, lung_nodules)
dtype=np.int32 on the cucim input resample: matches official's pre-model int truncation. (total_mr, liver_segments_mr, liver crops)
Per-mode softmax→argmax convert for low-confidence lesion modes. (liver_lesions, liver_lesions_mr)

In addition, the crop and connected-component postprocessing are ported to the GPU (bit-identical to the scipy originals), and the forward pass runs with cuDNN determinism for reproducibility.
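
To see why the integer dtype in the second fix can change results, recall that casting floating-point intensities to int32 truncates toward zero rather than rounding. The sketch below illustrates only that general NumPy behavior on made-up values. It is not the project's resampling code, and how the cast is applied inside the pipeline is not shown here.

```python
import numpy as np

# Made-up intensities, as they might look after interpolation.
resampled = np.array([-1.7, -0.2, 0.9, 41.5, 99.99])

# astype(int32) drops the fractional part (truncation toward zero).
truncated = resampled.astype(np.int32).tolist()

# Rounding to nearest first gives different integers for most of these values.
rounded = np.rint(resampled).astype(np.int32).tolist()

print(truncated)  # [-1, 0, 0, 41, 99]
print(rounded)    # [-2, 0, 1, 42, 100]
```

A one-unit intensity difference is small, but near a decision boundary it can shift a voxel's predicted label, which is the kind of effect a parity comparison would surface.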
