Fix determinism in augmentation by Luugaaa · Pull Request #13 · MIC-DKFZ/batchgeneratorsv2

Luugaaa · 2025-07-18T17:39:37Z

Hi MIC-DKFZ team,

This PR fixes an issue that was preventing fully deterministic behavior in the augmentation pipeline, even when a seed was provided. I noticed this behavior when trying to train an nnUNet model deterministically. I'll submit a PR for the nnUNet side too, hoping it helps!

Problem

The main issue was that several transforms used PyTorch's random functions (torch.rand, torch.normal). In a multi-process environment such as MultiThreadedAugmenter (as used in nnUNet), the NumPy RNG is correctly seeded in each worker, but the PyTorch RNG is not. As a result, any transform that used torch for randomness produced unpredictable results.
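To illustrate the mechanism, here is a minimal sketch of two independent RNG streams where only one gets re-seeded. It is an assumption-laden toy, not code from the library: Python's built-in random module stands in for torch's global RNG so the snippet runs without PyTorch, and draw_numpy / draw_other are hypothetical names. The same reasoning applies to torch.rand, which is not affected by np.random.seed.

```python
import random

import numpy as np


def draw_numpy() -> float:
    # Randomness taken from NumPy's global RNG (the one seeded per worker).
    return float(np.random.uniform())


def draw_other() -> float:
    # Stand-in for a torch.rand call: an independent RNG that
    # np.random.seed does not touch.
    return random.random()


np.random.seed(1234)
first = (draw_numpy(), draw_other())

np.random.seed(1234)  # re-seed NumPy only, as the workers effectively do
second = (draw_numpy(), draw_other())

numpy_repeats = first[0] == second[0]
other_repeats = first[1] == second[1]
print(numpy_repeats, other_repeats)
```

The NumPy draw repeats after re-seeding, while the draw from the second RNG does not, which is the kind of mismatch the failing transforms showed.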

I also found a couple of related issues: the benchmark=True parameter in GaussianBlurTransform is inherently non-deterministic, and SpatialTransform had some unstable randomness from mixing torch and numpy operations.

Solution

The fix was to go through the library and make sure all random operations rely on NumPy's random generator. This way, everything is controlled by the single RNG that's properly seeded in the data loader's workers.
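As a rough sketch of the idea (not the actual diff), a transform can draw all of its random parameters from np.random so that a single np.random.seed call controls it. The function gaussian_noise_np and its sigma_range argument are hypothetical names used only for illustration.

```python
import numpy as np


def gaussian_noise_np(image: np.ndarray, sigma_range=(0.0, 0.1)) -> np.ndarray:
    """Hypothetical noise transform whose randomness comes only from NumPy."""
    # Both the noise level and the noise field use np.random, so one
    # np.random.seed call in the worker controls the whole transform.
    sigma = np.random.uniform(*sigma_range)
    noise = np.random.normal(0.0, sigma, size=image.shape)
    return image + noise.astype(image.dtype)


image = np.zeros((1, 8, 8), dtype=np.float32)

np.random.seed(42)
out_a = gaussian_noise_np(image)
np.random.seed(42)
out_b = gaussian_noise_np(image)

identical = bool(np.array_equal(out_a, out_b))
print(identical)
```

In a torch-based transform, the NumPy-generated values would presumably be converted to a tensor before being applied to the image; that step is left out here to keep the sketch dependency-free.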

Here are the transforms that were updated:

RandomTransform
SpatialTransform (for elastic deform)
MirrorTransform
SimulateLowResolutionTransform
GaussianBlurTransform
GaussianNoiseTransform
MultiplicativeBrightnessTransform
ContrastTransform
GammaTransform
RicianNoiseTransform
InvertImageTransform
RemoveRandomConnectedComponentFromOneHotEncodingTransform
ApplyRandomBinaryOperatorTransform

How It's Tested

To make sure these fixes work and to catch any future regressions, I've added a new testing script, determinism_test_pipeline.py.

The script tests every transform in the library on both 2D and 3D data. The key part is how it checks for determinism: it runs each transform twice, but before the second run it re-seeds only the NumPy RNG. This aims to mimic the multi-worker environment. Run against the original code, the script demonstrates the non-determinism.
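The check described above can be sketched roughly as follows. This is not the content of determinism_test_pipeline.py; check_determinism and the two toy gamma transforms are hypothetical, and Python's random module again stands in for torch's RNG so the example runs with NumPy alone.

```python
import random

import numpy as np


def check_determinism(transform, data: np.ndarray, seed: int = 0):
    """Run `transform` twice, re-seeding only NumPy before each run."""
    np.random.seed(seed)
    out_1 = transform(data.copy())
    np.random.seed(seed)  # any other RNG is deliberately left alone
    out_2 = transform(data.copy())
    max_diff = float(np.max(np.abs(out_1 - out_2)))
    return max_diff == 0.0, max_diff


def numpy_only_gamma(x: np.ndarray) -> np.ndarray:
    # Gamma drawn from NumPy: controlled by np.random.seed.
    return x ** np.random.uniform(0.7, 1.5)


def mixed_rng_gamma(x: np.ndarray) -> np.ndarray:
    # Gamma drawn from `random`, standing in for torch's RNG.
    return x ** random.uniform(0.7, 1.5)


data = np.linspace(0.1, 1.0, 16).reshape(4, 4)
ok_fixed, diff_fixed = check_determinism(numpy_only_gamma, data)
ok_mixed, diff_mixed = check_determinism(mixed_rng_gamma, data)
print(ok_fixed, diff_fixed)
print(ok_mixed, diff_mixed > 0.0)
```

The NumPy-only transform passes with a maximum difference of zero, while the one drawing from the un-reseeded RNG fails with a non-zero difference, mirroring the PASS and FAIL lines in the logs below.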

With these changes, the whole library now passes this test, so augmentation pipelines should be reproducible whenever the NumPy RNG is seeded. That in turn should allow for reproducible training pipelines, which is a big deal for research. The performance impact should be minimal, and performance might even improve slightly since some inefficient operations and the benchmarking overhead were removed.

Thanks for maintaining this great library. Hope this helps, and let me know what you think!

Post Scriptum

Here are the results obtained when running the test pipeline with the original transformations:

--- Starting Determinism Test Pipeline ---

--- Part 1: Testing all transforms on 3D data ---

🧪 Testing MultiplicativeBrightnessTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessAdditiveTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ContrastTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GammaTransform (3D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.7208588123321533

🧪 Testing GaussianNoiseTransform (3D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.19622468948364258

🧪 Testing InvertImageTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing CutOffOutliersTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessGradientAdditiveTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalContrastTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalGammaTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalSmoothingTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ApplyRandomBinaryOperatorTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RemoveRandomConnectedComponentFromOneHotEncodingTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BlankRectangleTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianBlurTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MedianFilterTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RicianNoiseTransform (3D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.02650284767150879

🧪 Testing SharpeningTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SimulateLowResolutionTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MirrorTransform (3D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 4.999321937561035

🧪 Testing Rot90Transform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SpatialTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing TransposeAxesTransform (3D)...
    - ✅ [PASS] Outputs are identical.


--- Part 2: Testing all transforms on 2D data ---

🧪 Testing MultiplicativeBrightnessTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessAdditiveTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ContrastTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GammaTransform (2D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.7208588123321533

🧪 Testing GaussianNoiseTransform (2D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.19330227375030518

🧪 Testing InvertImageTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing CutOffOutliersTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessGradientAdditiveTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalContrastTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalGammaTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalSmoothingTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ApplyRandomBinaryOperatorTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RemoveRandomConnectedComponentFromOneHotEncodingTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BlankRectangleTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianBlurTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MedianFilterTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RicianNoiseTransform (2D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 0.025466203689575195

🧪 Testing SharpeningTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SimulateLowResolutionTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MirrorTransform (2D)...
    - ❌ FAIL: Tensor mismatch for key 'image'. Max difference: 5.283004283905029

🧪 Testing Rot90Transform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Skipping SpatialTransform (3D-only)...

🧪 Testing TransposeAxesTransform (2D)...
    - ✅ [PASS] Outputs are identical.


--- Part 3: Testing composed pipeline on sample_image.jpg (fixed transforms only) ---
✅ Loaded, resized, and saved 'sample_image.jpg' as 'sample_image_original.png'.
✅ Created a composed pipeline with 11 fixed transforms.
✅ Saved 'sample_image_augmented.png'.

--- Test Pipeline Finished ---
Total Checks: 45 | ✅ Passed: 37 | ❌ Failed: 8
--------------------------------

🔥 Some augmentations failed the determinism check. Please review the logs above.

[Image: original image]

[Image: augmented image]

And here are the results obtained when running the test pipeline with the fixed transformations:

--- Starting Determinism Test Pipeline ---

--- Part 1: Testing all transforms on 3D data ---

🧪 Testing MultiplicativeBrightnessTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessAdditiveTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ContrastTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GammaTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianNoiseTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing InvertImageTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing CutOffOutliersTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessGradientAdditiveTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalContrastTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalGammaTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalSmoothingTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ApplyRandomBinaryOperatorTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RemoveRandomConnectedComponentFromOneHotEncodingTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BlankRectangleTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianBlurTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MedianFilterTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RicianNoiseTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SharpeningTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SimulateLowResolutionTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MirrorTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing Rot90Transform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SpatialTransform (3D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing TransposeAxesTransform (3D)...
    - ✅ [PASS] Outputs are identical.


--- Part 2: Testing all transforms on 2D data ---

🧪 Testing MultiplicativeBrightnessTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessAdditiveTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ContrastTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GammaTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianNoiseTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing InvertImageTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing CutOffOutliersTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BrightnessGradientAdditiveTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalContrastTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalGammaTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing LocalSmoothingTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing ApplyRandomBinaryOperatorTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RemoveRandomConnectedComponentFromOneHotEncodingTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing BlankRectangleTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing GaussianBlurTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MedianFilterTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing RicianNoiseTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SharpeningTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing SimulateLowResolutionTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing MirrorTransform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Testing Rot90Transform (2D)...
    - ✅ [PASS] Outputs are identical.

🧪 Skipping SpatialTransform (3D-only)...

🧪 Testing TransposeAxesTransform (2D)...
    - ✅ [PASS] Outputs are identical.


--- Part 3: Testing composed pipeline on sample_image.jpg (fixed transforms only) ---
✅ Loaded, resized, and saved 'sample_image.jpg' as 'sample_image_original.png'.
✅ Created a composed pipeline with 11 fixed transforms.
✅ Saved 'sample_image_augmented.png'.

--- Test Pipeline Finished ---
Total Checks: 45 | ✅ Passed: 45 | ❌ Failed: 0
--------------------------------

🎉 All augmentation tests passed and are deterministic!

[Image: augmented image]

Luugaaa · 2025-07-18T21:05:52Z

The related PR in nnunet is here :)

FabianIsensee · 2026-03-09T15:11:53Z

Hey, thanks for this work! One question from my side: this PR is based on the assumption that the torch RNG is not properly seeded in a multiprocessing environment. Why is it not possible to just seed all used RNGs to obtain deterministic augmentations? I would like to avoid bouncing back and forth between torch and numpy all the time. The few times that we are doing this already bug me quite a lot.

sifaoso · 2026-06-05T14:53:02Z

Hey, following @FabianIsensee's comment, I've opened a new PR in batchgenerators that seeds all used RNGs in the multiprocessing environment of MultiThreadedAugmenter, so that augmentations are deterministic when seeds is set.
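For readers comparing the two approaches, seeding every RNG in a worker could look roughly like the sketch below. It is not taken from the batchgenerators PR: seed_all_rngs is a hypothetical helper, the base_seed + worker_id scheme is an assumption, and torch is seeded only if it happens to be installed.

```python
import random

import numpy as np

try:
    import torch  # optional: only seeded when PyTorch is installed
except ImportError:
    torch = None


def seed_all_rngs(base_seed: int, worker_id: int) -> int:
    """Hypothetical helper: seed every RNG a worker might use."""
    # np.random.seed only accepts values that fit in an unsigned 32-bit int.
    seed = (base_seed + worker_id) % (2**32)
    random.seed(seed)
    np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
    return seed


seed_all_rngs(1234, worker_id=0)
first = (random.random(), float(np.random.uniform()))
seed_all_rngs(1234, worker_id=0)
second = (random.random(), float(np.random.uniform()))

other_worker_seed = seed_all_rngs(1234, worker_id=1)
print(first == second, other_worker_seed)
```

With this kind of helper, transforms can keep using torch for randomness, at the cost of having to seed each RNG explicitly in every worker.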

Luugaaa added 4 commits July 18, 2025 10:46

deterministic fix (1525347)
missing np (61ee047)
determinism pipeline testing (eb14c61)
note on benchmark (df02edb)

This was referenced Jul 18, 2025

Determinism implementation in nnunet MIC-DKFZ/nnUNet#2871

Open

Ablation study, reproducibility and determinism ivadomed/model_seg_sc-gm-lesion_human_ms_exvivo_t2star#23

Open

sifaoso mentioned this pull request Jun 5, 2026

[FIX] deterministic augmentations in MTA by setting seed of all used RNGs in the MTA multiprocessing environment MIC-DKFZ/batchgenerators#139

Open

Luugaaa wants to merge 4 commits into MIC-DKFZ:master from Luugaaa:fix_determinism_in_augmentation
