Optimizing compression ratio and read speeds for patch-based AI training #599
-
Hey @FrancescAlted,

We are currently evaluating the very promising Blosc2 file format as the new default image format for nnU-Net, superseding the uncompressed NumPy files we use today. nnU-Net can train on both 2D and 3D medical images with channels, so images are stored as 3D and 4D arrays. Since medical images can be quite large (e.g., 3000³ voxels in 3D or 40,000² pixels in 2D), nnU-Net only loads random patches from the memory-mapped arrays during training. The uncompressed NumPy arrays take up a lot of disk space, so an alternative that achieves reasonable compression ratios while maintaining read speeds similar to uncompressed NumPy files would be very interesting. Preliminary evaluation results of Blosc2 have shown very good performance on both counts in our setting.

The next question is how to find good rules/heuristics for choosing the chunk and block size depending on the image and patch size. As the patch locations are chosen completely at random and do not align with chunk or block boundaries, multiple blocks and chunks might need to be loaded depending on the patch, block, and chunk size. Furthermore, data loading happens in a multiprocessing environment, with each worker (usually 12) loading a random patch from a random image. As a result, we set the number of Blosc2 threads to 1 for each worker.

Our next step is to test different combinations of these parameters on multiple datasets to find some general (preferably simple) rules for automatically setting the chunk and block size based on the image and patch size. Our first assumption would be to choose something like 1/2 or 1/4 of the patch size for the chunk size, while also not exceeding the L3 cache (see #598). Given the chunk size, Blosc2 could then automatically choose a block size that, we assume, would fit into the L2/L1 cache.

Do you have any additional empirical advice for us on this question, as you obviously have a much deeper understanding of Blosc2?

Kind regards
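For concreteness, a minimal python-blosc2 sketch of the setup described above might look as follows. The array shape, patch size, codec, compression level, and file name are illustrative assumptions, not values from the post:

```python
import numpy as np
import blosc2

# Toy stand-ins (assumptions): a 3D float32 image and a cubic training patch
image = np.random.rand(256, 256, 256).astype(np.float32)
patch = (128, 128, 128)
chunks = tuple(max(1, p // 2) for p in patch)  # heuristic: chunk = 1/2 patch per axis

# Write the compressed array to disk with explicit chunks;
# blocks=None lets Blosc2 derive a block size (ideally fitting L2/L1 cache)
arr = blosc2.asarray(
    image,
    chunks=chunks,
    blocks=None,
    urlpath="image.b2nd",
    mode="w",
    cparams={"codec": blosc2.Codec.ZSTD, "clevel": 5, "nthreads": 1},
)

# During training, each worker opens the array and slices out a random patch;
# nthreads=1 avoids oversubscription when many workers decompress in parallel
arr = blosc2.open("image.b2nd", dparams={"nthreads": 1})
start = [np.random.randint(0, s - p + 1) for s, p in zip(arr.shape, patch)]
patch_data = arr[tuple(slice(i, i + p) for i, p in zip(start, patch))]
```

Passing explicit `chunks` is where a 1/2- or 1/4-of-patch heuristic would be wired in; slicing an on-disk NDArray decompresses only the chunks the patch touches.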
-
As you know by now, Blosc2 has a lot of knobs to tune, and as I like to say, there is no replacement for experimentation, so I agree with your plan :-). I can only add that we want to spare users from this (sometimes) daunting task, so we offer consulting services via ironArray SLU (https://ironarray.io) in case you need more expertise for evaluating different compression codecs/filters, chunk/block sizes, and other settings. Feel free to contact us; we are here to help!
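As a rough illustration of that experimentation, a small grid search over codecs and chunk fractions could look like the sketch below (the codec list, chunk fractions, repetition count, and shapes are arbitrary assumptions, not a recommended benchmark):

```python
import time
import numpy as np
import blosc2

image = np.random.rand(256, 256, 256).astype(np.float32)
patch = (128, 128, 128)

for codec in (blosc2.Codec.LZ4, blosc2.Codec.ZSTD):
    for frac in (2, 4):  # chunk = patch / frac per axis
        chunks = tuple(max(1, p // frac) for p in patch)
        arr = blosc2.asarray(image, chunks=chunks,
                             cparams={"codec": codec, "clevel": 5})
        ratio = arr.schunk.nbytes / arr.schunk.cbytes

        # Time random-patch reads, mimicking the training access pattern
        t0 = time.perf_counter()
        for _ in range(100):
            start = [np.random.randint(0, s - p + 1)
                     for s, p in zip(arr.shape, patch)]
            _ = arr[tuple(slice(i, i + p) for i, p in zip(start, patch))]
        dt = time.perf_counter() - t0

        print(f"{codec.name} chunks={chunks}: ratio={ratio:.2f}x, "
              f"{100 / dt:.1f} patches/s")
```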