
Tips for Handling Large High-Resolution TIFF Images with Multiple Channels in the arc-analysis Pipeline #1178

Open · vyshakha opened this issue on Jan 10, 2025 · 1 comment

vyshakha commented on Jan 10, 2025:

I am working with large, very high-resolution TIFF images that contain many channels. Specifically, these images have 19+ channels and are quite large (around 60,000 pixels per side), both in pixel dimensions and in file size.
I am encountering challenges processing these images efficiently within the arc-analysis pipeline, particularly in terms of memory usage, processing time, and potential issues with scaling to such large datasets.
Could anyone share tips or best practices for working with large, high-resolution, multi-channel TIFF images in the arc-analysis pipeline? Are there recommended techniques (besides splitting the image into multiple FOVs) for optimizing memory usage, improving processing speed, or handling such data more effectively?

vyshakha added the question label on Jan 10, 2025
alex-l-kong (Contributor) commented:

Hi @vyshakha, how much memory do you have to work with on your machine?

Loading an entire 60000x60000 image with 19 channels can take well over 50 GB of memory assuming an np.float64 representation (which our pipeline uses on a per-FOV basis). Even with 64 GB of RAM, this could crash the pipeline if several other processes are running in the background.
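For reference, the back-of-envelope arithmetic looks like this (the 60,000 × 60,000 and 19-channel figures are taken from the issue description; this is illustrative math, not pipeline code):

```python
import numpy as np

# Approximate in-memory size of a single FOV loaded as a dense array.
height, width, n_channels = 60_000, 60_000, 19

bytes_per_value = np.dtype(np.float64).itemsize  # 8 bytes
total_bytes = height * width * n_channels * bytes_per_value

print(f"one channel: {height * width * bytes_per_value / 1e9:.1f} GB")  # ~28.8 GB
print(f"full stack : {total_bytes / 1e9:.1f} GB")                       # ~547.2 GB

# Halving precision halves the footprint:
for dtype in (np.float32, np.float16):
    gb = height * width * n_channels * np.dtype(dtype).itemsize / 1e9
    print(f"{np.dtype(dtype).name}: {gb:.1f} GB")
```

Even a single channel at float64 is roughly 29 GB, so a machine with 64 GB can only hold a couple of full-resolution channels at once.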

If you have access to an HPC with significantly more RAM, it may be easiest to run the pipeline there. You could also try lowering the precision to np.float32 or np.float16 during preprocessing, but this requires more computational expertise: it involves changing the underlying loading functionality and casting the subsetted training dataset back to np.float64 (I believe the SOM function explicitly requires this to maintain maximum precision when determining the weights).
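A minimal sketch of what that precision change could look like, assuming the channel TIFFs are read with tifffile; the function names and structure here are placeholders, since the pipeline's actual loading code differs:

```python
import numpy as np
import tifffile

# Hypothetical helper: read a channel TIFF and downcast immediately, so
# a full float64 copy of the image never materializes. In the real
# pipeline, the equivalent .astype(...) change would need to go inside
# its own loading functions.
def load_channel(path, dtype=np.float32):
    return tifffile.imread(path).astype(dtype, copy=False)

# Only the (much smaller) subsetted training data gets cast back to
# float64 before SOM training, which reportedly expects full precision:
def prepare_training_subset(subset):
    return subset.astype(np.float64)
```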

You should also use an aggressive subsetting parameter (subset_proportion = 0.01 or even lower) so that the subsetted training dataset can fit in memory; see the sketch below.
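To make concrete what an aggressive subset buys you, here is an illustrative sketch (the pipeline performs this subsetting internally via the subset_proportion argument; the array below is a random stand-in, not real data):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for the flattened (n_pixels, n_channels) matrix the pixel
# clustering step trains on. Real data would be far larger.
n_pixels, n_channels = 1_000_000, 19
pixel_matrix = rng.random((n_pixels, n_channels), dtype=np.float32)

subset_proportion = 0.01  # the aggressive setting suggested above
n_keep = int(n_pixels * subset_proportion)
keep_idx = rng.choice(n_pixels, size=n_keep, replace=False)

# Upcast only the small subset to float64 for SOM training.
train_subset = pixel_matrix[keep_idx].astype(np.float64)

print(train_subset.shape)         # (10000, 19)
print(train_subset.nbytes / 1e6)  # ~1.5 MB vs ~152 MB for the full matrix at float64
```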
