I am working with very high-resolution TIFF images that contain many channels (19+). The images are large (around 60k) in terms of both pixel dimensions and file size.
I am encountering challenges processing these images efficiently within the arc-analysis pipeline, particularly in terms of memory usage, processing time, and potential issues with scaling or handling such large datasets.
Could anyone share tips or best practices for working with large, high-resolution TIFF images with many channels in the arc-analysis pipeline? Are there recommended techniques (besides creating multiple FOVs) for optimizing memory usage, improving processing speed, or handling such data more effectively?
Hi @vyshakha, how much memory do you have available on your machine?
Loading an entire 60000x60000 image with 19 channels as np.float64 (which our pipeline does on a per-FOV basis) takes 60000 x 60000 x 19 x 8 bytes, or roughly 547 GB; even a single channel is about 29 GB. That is far more than 64 GB of memory, so the pipeline would crash on such a machine regardless of what else is running in the background.
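For reference, here is a rough way to estimate the footprint yourself. This is only a back-of-the-envelope sketch: it assumes the whole image is held as one dense array of shape (60000, 60000, 19), and `image_memory_gb` is a hypothetical helper, not part of the pipeline. The actual peak usage may differ because of intermediate copies.

```python
import numpy as np

def image_memory_gb(height, width, channels, dtype):
    """Size in GB (10**9 bytes) of a dense array with the given shape and dtype."""
    bytes_per_value = np.dtype(dtype).itemsize
    return height * width * channels * bytes_per_value / 1e9

# Compare the three precisions mentioned in this thread.
for dtype in (np.float64, np.float32, np.float16):
    gb = image_memory_gb(60_000, 60_000, 19, dtype)
    print(f"{np.dtype(dtype).name}: {gb:,.1f} GB")
```

Under these assumptions the np.float64 figure comes to about 547 GB, and halving the precision halves the footprint each time.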
If you have access to an HPC with significantly more RAM, it may be easiest to run the pipeline there. You could also try lowering the precision to np.float32 or np.float16 during preprocessing. This takes more computational expertise, since it involves changing the underlying loading functionality and casting the subsetted training dataset back to np.float64 (I believe the SOM function explicitly requires this to maintain maximum precision when determining the weights).
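The general idea of the precision change looks something like the sketch below. It does not use the pipeline's own loading or SOM functions; the array names are placeholders and the random data stands in for one FOV's pixel-by-channel matrix. The point is only that the large array stays in np.float32 while the small subset is cast back to np.float64.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for a pixel-by-channel matrix kept in reduced precision.
pixels32 = rng.random((10_000, 19), dtype=np.float32)

# Pick a 1% random subset of pixel rows (assumed proportion, for illustration).
n_keep = max(1, int(pixels32.shape[0] * 0.01))
keep_idx = rng.choice(pixels32.shape[0], size=n_keep, replace=False)

# Only the small subset is promoted back to float64 for training.
train64 = pixels32[keep_idx].astype(np.float64)

print(pixels32.dtype, pixels32.nbytes)  # full data, float32
print(train64.dtype, train64.nbytes)    # subset, float64
```

Note that np.float16 has a much smaller dynamic range (maximum about 65504), so it may clip or lose detail depending on your intensity values.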
You should also try to use an aggressive subset parameter (subset_proportion = 0.01 or even lower) so that the full training dataset can be loaded into memory.
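To get a feel for what the subset proportion does to the training set size, here is a minimal sketch of random pixel subsetting. `subset_pixels` is a hypothetical helper written for illustration and is not the pipeline's own implementation, which may sample differently.

```python
import numpy as np

def subset_pixels(pixel_matrix, subset_proportion=0.01, seed=0):
    """Randomly keep a fraction of rows (pixels) from a pixel-by-channel matrix."""
    rng = np.random.default_rng(seed)
    n_pixels = pixel_matrix.shape[0]
    n_keep = max(1, int(round(n_pixels * subset_proportion)))
    keep_idx = rng.choice(n_pixels, size=n_keep, replace=False)
    # Sort the indices so the kept pixels stay in their original order.
    return pixel_matrix[np.sort(keep_idx)]

# Small stand-in for one FOV: 200,000 pixels by 19 channels.
fov = np.zeros((200_000, 19), dtype=np.float32)
subset = subset_pixels(fov, subset_proportion=0.01)
print(subset.shape, subset.nbytes / fov.nbytes)
```

At full scale, 1% of a 60000x60000 image is 36 million pixels, which is about 5.5 GB as np.float64 across 19 channels, so a lower proportion may still be needed if you have several such images.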