Use shared memory arrays for spawned multiprocessing #60
Merged
Conversation
- The index has not been written but can now be toggled by a parameter
- Convert docs to type hints for the function converting dictionaries to a data frame
The merge CSVs task uses the prefix CLI argument to identify the output filename and does not save any file if no prefix is given. Since the main role of the task is to merge multiple input CSV files into a single output file, provide a default output filename when no prefix is given.
Add parameters to accept an initializer function and its associated arguments when setting up a multiprocessing Pool.
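A rough sketch of the pattern (the `_init_worker` and `_shared` names are illustrative, not the project's actual API): the initializer and its arguments are forwarded when the Pool is created, and each worker runs the initializer once before handling tasks.

```python
import multiprocessing as mp

# Hypothetical module-level holder populated in each worker process
_shared = {}

def _init_worker(labels_img, resolution):
    # Runs once per worker; stores objects for later task functions
    _shared["labels_img"] = labels_img
    _shared["resolution"] = resolution

def _process_plane(i):
    # Worker task that reads the objects stored by the initializer
    return i, _shared["resolution"]

if __name__ == "__main__":
    with mp.Pool(initializer=_init_worker,
                 initargs=(None, (1.0, 1.0, 1.0))) as pool:
        print(pool.map(_process_plane, range(4)))
```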
Using a shared multiprocessing array allows the memory for the labels image to be shared rather than duplicating each label and pickling them. Accessing the shared array still incurs a significant performance penalty, however, roughly comparable to that of pickling in our benchmarks from atlas import, with greater gains when fewer initializations are needed, such as in the edge-aware reannotation task. `np.frombuffer` is used to reconstruct the shared array since other methods such as `np.asarray` and `np.ctypeslib.as_array` took up to 250% as long.
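A minimal sketch of the technique, assuming an int32 labels image; the helper names are illustrative rather than the PR's actual API. The image is copied once into a raw shared buffer, and workers rebuild an ndarray view with `np.frombuffer` instead of receiving a pickled copy.

```python
import ctypes
import multiprocessing as mp

import numpy as np

def to_shared_array(img):
    """Copy an int32 ndarray into a raw shared memory buffer (sketch only)."""
    shared = mp.RawArray(ctypes.c_int32, img.size)
    # Wrap the shared buffer without copying, then fill it with the image
    np.frombuffer(shared, dtype=np.int32).reshape(img.shape)[...] = img
    return shared, img.shape

def from_shared_array(shared, shape):
    """Reconstruct an ndarray view of the shared buffer; np.frombuffer was
    faster in the benchmarks than np.asarray or np.ctypeslib.as_array."""
    return np.frombuffer(shared, dtype=np.int32).reshape(shape)

labels = np.arange(12, dtype=np.int32).reshape(3, 4)
shared, shape = to_shared_array(labels)
assert np.array_equal(from_shared_array(shared, shape), labels)
```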
The stack splitter reduces chunk sizes if the combined max pixels and overlap sizes exceed any dimension, such as for the last chunk in a given direction or when the combined max and overlap pixels are larger than the stack itself. The split stack merger did not account for this potential truncation, however, which is fixed here. The fix still assumes that the stack is at least as large as the max pixels parameter.
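A simplified sketch of the merging side of this fix, with hypothetical names: each chunk's end coordinates are clipped to the stack bounds before placement, so a truncated chunk still lands in the correct slot.

```python
import numpy as np

def merge_chunks(chunks, starts, shape):
    """Merge sub-arrays back into a stack, clipping each chunk to the
    stack bounds in case the splitter truncated it (illustrative sketch)."""
    merged = np.zeros(shape, dtype=chunks[0].dtype)
    for chunk, start in zip(chunks, starts):
        # Clip the end coordinates so truncated chunks still fit
        end = [min(s + c, dim) for s, c, dim in zip(start, chunk.shape, shape)]
        slices = tuple(slice(s, e) for s, e in zip(start, end))
        trimmed = chunk[tuple(slice(0, e - s) for s, e in zip(start, end))]
        merged[slices] = trimmed
    return merged

# Usage with two overlapping 1D chunks
full = np.arange(10)
print(merge_chunks([full[0:6], full[4:10]], [(0,), (4,)], (10,)))
```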
Refactor into unit tests and test multiple overlap sizes.
Move this function to avoid importing `detector` in the chunking module, reducing the potential for circular imports.
Refactor generating and converting shared memory arrays for multiprocessing of labels images into a separate class. Also, generalize the initialization function.
Use a shared memory array for the labels image used to generate edges through multiprocessing in spawned mode.
Simplify extracting a label region by using the `cv_nd.extract_region` function. Also, warn if no region is found.
The shared image array container was designed for one array, but some tasks require multiple arrays. Generalize this class to take a dictionary of ndarrays, store them in a separate class for shared array instances, and convert shared arrays back to ndarrays when accessed by key.
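A rough sketch of such a container, with hypothetical class and attribute names: a dictionary of ndarrays is copied into shared buffers, metadata is kept in a separate holder for the shared instances, and an ndarray view is reconstructed when a key is accessed.

```python
import ctypes
import multiprocessing as mp

import numpy as np

class SharedArrs:
    """Hypothetical holder for raw shared arrays plus their metadata."""
    def __init__(self):
        self.arrs = {}    # key -> multiprocessing.RawArray
        self.shapes = {}  # key -> original array shape
        self.dtypes = {}  # key -> original dtype

class SharedImgs:
    """Sketch of a container converting a dict of ndarrays to shared
    memory arrays and back, accessed by key."""
    def __init__(self, imgs):
        self.shared = SharedArrs()
        for key, img in imgs.items():
            ra = mp.RawArray(ctypes.c_byte, img.nbytes)
            # Fill the shared buffer with the image contents
            np.frombuffer(ra, dtype=img.dtype).reshape(img.shape)[...] = img
            self.shared.arrs[key] = ra
            self.shared.shapes[key] = img.shape
            self.shared.dtypes[key] = img.dtype

    def get(self, key):
        # Reconstruct an ndarray view of the shared buffer for this key
        return np.frombuffer(
            self.shared.arrs[key], dtype=self.shared.dtypes[key]
        ).reshape(self.shared.shapes[key])

imgs = SharedImgs({"labels": np.arange(6, dtype=np.int32).reshape(2, 3)})
print(imgs.get("labels"))
```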
Use the shared memory array class for the multiple images and blobs arrays used to measure volume stats through multiprocessing in spawned mode.
The function to generate density images accesses a scaling function that in turn accesses a global atlas profile, which is not available during spawned multiprocessing. Refactor this atlas profile access so that the profile can be passed directly to the density function (see the sketch after this list). Additionally:
- Fix stale references in the density function docstring
- Clean up the function to get the transposed image path by removing unnecessary variable initialization and converting doc types to type hints
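A toy illustration of the refactor pattern, with hypothetical function names and profile contents: the profile is threaded through as an argument rather than read from module-level state that spawned workers never see.

```python
# Hypothetical sketch: accept the profile as a parameter instead of
# looking it up in a global that is absent after a spawn start.

def calc_scaling(img_shape, target_shape, profile=None):
    # Fall back to the passed-in profile rather than a module-level lookup
    scale = (profile or {}).get("scale", 1.0)
    return [scale * t / s for s, t in zip(img_shape, target_shape)]

def make_density_image(blobs_shape, labels_shape, atlas_profile):
    # The profile travels with the call, so spawned workers receive it
    # explicitly instead of relying on state set up in the parent process
    return calc_scaling(blobs_shape, labels_shape, profile=atlas_profile)

print(make_density_image((100, 200, 200), (50, 100, 100), {"scale": 1.0}))
```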
Lateral extension uses labels erosion serially, starting multiprocessing for each plane. This repeated starting can counteract the performance improvement of multiprocessing, especially in spawned mode, which induces considerable overhead for shared array setup. Provide an option to turn off multiprocessing during labels erosion and use it for lateral extension.
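A minimal sketch of the option, with a hypothetical signature: the caller can disable pool creation when it already drives the per-plane loop itself, avoiding repeated pool startup overhead.

```python
import multiprocessing as mp

def _erode_plane(plane):
    # Stand-in for the real per-plane erosion work
    return plane

def erode_labels(planes, mp_enabled=True):
    """Hypothetical erosion driver: use a Pool only when mp_enabled is True;
    otherwise process serially so callers invoking this once per plane do
    not pay the cost of starting multiprocessing each time."""
    if mp_enabled:
        with mp.Pool() as pool:
            return pool.map(_erode_plane, planes)
    return [_erode_plane(p) for p in planes]

if __name__ == "__main__":
    print(erode_labels([1, 2, 3], mp_enabled=False))
```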
- Add release notes
- Fix docstring indentation for measuring label metrics
Multiprocessing in MagellanMapper has assumed the forked start method, where objects present before the fork are available to each child process. Python on Windows only supports the spawn start method, however, where objects need to be shared through other means such as pickling or shared memory arrays. PR #59 introduced support for Windows in the labels erosion to markers task through pickling, but pickling can be slow enough to obviate the performance improvements of multiprocessing in other tasks such as converting labels to edges.
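A small illustration of the difference, not taken from the PR and runnable only on a platform that supports both start methods: under fork the child inherits globals assigned at runtime, while under spawn the module is re-imported in the worker and the assignment inside the `__main__` block never runs there.

```python
import multiprocessing as mp

# Module-level state assigned at runtime: visible to forked children, but a
# spawned child re-imports this module, so only the default value below is
# present in the worker process.
labels_img = None

def check_state(_):
    return labels_img is not None

if __name__ == "__main__":
    labels_img = object()  # set only in the parent process
    for method in ("fork", "spawn"):
        ctx = mp.get_context(method)
        with ctx.Pool(1) as pool:
            print(method, pool.map(check_state, [0]))
    # fork -> [True]; spawn -> [False]: spawned workers need pickling or
    # shared memory arrays to receive large objects such as a labels image
```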
This PR uses shared memory arrays instead of pickling to share large objects. A new class handles conversion to and from the shared array for NumPy arrays, which the classes used for multiprocessing can extend to convert their arrays if not present through forking. Lateral extension now uses this shared array instead of pickling. Performance is similar, however, presumably because the multiprocessing needs to be restarted multiple times, once for each plane. The same function used during the `merge_atlas_stats` erosion is at least slightly faster when using shared arrays, presumably because multiprocessing and shared array setup is only done once for the full volume.

Using this class, support for spawned multiprocessing (i.e., Windows, #58) is added to the `vol_stats` task, and the `fewerstats` atlas profile no longer needs to be used for the `make_edge_images` and `merge_atlas_stats` tasks.