Don't initialize mp_context on import #6580

Merged: 2 commits into dask:main from wence-:wence/feature/mp-get-context-singleton, Jun 16, 2022

Conversation

@wence- (Contributor) commented Jun 15, 2022

  • Tests added / passed
  • Passes pre-commit run --all-files

The fork and spawn methods offered by multiprocessing interact badly with many InfiniBand (or other high-performance) interconnects, which don't support fork after the relevant low-level networking library has been initialised (or else crash in strange ways). Multiprocessing's forkserver method offers a way around this: as long as the fork server is started (via multiprocessing.forkserver.ensure_running()) before the networking library is initialised, things work, because the child process that does the forking never needs the network library.
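
To make the ordering constraint concrete, here is a minimal sketch; the networking initialisation at the end is a hypothetical placeholder for whatever fork-hostile library is in use:

    import multiprocessing.forkserver

    # Start the fork server first. It is forked from the current,
    # still network-free process, so the children it later forks never
    # inherit any network library state.
    multiprocessing.forkserver.ensure_running()

    # Only now initialise the fork-hostile networking stack, e.g.
    # (hypothetically) UCX via ucx-py:
    # import ucp; ucp.init()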

While one can control the method distributed uses by setting DASK_DISTRIBUTED__WORKER__MULTIPROCESSING_METHOD=forkserver in the environment, it would be nice to be able to control it programmatically as well (via the "distributed.worker.multiprocessing-method" key). This PR enables that by deferring context creation to function call time (at which point the distributed config can be inspected, and may already have been modified) rather than module import time. Since multiprocessing contexts are singleton objects, this should not incur a performance hit.
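
The lazy pattern is roughly the following sketch (the function name is illustrative, not the actual helper in distributed):

    import multiprocessing

    import dask

    def get_mp_context():
        # Read the config key at call time rather than import time, so a
        # programmatic dask.config.set() made after import still applies.
        method = dask.config.get("distributed.worker.multiprocessing-method")
        # multiprocessing contexts are singletons, so repeated lookups
        # return the same cheap object.
        return multiprocessing.get_context(method)

    # This now works without environment variables or import-order tricks:
    dask.config.set({"distributed.worker.multiprocessing-method": "forkserver"})
    ctx = get_mp_context()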

Query: dask also provides a get_context method in dask.multiprocessing that respects a different config option ("multiprocessing.context"). The only difference is that the distributed version of the function adds a bunch of modules to the preload list for the forkserver case. Should I refactor to use the dask.multiprocessing version and handle preload there?
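
For reference, the preload mechanism in question is the standard multiprocessing hint that asks the fork server to pre-import modules once, so forked workers inherit them already loaded. A sketch, with the module list shortened to just distributed:

    import multiprocessing

    ctx = multiprocessing.get_context("forkserver")
    # Ask the fork server to import these modules up front; every
    # subsequently forked worker then inherits them pre-imported.
    ctx.set_forkserver_preload(["distributed"])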

Commit: Allows controlling the context type through dask.config
programmatically, rather than needing to use environment variables or
carefully control the import order.
@GPUtester (Collaborator) commented

Can one of the admins verify this patch?

@fjetter (Member) commented Jun 15, 2022

Thanks for your patch! Just skimming the changes, this looks good to me. I'm not sure there was any deeper reason for setting this up at module import time; I guess having it cached should be sufficient.

I noticed you're having issues with our code linting (pre-commit hooks). See https://docs.dask.org/en/stable/develop.html#code-formatting for guidance on how to set this up.

> Query: dask also provides a get_context method in dask.multiprocessing that respects a different config option ("multiprocessing.context"). The only difference is that the distributed version of the function adds a bunch of modules to the preload list for the forkserver case. Should I refactor to use the dask.multiprocessing version and handle preload there?

cc @jrbourbeau do you have any context about the dask.multiprocessing.get_context method?

@wence- force-pushed the wence/feature/mp-get-context-singleton branch from e1270fc to 34f1bf0 on June 15, 2022 at 14:00
@wence- (Contributor, Author) commented Jun 15, 2022

> I noticed you're having issues with our code linting (pre-commit hooks). See https://docs.dask.org/en/stable/develop.html#code-formatting for guidance on how to set this up.

Sorry, I'd run pre-commit install but some PEBKAC stopped it firing; fixed now.

@pentschev (Member) commented

add to allowlist

@github-actions (bot) commented

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    15 files (±0)    15 suites (±0)    duration: 6h 20m 21s (-46m 44s)

    2,869 tests (+1):    2,786 passed (+29),    80 skipped (±0),    1 failed (-27),    2 errored (-1)
    21,254 runs (+9):    20,310 passed (+38),    941 skipped (+2),    1 failed (-30),    2 errored (-1)

For more details on these failures and errors, see this check.

Results for commit 34f1bf0. ± Comparison against base commit cb88e3b.

@fjetter (Member) commented Jun 16, 2022

> Should I refactor to use the dask.multiprocessing version and handle preload there?

I had another look at it. It may be a good idea to refactor all of this, but the two config options have slightly different meanings.
The dask.multiprocessing.get_context function is intended for the dask process backend without a distributed scheduler, e.g. when running dask.compute(graph, scheduler='processes'). The option touched here, by contrast, controls how we start distributed cluster worker processes.
I think if we started to mix these parameters it would cause breaking changes for users, and I don't see a huge benefit right now.

I'm not sure the preload is worth it in dask. That optimization was mostly added for test runtime, as noted in this comment, and most dask tests should be using a threaded backend. Even distributed now uses spawn as its default method (#3374).
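
To make the two knobs concrete, a sketch of where each one applies (usage is illustrative):

    import dask
    import dask.multiprocessing

    # Local "processes" scheduler, no distributed scheduler involved:
    with dask.config.set({"multiprocessing.context": "forkserver"}):
        ctx = dask.multiprocessing.get_context()

    # Worker processes of a distributed cluster; this is the key whose
    # reading this PR defers until the context is actually needed:
    dask.config.set({"distributed.worker.multiprocessing-method": "spawn"})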

@fjetter (Member) commented Jun 16, 2022

Thank you @wence-

@fjetter merged commit 29dae02 into dask:main on Jun 16, 2022
@wence- deleted the wence/feature/mp-get-context-singleton branch on June 16, 2022 at 10:53
@wence- (Contributor, Author) commented Jun 16, 2022

Thanks!

rapids-bot pushed a commit to rapidsai/dask-cuda referencing this pull request on Jun 27, 2022:
Allows selection of the method multiprocessing uses to start child
processes. Additionally, in the forkserver case, ensures the fork
server is up and running before any computation happens.

Potentially fixes #930. Needs dask/distributed#6580.

cc: @pentschev, @quasiben

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #933