
Switch to the new dask CLI #981

Merged: 20 commits, merged Jan 19, 2023

Conversation

jacobtomlinson (Member) commented Aug 24, 2022

In dask/dask#9283 we are adding a new top level dask CLI command which can be extended by other modules using entry points. A primary motivation here is to improve discoverability by uniting everything under one tool and allowing folks to run dask --help and dask <subcommand> --help to learn more about the various tools.

This PR adds a new click group called cuda and moves the dask-cuda-worker command under that group with the name worker. This means the dask-cuda-worker becomes dask cuda worker in the new CLI tool.

I haven't made any changes to the existing dask-cuda-worker console script, so it will continue to work, but maybe we should add a deprecation warning to it?

I went with this name rather than dask cuda-worker because I think it is more readable and also leaves us open to adding more subcommands in the future without cluttering up the top-level dask namespace.

$ dask --help             
Usage: dask [OPTIONS] COMMAND [ARGS]...

  Dask command line interface.

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  cluster    Manage dask clusters.
  cuda       GPU subcommands.
  docs       Open Dask documentation (https://docs.dask.org/) in a web browser.
  info       Information about your dask installation.
  scheduler  Launch a distributed scheduler.
  ssh        Launch a distributed cluster over SSH.
  worker     Launch a distributed worker attached to an existing SCHEDULER.
$ dask cuda --help        
Usage: dask cuda [OPTIONS] COMMAND [ARGS]...

  GPU subcommands.

Options:
  -h, --help  Show this message and exit.

Commands:
  worker  Launch a distributed worker with GPUs attached to an existing SCHEDULER.
$ dask cuda worker --help  
Usage: dask cuda worker [OPTIONS] [SCHEDULER] [PRELOAD_ARGV]...

  Launch a distributed worker with GPUs attached to an existing SCHEDULER.

  See https://docs.rapids.ai/api/dask-cuda/stable/quickstart.html#dask-cuda-worker for
  info.

Options:
  --host TEXT                     IP address of serving host; should be visible to the
                                  scheduler and other workers. Can be a string (like
                                  ``"127.0.0.1"``) or ``None`` to fall back on the
                                  address of the interface specified by
                                  ``--interface`` or the default interface.
  --nthreads INTEGER              Number of threads to be used for each Dask worker
                                  process.  [default: 1]
...
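
For context, here is a minimal sketch of how a subcommand group can be hooked into the new dask CLI. The command bodies and the entry-point wiring below are illustrative assumptions (entry-point group name assumed to be "dask_cli"), not the exact dask-cuda implementation:

```python
# Illustrative sketch (not the actual dask-cuda code): a click group named
# "cuda" with a "worker" subcommand, exposed to the top-level `dask` command
# via the entry-point mechanism added in dask/dask#9283.
import click


@click.group(name="cuda")
def cuda():
    """GPU subcommands."""


@cuda.command(name="worker")
@click.argument("scheduler", required=False)
def worker(scheduler):
    """Launch a distributed worker with GPUs attached to an existing SCHEDULER."""
    ...  # delegate to the existing dask-cuda worker startup code


# The group would then be registered in the package metadata, e.g. in setup.py:
#
# setup(
#     ...,
#     entry_points={"dask_cli": ["cuda = dask_cuda.cli:cuda"]},
# )
```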

The CLI PR needs to be merged and released before this can be merged.

Fixes #1038

github-actions bot added the python (python code needed) label Aug 24, 2022
jacobtomlinson added the 5 - DO NOT MERGE (Hold off on merging; see PR for details), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) labels Aug 24, 2022
codecov-commenter commented Aug 24, 2022

Codecov Report

Base: 0.00% // Head: 87.93% // Increases project coverage by +87.93% 🎉

Coverage data is based on head (f91ff34) compared to base (8c87288).
Patch coverage: 87.28% of modified lines in pull request are covered.

Additional details and impacted files
@@                Coverage Diff                @@
##           branch-23.02     #981       +/-   ##
=================================================
+ Coverage          0.00%   87.93%   +87.93%     
=================================================
  Files                26       17        -9     
  Lines              3439     2295     -1144     
=================================================
+ Hits                  0     2018     +2018     
+ Misses             3439      277     -3162     
Impacted Files Coverage Δ
dask_cuda/initialize.py 87.23% <ø> (+87.23%) ⬆️
dask_cuda/cli.py 87.37% <75.00%> (ø)
dask_cuda/utils.py 83.43% <82.35%> (+83.43%) ⬆️
dask_cuda/explicit_comms/dataframe/shuffle.py 98.65% <95.91%> (+98.65%) ⬆️
dask_cuda/cuda_worker.py 77.92% <100.00%> (+77.92%) ⬆️
dask_cuda/explicit_comms/comms.py 99.05% <100.00%> (+99.05%) ⬆️
dask_cuda/get_device_memory_objects.py 92.85% <100.00%> (+92.85%) ⬆️
dask_cuda/local_cuda_cluster.py 89.01% <100.00%> (+89.01%) ⬆️
dask_cuda/proxify_host_file.py 93.71% <100.00%> (+93.71%) ⬆️
dask_cuda/benchmarks/common.py
... and 24 more


pentschev (Member) commented

The idea seems nice, but this is a very intricate component in many scripts and I imagine this will be annoying to abruptly change. Would we be able to keep dask-cuda-worker for some time with a DeprecationWarning added to it?

wence- (Contributor) commented Aug 25, 2022

> The idea seems nice, but this is a very intricate component in many scripts and I imagine this will be annoying to abruptly change. Would we be able to keep dask-cuda-worker for some time with a DeprecationWarning added to it?

AIUI, the old entry-point continues to exist and will work. I agree that a deprecation warning is a good idea, but not sure exactly how to hook that up.

pentschev (Member) commented

> AIUI, the old entry-point continues to exist and will work.

Yes, you're right. That was my bad, I didn't pay a lot of attention to the actual changes.

> I agree that a deprecation warning is a good idea, but not sure exactly how to hook that up.

I think we can simply check if dask_cuda_worker.py is running without the cuda argument and then raise the warning.
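
As a rough sketch of that idea (the module layout and function name in dask-cuda are assumptions here), the legacy console-script entry point could simply emit a warning before falling through to the shared implementation:

```python
# Hypothetical sketch: warn when the old `dask-cuda-worker` console script is
# used, then continue on the same code path as `dask cuda worker`.
import warnings


def legacy_worker_main():
    """Entry point assumed to back the old `dask-cuda-worker` console script."""
    warnings.warn(
        "`dask-cuda-worker` is deprecated; please use `dask cuda worker` instead.",
        FutureWarning,
    )
    # ...then invoke the same worker startup logic used by `dask cuda worker`.
```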

github-actions bot commented

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

jacobtomlinson (Member, Author) commented

This is still blocked on the upstream PR.

jacobtomlinson removed the 5 - DO NOT MERGE (Hold off on merging; see PR for details) label Nov 9, 2022
jacobtomlinson marked this pull request as ready for review November 9, 2022 11:30
jacobtomlinson requested a review from a team as a code owner November 9, 2022 11:30
jacobtomlinson (Member, Author) commented

The new CLI has been merged and released so we should be good to merge this now.

I've bumped the minimum version of dask and distributed to one with the CLI, although I'm not sure this is actually necessary, as folks can still use dask-cuda-worker and the new dask cuda CLI namespace is opt-in.

pentschev changed the base branch from branch-22.10 to branch-22.12 November 9, 2022 12:27
pentschev (Member) left a comment

Overall LGTM, thanks @jacobtomlinson . I now wonder, could we have a dask cuda scheduler without increasing code duplication too much now? I think this could be useful to have some default overrides, more specifically DASK_DISTRIBUTED__COMM__UCX__CREATE_CUDA_CONTEXT=True.
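
To make the idea concrete, a heavily hedged sketch of what such a command could look like (nothing below exists in this PR; the command name, config-key spelling, and hand-off are assumptions mirroring the environment variable above):

```python
# Hypothetical sketch of a `dask cuda scheduler` command that applies a
# CUDA-oriented default before starting an ordinary distributed scheduler.
import click
import dask.config


@click.command(name="scheduler")
def scheduler():
    """Launch a distributed scheduler with UCX/CUDA-friendly defaults."""
    # Equivalent of setting DASK_DISTRIBUTED__COMM__UCX__CREATE_CUDA_CONTEXT=True
    # in the environment.
    dask.config.set({"distributed.comm.ucx.create-cuda-context": True})
    ...  # then start the scheduler exactly as `dask scheduler` would
```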

requirements.txt: review comment (outdated, resolved)
setup.py: review comment (outdated, resolved)
jakirkham (Member) commented Nov 11, 2022

It would be good to add the entry point to the Conda recipe as well (as is done in the feedstock).

github-actions bot added the conda (conda issue) label Jan 13, 2023
pentschev (Member) left a comment

Thanks @charlesbluca for picking this up; I've left a few suggestions/questions.

Comment on lines +430 to +438
@click.option(
"--get-cluster-configuration",
"get_cluster_conf",
default=False,
is_flag=True,
required=False,
show_default=True,
help="""Print a table of the current cluster configuration""",
)
A member commented:

I guess this is the only argument and without it nothing will happen. Does it make more sense to remove and make the default action (without any non-mandatory arguments) print the configuration?

A member replied:

Yeah, I was thinking that as well; I just wasn't sure if it made more sense to do it in a separate PR, since the diff here is fairly large already.

(cc @quasiben in case there's something non-obvious this command does when --get-cluster-configuration isn't passed as an option)

dask_cuda/cli.py: review comment (outdated, resolved)
dask_cuda/cli.py: review comment (outdated, resolved)
ajschmidt8 (Member) commented Jan 13, 2023

I canceled the latest GH Actions run because it looks like the Python tests were in progress for nearly 4 hours, which is dramatically longer than the usual ~30 minutes they take.

The logs are below; it looks like there were some errors throughout the tests.

To prevent this from happening again, I will open a PR to put a reasonable timeout on the Python tests.

This also happened on #1081.

ajschmidt8 added a commit to ajschmidt8/dask-cuda that referenced this pull request Jan 13, 2023
There were two instances recently (below) where some Python test errors caused the `conda-python-tests` job to run/hang for ~4 hours.

- rapidsai#981 (comment)
- rapidsai#1081 (comment)

To prevent this from happening again in the future, I've added a reasonable timeout of 45 minutes to that particular job.

The job usually takes ~25 minutes to complete, so 45 minutes should be plenty.

This timeout will help prevent jobs from hanging and thus help preserve our finite GPU capacity for CI (particularly for `arm` nodes).
Comment on lines 45 to 50
{% for e in data.get("project", {}).get("scripts", {}).keys() %}
- {{ e }} --help
{% endfor %}
- dask cuda --help
- dask cuda worker --help
- dask cuda config --help
A member commented:

Actually, we might be able to do this to consolidate them further:

Suggested change, replacing:

{% for e in data.get("project", {}).get("scripts", {}).keys() %}
- {{ e }} --help
{% endfor %}
- dask cuda --help
- dask cuda worker --help
- dask cuda config --help

with:

- dask cuda --help
{% for e in data.get("project", {}).get("scripts", {}).keys() %}
- {{ e }} --help
- {{ e|replace("-", " ") }} --help
{% endfor %}

A member replied:

Good catch! Yeah, for now we can do this. I'm interested in whether we intend to deprecate and eventually remove the old entry points, which would force us to hardcode this at some point in the future (cc @jacobtomlinson if you've had any discussion around this).

A member replied:

Yeah, that's fair. For that reason I wasn't sure whether we would want to do this or not.

ajschmidt8 added a commit that referenced this pull request Jan 13, 2023
There were two instances recently (below) where some Python test errors caused the `conda-python-tests` job to run/hang for ~4 hours.

- #981 (comment)
- #1081 (comment)

To prevent this from happening again in the future, I've added a reasonable timeout of ~~45 minutes to that particular job~~ 30 minutes to the `pytest` command.

The job usually takes ~25 minutes to complete entirely, so 30 minutes just for `pytest` should be plenty.

This timeout will help prevent jobs from hanging and thus help preserve our finite GPU capacity for CI (particularly for `arm` nodes).

Authors:
   - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
   - Jake Awe (https://github.com/AyodeAwe)
rapids-bot bot pushed a commit that referenced this pull request Jan 17, 2023
…port` (#1085)

Changed this because IIUC `pkg_resources.working_set` is listing the installed distributions and not necessarily the importable modules; this becomes an issue if the distribution and module names aren't the same (e.g. one would `conda install pillow` and then `import PIL`), which was causing some failures in CI that seem unrelated to the changes here.

Originally posted by @charlesbluca in #981 (comment)

Authors:
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1085
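
For readers less familiar with the distinction being drawn above, a small illustration (the package name is only an example) of why listing installed distributions is not the same as checking importability:

```python
# pkg_resources.working_set lists installed *distributions*, while importlib
# checks for importable *modules*; Pillow vs. PIL is the classic mismatch.
import importlib.util

import pkg_resources

installed = {dist.key for dist in pkg_resources.working_set}  # normalized distribution names
print("pillow" in installed)  # True when the Pillow distribution is installed
print("pil" in installed)     # False: "PIL" is the module name, not the distribution name

print(importlib.util.find_spec("PIL") is not None)  # True when the PIL module is importable
```
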
ajschmidt8 (Member) left a comment

Approving ops-codeowner file changes

jakirkham requested a review from pentschev January 19, 2023 00:49
pentschev (Member) left a comment

LGTM, thanks @jacobtomlinson and @charlesbluca for working on this!

pentschev (Member) commented

/merge

jakirkham (Member) commented

I've noted the corresponding changes for the conda-forge recipe in issue conda-forge/dask-cuda-feedstock#17. Please feel free to note anything there I may have missed.

rapids-bot bot pushed a commit that referenced this pull request Jan 19, 2023
…dask cuda config` (#1088)

As @pentschev brought up in #981 (comment), we shouldn't need the `--get-cluster-configuration` option for `dask cuda config` since it only enables/disables printing the cluster configuration.

Also added a check to ensure that a scheduler address or scheduler file has been specified, as otherwise IIUC running `dask cuda config` would just end up starting up and querying a local cluster on CPU.

EDIT:

Modified the scheduler check for `dask cuda worker` as well, since it seems like a general improvement.

Authors:
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1088
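
A minimal sketch of the kind of guard described in that commit, with hypothetical parameter and function names (the real option handling in dask_cuda/cli.py may differ):

```python
# Hypothetical sketch: refuse to run when no scheduler was specified, rather
# than silently spinning up and querying a fresh local (CPU) cluster.
import click


def require_scheduler(scheduler, scheduler_file):
    """Raise a CLI error unless a scheduler address or scheduler file was given."""
    if scheduler is None and scheduler_file is None:
        raise click.UsageError(
            "A scheduler address or scheduler file must be specified."
        )
```
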
jacobtomlinson deleted the dask-cli branch February 13, 2023 14:43
Labels: conda (conda issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), python (python code needed)

Projects: Archived in project

Development: Successfully merging this pull request may close these issues: Rename dask-config CLI for clarity

8 participants