[RELEASE] dask-cuda v22.08 #969

Merged
merged 30 commits into from
Aug 17, 2022

Conversation

GPUtester
Contributor

❄️ Code freeze for branch-22.08 and v22.08 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-22.08 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-22.08 into main for the release

raydouglass and others added 29 commits May 19, 2022 11:05
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
[gpuCI] Forward-merge branch-22.06 to branch-22.08 [skip gpuci]
With changes from dask/distributed#6231, the `loop` fixture now depends on the `cleanup` fixture, which must be imported explicitly.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #924
With the PR below merged, we no longer set the `CXX`, `CC`, or `CUDAHOSTCXX` variables in any of our CI images. This PR cleans up some references to them.


- rapidsai/gpuci-build-environment#265

Authors:
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - Sevag Hanssian (https://github.com/sevagh)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #929
With the `click` breakage in Distributed presumably resolved, we should be good removing this click pinning.

Closes #931

Authors:
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #932
`loop` kwarg is now deprecated in `Nanny`, removing.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #934
Changes to be in line with: rapidsai/cudf#11058

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #927
Needed for Sphinx 5 compatibility. Should fix the following warning occurring in doc builds.

```
Invalid configuration value found: 'language = None'. Update your configuration to a valid language code. Falling back to 'en' (English).
```
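
A minimal sketch of the kind of `conf.py` change this implies, assuming the docs configuration previously set `language = None` (the file path in the comment is an assumption, and the actual diff is not shown here):

```python
# docs/source/conf.py (assumed path; excerpt only)
# Sphinx 5 expects a language code string here rather than None.
language = "en"
```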

xref: sphinx-doc/sphinx#10481

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - https://github.com/jakirkham

URL: #939
Two new utility functions are added to print benchmark results. They avoid the need to rearrange individual white spaces or separator lengths any time a new, longer row is added, and they also remove the need to count spaces to account for code indentation.
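
As an illustration only, helpers of this kind might look like the sketch below. The function names, default widths, and the ` | ` column separator are assumptions made for the example, not necessarily what the PR adds.

```python
def format_key_value(key, value, key_length=25):
    """Pad `key` to a fixed column width so that all values line up."""
    return f"{key: <{key_length}} | {value}"


def print_key_value(key, value, key_length=25):
    """Print one aligned `key | value` row."""
    print(format_key_value(key, value, key_length=key_length))


def print_separator(separator="-", length=80):
    """Print a horizontal rule without hand-counting characters."""
    print(separator * length)


print_separator(separator="=", length=40)
print_key_value("Backend", "dask", key_length=10)
print_key_value("Rows", 1_000_000, key_length=10)
print_separator(separator="=", length=40)
```

With the width passed as a parameter, adding a longer key means changing one number rather than re-aligning every row by hand.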

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Benjamin Zaitlen (https://github.com/quasiben)

URL: #937
The existing throughput figure only gives an overview of the total data processed for the complete workflow, with no insight into communication, which the new bandwidth value now provides. Additionally, the common peer-to-peer bandwidth computation was moved into a utility function.
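
To make the distinction concrete, here is a hedged sketch of a per-pair bandwidth aggregation. The record layout `(source, destination, nbytes, seconds)` and the function name are hypothetical, and this is not necessarily how dask-cuda computes its reported bandwidth.

```python
from collections import defaultdict


def aggregate_pair_bandwidth(transfers):
    """Return bytes/second for each (source, destination) pair.

    `transfers` is an iterable of (source, destination, nbytes, seconds)
    tuples; this layout is an assumption made for the example.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for src, dst, nbytes, seconds in transfers:
        totals[(src, dst)][0] += nbytes
        totals[(src, dst)][1] += seconds
    # Skip pairs with no recorded transfer time to avoid dividing by zero.
    return {
        pair: nbytes / seconds
        for pair, (nbytes, seconds) in totals.items()
        if seconds > 0
    }


example = [
    ("worker-0", "worker-1", 100, 1.0),
    ("worker-0", "worker-1", 300, 1.0),
    ("worker-1", "worker-0", 50, 0.5),
]
bandwidths = aggregate_pair_bandwidth(example)
```

Unlike workflow throughput, which divides total data by wall-clock time, this only counts time spent moving data between peers.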

Depends on #937

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Benjamin Zaitlen (https://github.com/quasiben)

URL: #938
Allows selection of the method multiprocessing uses to start child
processes. Additionally, in the forkserver case, ensure the fork
server is up and running before any computation happens.

Potentially fixes #930. Needs dask/distributed#6580.

cc: @pentschev, @quasiben

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #933
…eduler-file` (#940)

This is a move towards using the benchmarks for regular profiling on more than one node.

That requires two substantive changes:

1. Refactor benchmark running and data processing to bring a client up from an external source (here a `--scheduler-file`, `dask-mpi` could be used I think, but I haven't done so).
2. As well as producing human-readable output, produce data that can be consumed by downstream scripts

I've refactored the benchmarks into common infrastructure, which simplifies new benchmark creation.
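
Point 2 above could look something like the following sketch, which prints a human-readable line and also returns a JSON record. The field names and function names are assumptions for illustration, not the format the benchmarks actually emit.

```python
import json


def summarize(timings, config):
    """Build a plain-dict summary of one benchmark run."""
    return {
        "config": config,
        "runs": len(timings),
        "mean_seconds": sum(timings) / len(timings),
        "min_seconds": min(timings),
    }


def report(timings, config):
    """Print a readable summary and return the same data as JSON."""
    summary = summarize(timings, config)
    # Human-readable output for the terminal.
    print(f"{summary['runs']} runs, mean {summary['mean_seconds']:.3f} s")
    # Machine-readable record that downstream scripts can parse.
    return json.dumps(summary, sort_keys=True)


record = report([1.0, 3.0], {"benchmark": "example", "n_workers": 2})
```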

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #940
Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #941
Use a regular `dict` when creating a `LocalCUDACluster` with no host and no device memory limit.

Currently, setting `device_memory_limit=None` translates into the total available GPU memory. However, in some cases `DeviceHostFile` overestimates the GPU memory usage, which can trigger spilling even though `device_memory_limit=None`.
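
The selection logic described here can be sketched as follows. The function name and the `spilling_store` parameter are hypothetical stand-ins, and the real constructor arguments are not reproduced.

```python
def select_data_store(device_memory_limit, memory_limit, spilling_store):
    """Pick the mapping type a worker uses to hold its data.

    With no device limit and no host limit there is nothing to spill to,
    so a regular dict is enough and no usage estimate can trigger spilling.
    """
    if device_memory_limit is None and memory_limit is None:
        return dict
    # Otherwise fall back to the spilling-aware store supplied by the caller.
    return spilling_store


class FakeSpillingStore(dict):
    """Placeholder for a spilling-aware mapping, used only in this example."""


unlimited = select_data_store(None, None, FakeSpillingStore)
limited = select_data_store("10GB", None, FakeSpillingStore)
```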

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #943
I realise I missed this property when doing some data analysis

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #945
Make the `jit_unspill` property a local as discussed in #944.

In addition, fix a bug that was introduced in `dask-cuda-worker` in #944: the `data` argument has to be a callable, and should presumably return an object that is unique for each nanny.
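
A small standalone sketch of why a callable matters here. `FakeNanny` is a stand-in written for this example and is not `distributed.Nanny`.

```python
class FakeNanny:
    """Stand-in that builds its own data store from a factory callable."""

    def __init__(self, data):
        if not callable(data):
            raise TypeError("data must be a callable returning a mapping")
        # Each nanny calls the factory itself, so stores are never shared.
        self.data = data()


nannies = [FakeNanny(data=dict) for _ in range(2)]
nannies[0].data["x"] = 1
```

Passing an already-constructed mapping instead of a factory would hand the same object to every nanny.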

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #944
This introduces a test that was a TODO in #944

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #946
The object was removed in rapidsai/cudf#11210.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #947
The `GIT_DESCRIBE_TAG` and `VERSION_SUFFIX` environment variables are used
to control the name and version of the created conda/pypi package.
They should, however, not be used to control the version of the
installed package by overriding the versioneer cmdclass since that
leaves an unmodified _version.py file in the installed package
directory. A consequence is that the version reported by
`dask_cuda.__version__` is `"0+unknown"`. 

We cannot always use the versioneer-provided cmdclass unmodified since
PEP440 specifically forbids PyPI from accepting packages that have local
version identifiers (as used by versioneer). To get around this, when setup.py
detects it is building a PyPI package (`GIT_DESCRIBE_TAG` is in the environment),
patch the version returned from versioneer with a PyPI-compatible one.
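
For illustration, one simple way to turn a version with a local identifier into a public one is to drop everything from the `+` onward. This is a sketch of the PEP 440 constraint, not the patch that `setup.py` actually applies, and the function name is hypothetical.

```python
def public_version(version):
    """Drop the PEP 440 local version segment (the part from '+' onward)."""
    public, _, _local = version.partition("+")
    return public


patched = public_version("22.8.0+3.gabc1234")
```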

While we're here, bring the conda version string into line with the
rest of the rapids ecosystem.

Closes #336.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #948
Fixes #959

If another object (like a `cupy.ndarray`) doesn't support `__matmul__` with `ProxyObject`, Python will try to fall back to `__rmatmul__`. As `__rmatmul__` was not defined before (and some cases, like with CuPy, now raise), this would raise. To fix that, this PR defines `__rmatmul__` for `ProxyObject`s to provide this fallback layer.
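
The fallback mechanism can be shown with two toy classes. `Strict` and `Proxy` are stand-ins written for this example; they are not CuPy's or dask-cuda's classes.

```python
class Strict:
    """Stand-in for an array type that rejects operands it does not know."""

    def __matmul__(self, other):
        if not isinstance(other, Strict):
            # Tells Python to try the right operand's __rmatmul__ next.
            return NotImplemented
        return "Strict @ Strict"


class Proxy:
    """Stand-in for a proxy that wraps another object."""

    def __init__(self, obj):
        self._obj = obj

    def __matmul__(self, other):
        return self._obj @ other

    def __rmatmul__(self, other):
        # Reached for `other @ proxy` when other's __matmul__ gives up.
        return other @ self._obj


result = Strict() @ Proxy(Strict())
```

Without `Proxy.__rmatmul__`, the last line would raise `TypeError` because neither operand would handle the operation.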

Authors:
  - https://github.com/jakirkham

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #960
This PR fixes the following errors in pytest with the latest `distributed`:
```python
  @pytest.mark.parametrize("delayed", [True, False])
  def test_basic(loop, delayed):  # noqa: F811
file /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/distributed/utils_test.py, line 145
  @pytest.fixture
  def loop(loop_in_thread):
E       fixture 'loop_in_thread' not found
>       available fixtures: benchmark, benchmark_weave, cache, capfd, capfdbinary, caplog, capsys, capsysbinary, cleanup, current_cases, doctest_namespace, loop, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, testrun_uid, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory, worker_id
>       use 'pytest --fixtures [testpath]' for help on them.

/nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/distributed/utils_test.py:145
```

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #963
This PR pins dask & distributed to 2022.7.1 for the 22.08 release.

xref: rapidsai/cudf#11433

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #965
This PR switches docs to use the custom common `js` & `css` code merged here: rapidsai/docs#286

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #967
@GPUtester GPUtester requested a review from a team as a code owner August 5, 2022 14:45
@codecov-commenter

codecov-commenter commented Aug 5, 2022

Codecov Report

Merging #969 (9f46a1a) into main (d400ad1) will not change coverage.
The diff coverage is 0.00%.

@@          Coverage Diff          @@
##            main    #969   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         22      23    +1     
  Lines       3075    3099   +24     
=====================================
- Misses      3075    3099   +24     
| Impacted Files | Coverage Δ |
|---|---|
| dask_cuda/benchmarks/common.py | 0.00% <0.00%> (ø) |
| dask_cuda/benchmarks/local_cudf_merge.py | 0.00% <0.00%> (ø) |
| dask_cuda/benchmarks/local_cudf_shuffle.py | 0.00% <0.00%> (ø) |
| dask_cuda/benchmarks/local_cupy.py | 0.00% <0.00%> (ø) |
| dask_cuda/benchmarks/local_cupy_map_overlap.py | 0.00% <0.00%> (ø) |
| dask_cuda/benchmarks/utils.py | 0.00% <0.00%> (ø) |
| dask_cuda/cli/dask_cuda_worker.py | 0.00% <0.00%> (ø) |
| dask_cuda/cuda_worker.py | 0.00% <0.00%> (ø) |
| dask_cuda/get_device_memory_objects.py | 0.00% <0.00%> (ø) |
| dask_cuda/local_cuda_cluster.py | 0.00% <0.00%> (ø) |

... and 2 more


@github-actions github-actions bot added conda conda issue gpuCI gpuCI issue python python code needed labels Aug 5, 2022
@raydouglass raydouglass merged commit dab48ca into main Aug 17, 2022
10 participants