Support cuDF's built-in spilling #984

Merged
merged 8 commits into rapidsai:branch-22.12 from the cudf_spilling branch
Nov 22, 2022


madsbk
Member

@madsbk madsbk commented Aug 31, 2022

Support the new built-in spilling in cuDF so that device_memory_limit and memory_limit ignore cuDF's device buffers.

This is only implemented for DeviceHostFile. Since jit-unspill also targets cuDF, and libraries such as cupy aren't supported, I don't think it is important to support cuDF's built-in spilling in ProxifyHostFile.

For now, DeviceHostFile simply ignores cuDF's device buffers and lets cuDF handle the spilling. This means that DeviceHostFile might estimate the device and host memory usage incorrectly (or more incorrectly than usual).
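
The accounting trade-off described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `FakeBuffer`, its `spillable` flag, and the two helper functions are illustrative names, not dask-cuda or cuDF APIs. It only shows why skipping spillable buffers makes the tracked total diverge from what is actually resident on the device.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeBuffer:
    """Hypothetical stand-in for a device buffer (not a cuDF class)."""

    nbytes: int
    spillable: bool  # assumed flag: True if cuDF would handle spilling itself


def tracked_device_bytes(buffers: Iterable[FakeBuffer]) -> int:
    """Sum only the buffers the host file remains responsible for."""
    return sum(b.nbytes for b in buffers if not b.spillable)


def actual_device_bytes(buffers: Iterable[FakeBuffer]) -> int:
    """Sum every buffer, regardless of who is responsible for spilling it."""
    return sum(b.nbytes for b in buffers)


buffers = [
    FakeBuffer(nbytes=100, spillable=True),
    FakeBuffer(nbytes=50, spillable=False),
]
tracked = tracked_device_bytes(buffers)
actual = actual_device_bytes(buffers)
print(tracked, actual)  # prints: 50 150
```

In this toy case a limit check based on `tracked` sees 50 bytes while 150 bytes are in use, which is the kind of underestimate the description warns about.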

@madsbk madsbk added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Aug 31, 2022
@github-actions github-actions bot added the python (python code needed) label Aug 31, 2022
@codecov-commenter

codecov-commenter commented Aug 31, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.12@f11abe3).
Patch has no changes to coverable lines.

Additional details and impacted files
@@              Coverage Diff               @@
##             branch-22.12    #984   +/-   ##
==============================================
  Coverage                ?   0.00%           
==============================================
  Files                   ?      18           
  Lines                   ?    2265           
  Branches                ?       0           
==============================================
  Hits                    ?       0           
  Misses                  ?    2265           
  Partials                ?       0           


DeviceHostFile: ignore spillable objects
@github-actions github-actions bot added the conda (conda issue) and gpuCI (gpuCI issue) labels Nov 21, 2022
@madsbk madsbk changed the base branch from branch-22.10 to branch-22.12 November 21, 2022 11:09
@madsbk madsbk changed the title Support cuDF's native spilling Support cuDF's built-in spilling Nov 21, 2022
@github-actions github-actions bot removed the gpuCI (gpuCI issue) and conda (conda issue) labels Nov 21, 2022
@madsbk madsbk marked this pull request as ready for review November 21, 2022 15:19
@madsbk madsbk requested a review from a team as a code owner November 21, 2022 15:19
Member

@pentschev pentschev left a comment

Overall looks good to me, thanks for working on this @madsbk. I've added a few comments and mostly aesthetic changes.

dask_cuda/proxify_host_file.py (outdated, resolved)
Returns:
- True if cudf's internal spilling is enabled, or
- False if it is disabled, or
- None if the current version of cudf doesn't support spilling, or
Member

Does documenting "if the current version of cuDF doesn't support spilling" make sense? We're not backwards compatible, so is there any case where this would occur in practice?

Member Author

Since we don't depend on cudf, I guess people can have any version?

Member

This is theoretically right, but it isn't supported anyway. I guess it would make more sense for us to handle cuDF as an optional dependency; then we could define the minimum version and drop checks like this. We can address this in a follow-up PR.
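
A tri-state status check like the one documented in the docstring under discussion could be sketched as follows. This is an assumption-laden illustration, not the code from this PR: `builtin_spilling_status` is a hypothetical name, `get_global_manager` is assumed to be importable from `cudf.core.buffer.spill_manager` in cuDF versions that ship built-in spilling, and treating a failed import (including cuDF not being installed at all) as `None` is this sketch's own choice.

```python
from typing import Optional


def builtin_spilling_status() -> Optional[bool]:
    """Hypothetical helper: report whether cuDF's built-in spilling is on.

    Returns True or False when the spill manager module can be imported,
    and None when it cannot be imported.
    """
    try:
        # Assumption: this function exists in cuDF versions that ship
        # built-in spilling.
        from cudf.core.buffer.spill_manager import get_global_manager
    except ImportError:
        return None
    # Assumption: a non-None global manager means spilling is enabled.
    return get_global_manager() is not None


status = builtin_spilling_status()
print(status)
```

If cuDF became an optional dependency with a pinned minimum version, as suggested above, the `ImportError` branch would only need to cover cuDF being absent.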

import versioneer
from setuptools import setup
Member

Is this just linting that added this, or did you intend to make this change for some reason?

Member Author

It was linting. Don't we run CI linting on setup.py?

Member

I guess not; it seems like only the dask_cuda directory is checked by CI. I constantly see PRs making small linting changes here and there, and it is not exactly clear to me why that happens. How do you run linting: do you rely on pre-commit, or do you run it manually? I expect pre-commit to only execute on files that actually changed when it runs as part of git commit.

Member Author

I used pre-commit run --all-files. Maybe we should just change the CI checks to cover the whole repo?


pytest.importorskip(
"cudf.core.buffer.spill_manager",
reason="Current version of cudf doesn't support built-in spilling",
Member

Same here, will this ever happen in practice?
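
For readers unfamiliar with the pattern, the guard in question skips tests when an optional module is missing. The sketch below shows the same idea using only the standard library (`importlib` and `unittest`) rather than `pytest.importorskip`, so it is a swapped-in technique, not the PR's test code. `module_available`, `HAS_SPILL_MANAGER`, and `BuiltinSpillingTest` are hypothetical names.

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if the module `name` can be found.

    For dotted names, find_spec imports the parent packages first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package of a dotted name is missing.
        return False


HAS_SPILL_MANAGER = module_available("cudf.core.buffer.spill_manager")


@unittest.skipUnless(
    HAS_SPILL_MANAGER,
    "Current version of cudf doesn't support built-in spilling",
)
class BuiltinSpillingTest(unittest.TestCase):
    def test_placeholder(self):
        # Real assertions about spilling behaviour would go here.
        self.assertTrue(HAS_SPILL_MANAGER)
```

Whether such a guard is worth keeping depends on the point raised in the review: if only a single cuDF version is supported, the skip should never trigger in practice.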

dask_cuda/tests/test_cudf_builtin_spilling.py: six resolved review threads (four outdated)
@madsbk
Member Author

madsbk commented Nov 22, 2022

@pentschev, thanks for the review. It is ready for another round :)

Member

@pentschev pentschev left a comment

LGTM, thanks @madsbk!

@pentschev
Member

@gpucibot merge

@madsbk
Member Author

madsbk commented Nov 22, 2022

Thanks @pentschev!

@pentschev
Member

rerun tests

1 similar comment
@pentschev
Member

rerun tests

@rapids-bot rapids-bot bot merged commit 6a94f23 into rapidsai:branch-22.12 Nov 22, 2022
@madsbk madsbk deleted the cudf_spilling branch November 23, 2022 07:36
Labels
improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), python (python code needed)
3 participants