Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require explicit pool size in pool_memory_resource and move some things out of detail namespace #1417

Merged

Conversation

harrism
Copy link
Member

@harrism harrism commented Dec 20, 2023

Description

Fixes #1416.

  • Deprecates existing ctors of pool_memory_resource that provide optional parameter for the initial pool size.
  • Adds new ctors that require an explicit initial pool size.
  • We don't yet deprecate anything in this PR because that would break builds of some RAPIDS libraries. We will follow up with PRs to cuDF, cuGraph and anything else needed to remove deprecated usages after this PR is merged.
  • Adds a new utility fraction_of_available_device_memory that calculates the specified fraction of free memory on the current CUDA device. This is now used in tests to provide an explicit pool size and can be used to produce the previous behavior of pool_memory_resource for consumers of the library.
  • Moves available_device_memory from a detail header to cuda_device.hpp so it is now publicly usable, along with the above utility.
  • Temporarily adds detail::available_device_memory as an alias of the above in order to keep cudf and cugraph building until we can update them.
  • Duplicates commonly externally used alignment functions that are currently in rmm::detail to the public rmm namespace. The detail versions will be removed after cuDF and cuGraph are updated to not use them.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@harrism harrism requested a review from a team as a code owner December 20, 2023 01:18
@harrism harrism requested review from wence- and jrhemstad December 20, 2023 01:18
@github-actions github-actions bot added the cpp Pertains to C++ code label Dec 20, 2023
@harrism harrism added breaking Breaking change improvement Improvement / enhancement to an existing function and removed cpp Pertains to C++ code labels Dec 20, 2023
@harrism harrism self-assigned this Dec 20, 2023
@harrism harrism added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Dec 20, 2023
@github-actions github-actions bot added the cpp Pertains to C++ code label Dec 20, 2023
@harrism
Copy link
Member Author

harrism commented Dec 20, 2023

Discovering the cudf and cugraph depend on rmm::detail::available_device_memory so should probably add an alias for that rather than just moving it to rmm::available_device_memory. Then once the dependent libs are updated we can remove the detail version.

Copy link

copy-pr-bot bot commented Jan 9, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@harrism harrism removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 9, 2024
@harrism
Copy link
Member Author

harrism commented Jan 9, 2024

/ok to test

@harrism
Copy link
Member Author

harrism commented Jan 9, 2024

/ok to test

Copy link
Contributor

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me.

The one thing that I find interesting is that this removes usage of get_upstream()->get_mem_info(cuda_stream_legacy)

I am wondering whether we should deprecate that facility too

@harrism
Copy link
Member Author

harrism commented Jan 9, 2024

Yes we should. :) See #1388 and #1389.

@harrism
Copy link
Member Author

harrism commented Jan 11, 2024

/ok to test

Note that cross-linking to top-level namespace global variables or
functions (as in rmm/aligned.hpp) does not work without including a
namespace directive. But, I can't figure out how to make breathe play
nicely with that, so let's just do this for now.
@wence-
Copy link
Contributor

wence- commented Jan 11, 2024

/ok to test

When resolving the xref we must resolve relative to the _current_
document, not the document of the target we are trying to link
to.
@wence-
Copy link
Contributor

wence- commented Jan 11, 2024

/ok to test

@wence-
Copy link
Contributor

wence- commented Jan 11, 2024

Figured out the problem with the cross-linking to objects defined in the utilities group (cc @vyars to check the conf.py changes), so now the docs build with correct linking.

So from my point of view this is definitely good to go.

@vyasr
Copy link
Contributor

vyasr commented Jan 11, 2024

The doc fixes look correct to me, thanks @wence-!

@harrism
Copy link
Member Author

harrism commented Jan 15, 2024

/merge

@rapids-bot rapids-bot bot merged commit 64aa941 into rapidsai:branch-24.02 Jan 15, 2024
48 checks passed
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this pull request Jan 16, 2024
This PR fixes up cuSpatial to avoid usage that will soon be deprecated in RMM.

Depends on rapidsai/rmm#1417

Fixes #1318

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: #1319
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Jan 17, 2024
…ludes (#2088)

This PR fixes up RAFT to avoid usage that will soon be deprecated in RMM.

Depends on rapidsai/rmm#1417

Fixes #2087

Authors:
  - Mark Harris (https://github.com/harrism)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2088
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Jan 17, 2024
This PR fixes up cuDF to avoid usage that will soon be deprecated in RMM.

Depends on rapidsai/rmm#1417

Fixes #14658

Authors:
  - Mark Harris (https://github.com/harrism)
  - Yunsong Wang (https://github.com/PointKernel)
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #14741
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Jan 17, 2024
This PR fixes up cuGraph to avoid usage that will soon be deprecated in RMM.

Depends on rapidsai/rmm#1417

Fixes #4066

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #4086
rapids-bot bot pushed a commit that referenced this pull request Jan 17, 2024
…ilities, and optional pool_memory_resource initial size (#1424)

Follow-on to #1417, this PR deprecates the following:

 - `rmm::detail::available_device_memory` in favor of rmm::available_device_memory
 - `rmm::detail::is_aligned`, `rmm::detail::align_up` and related alignment utility functions in favor of the `rmm::` top level namespace versions.
 - The `rmm::pool_memory_resource` constructors that take an optional initial size parameter.

Should be merged after the following:
 - rapidsai/cugraph#4086
 - rapidsai/cudf#14741
 - rapidsai/raft#2088

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Michael Schellenberger Costa (https://github.com/miscco)
  - Rong Ou (https://github.com/rongou)

URL: #1424
rapids-bot bot pushed a commit that referenced this pull request Jan 18, 2024
…ool_memory_resource`. (#1392)

Depends on #1417

Adds a new `host_pinned_memory_resource` that implements the new `cuda::mr::memory_resource` and `cuda::mr::async_memory_resource` concepts which makes it usable as an upstream MR for `rmm::mr::device_memory_resource`. 

Also tests a pool made with this new MR as the upstream.

Note that the tests explicitly set the initial and maximum pool sizes as using the defaults does not currently work. See #1388 .

Closes #618

Authors:
  - Mark Harris (https://github.com/harrism)
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Michael Schellenberger Costa (https://github.com/miscco)
  - Alessandro Bellina (https://github.com/abellina)
  - Lawrence Mitchell (https://github.com/wence-)
  - Jake Hemstad (https://github.com/jrhemstad)
  - Bradley Dice (https://github.com/bdice)

URL: #1392
rapids-bot bot pushed a commit that referenced this pull request Jan 29, 2024
`rmm::available_device_memory` was added and the former `rmm::detail::available_device_memory` was deprecated in #1417. This PR removes the deprecated function.

Closes #1425

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Rong Ou (https://github.com/rongou)
  - Bradley Dice (https://github.com/bdice)

URL: #1438
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change cpp Pertains to C++ code improvement Improvement / enhancement to an existing function Python Related to RMM Python API
Projects
Status: Done
6 participants