Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert device_memory_resource* to device_async_resource_ref #2269

Merged
merged 6 commits into from
Apr 24, 2024

Conversation

harrism
Copy link
Member

@harrism harrism commented Apr 17, 2024

Description

Closes #2261

For reviewers:
Many of changes are simple textual replace of rmm::mr::device_memory_resource * with rmm::device_async_resource_ref. However there are several places where RAFT used a default value of nullptr for device_memory_resource* parameters. This is incompatible with a resource_ref, which is a lightweight non-owning reference class, not a pointer. In most places, I was able to either remove the default parameter value, or use rmm::mr::get_current_device_resource(). In the case of ivf_pq, I removed the deprecated versions of search that took an mr parameter.

I removed the unused old src/util/memory_pool.cpp and its headers.

@harrism harrism requested review from a team as code owners April 17, 2024 12:14
@harrism harrism added improvement Improvement / enhancement to an existing function breaking Breaking change and removed cpp CMake labels Apr 17, 2024
@tfeher tfeher requested a review from achirkin April 17, 2024 12:36
Copy link
Contributor

@achirkin achirkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! There are plenty build failures atm (see below).

In the meanwhile, what's your opinion regarding get_workspace_resource, which didn't make it into this PR? From the RMM design point of view, shall we better change it to return the resource_ref (thus, losing the information about its type), or keep returning the pointer? Note, we provide extra helpers to get the limits from the limiting adapter.

cpp/bench/ann/src/raft/raft_ann_bench_utils.h Outdated Show resolved Hide resolved
@harrism
Copy link
Member Author

harrism commented Apr 18, 2024

I will have to dig in when I'm back on Monday. It all built successfully on my local copy.

@harrism
Copy link
Member Author

harrism commented Apr 22, 2024

@achirkin I didn't have those failures locally because apparently in the dev container build-raft-cpp -DBUILD_TESTS=ON -DBUILD_BENCHMARKS=ON does not build any TU that includes that header file (benchmarks). Can you tell me how to build RAFT benchmarks, and why RAFT uses a different approach to building them than all other RAPIDS repos?

In the meanwhile, what's your opinion regarding get_workspace_resource, which didn't make it into this PR? From the RMM design point of view, shall we better change it to return the resource_ref (thus, losing the information about its type), or keep returning the pointer? Note, we provide extra helpers to get the limits from the limiting adapter.

I don't want to tackle this until we redesign rmm::set_current_device_resource / rmm::get_current_device_resource, because that redesign can inform this one. FWIW, this is another place where RAFT does things differently than every other RAPIDS library. Consistency would be good.

@harrism
Copy link
Member Author

harrism commented Apr 23, 2024

Figured out how to build benchmarks.

option(BUILD_PRIMS_BENCH "Build raft C++ benchmark tests" OFF)
option(BUILD_ANN_BENCH "Build raft ann benchmarks" OFF)

Copy link
Contributor

@achirkin achirkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked that the PR doesn't change the current behavior w.r.t. usage of memory resources anywhere in raft, LGTM. Also thanks for fixing numerous mischiefs like unnecessary dynamic or c-style casts.

I don't want to tackle this until we redesign rmm::set_current_device_resource / rmm::get_current_device_resource, because that redesign can inform this one.

Could you please share, in which direction you plan to redesign these?

FWIW, this is another place where RAFT does things differently than every other RAPIDS library. Consistency would be good.

It may be doing things even more differently pretty soon. Please, have a look at the problem we're trying to solve: #2194 . If you have any suggestions on how we can solve the issues outlined in that PR more in line with other RAPIDS libraries, your feedback would be very appreciated!

cpp/include/raft/neighbors/ivf_flat-inl.cuh Show resolved Hide resolved
Comment on lines +134 to +135
configured_raft_resources(configured_raft_resources&&) = delete;
configured_raft_resources& operator=(configured_raft_resources&&) = delete;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mark this constructor as deleted only because it defaulted to deleted already due to raft::device_resources member, or there was a more specific/design issue?
If the former, I'd prefer to wrap the device_resources into a unique pointer to keep the move constructor (we move it in benchmarks when initializing algo wrappers). But I can do it later in a follow-on PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly. clang-tidy complains that this is explicitly defaulted even though it is implicitly deleted due to the deleted ctors in the res_ member.

If you are going to have these clang-tidy settings, why ignore the warnings? Wrapping the device resources in a unique ptr is up to you -- it's outside of my scope and the scope of this PR.

I think I can revert all changes to this file. Turns out the resource_ref header is no longer used.

Copy link
Member Author

@harrism harrism Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it should include device_memory_resource.hpp in order to correctly IWYU.

@harrism
Copy link
Member Author

harrism commented Apr 23, 2024

Could you please share, in which direction you plan to redesign these?

Once we have one, yes.

Please, have a look at the problem we're trying to solve: #2194 . If you have any suggestions

Thanks. I commented on #2194.

Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@cjnolet
Copy link
Member

cjnolet commented Apr 23, 2024

/merge

@rapids-bot rapids-bot bot merged commit 71a19a2 into rapidsai:branch-24.06 Apr 24, 2024
69 checks passed
rapids-bot bot pushed a commit that referenced this pull request May 15, 2024
Re-enable move constructor for the `configured_raft_resources`. It was implicitly deleted before, which was exposed and made explicit in #2269 .
Allowing move semantics here means avoiding an extra unwanted overhead during algorithm preparation in the benchmarks tool.

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2311
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CMake cpp improvement Improvement / enhancement to an existing function
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Replace device_memory_resource* with device_async_resource_ref
4 participants