[FEA] Consider creating a pinned host memory resource that can leverage all of the allocation strategies #618
Comments
Since pinned host memory is accessible as device memory, I think we should create a |
This is still relevant, but now depends on NVIDIA/libcudacxx#105 |
Maybe relevant to cupy/cupy#4892? |
I'll take a crack at this since we also need it for Spark. |
If you proceed ahead of |
Is there an ETA on |
I thought we were close before Christmas. But we keep running into more design issues. |
Admittedly there may not be a clear answer to this, but are there ways for others interested in this to help pull |
Chatted with @jrhemstad offline, one option is to move the current |
No need to move it, just make a new one that inherits from |
I have been playing with this for a test rapidsai/cudf#14314 in cuDF. I think what @jrhemstad said here around the APIs being different between host and device allocations is important.

Actually, why would we want

Just for reference, what I did was I hacked out the streams and had a single

I do think we want to invest time on this given the results I am seeing in rapidsai/cudf#14314, so I'd be curious to know if anyone has cycles to take a look at this. |
Because pinned host memory is accessed by device copies and kernels, which are stream ordered. Any non-stream-ordered allocation would have to synchronize, losing a lot of the benefits of stream-ordered allocation. I think it would be a mistake not to provide a stream-ordered pinned host memory pool.
I am planning to work on this because I am also finding it is needed to improve spilling performance for Python. Current plan is to do it once we adopt cuda::memory_resource, because it turns the This said, we should also look into timing and cost (including opportunity cost) and
…ool_memory_resource`. (#1392)

Depends on #1417

Adds a new `host_pinned_memory_resource` that implements the new `cuda::mr::memory_resource` and `cuda::mr::async_memory_resource` concepts which makes it usable as an upstream MR for `rmm::mr::device_memory_resource`. Also tests a pool made with this new MR as the upstream.

Note that the tests explicitly set the initial and maximum pool sizes as using the defaults does not currently work. See #1388.

Closes #618

Authors:
- Mark Harris (https://github.com/harrism)
- Lawrence Mitchell (https://github.com/wence-)

Approvers:
- Michael Schellenberger Costa (https://github.com/miscco)
- Alessandro Bellina (https://github.com/abellina)
- Lawrence Mitchell (https://github.com/wence-)
- Jake Hemstad (https://github.com/jrhemstad)
- Bradley Dice (https://github.com/bdice)

URL: #1392
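The "pool with a pinned resource as its upstream" arrangement described in this commit message can be illustrated without CUDA or RMM. The following is only an analogy built on the C++17 standard library's `std::pmr`, not the RMM API: `counting_host_resource`, `pool_stats`, and `run_short_lived_allocations` are hypothetical names, and plain aligned `operator new` stands in for a pinned host allocation such as `cudaHostAlloc`. The sketch shows that many short-lived allocations made through a pool reach the (expensive) upstream resource only a few times.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>

// Hypothetical upstream resource. A real pinned host resource would call
// cudaHostAlloc / cudaFreeHost here; aligned operator new is a CUDA-free stand-in.
class counting_host_resource : public std::pmr::memory_resource {
 public:
  std::size_t allocate_calls = 0;  // how often the upstream was asked for memory
  std::size_t live_blocks = 0;     // upstream blocks not yet returned

 private:
  void* do_allocate(std::size_t bytes, std::size_t alignment) override {
    ++allocate_calls;
    ++live_blocks;
    return ::operator new(bytes, std::align_val_t{alignment});
  }

  void do_deallocate(void* p, std::size_t, std::size_t alignment) override {
    assert(live_blocks > 0);
    --live_blocks;
    ::operator delete(p, std::align_val_t{alignment});
  }

  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }
};

struct pool_stats {
  std::size_t upstream_calls;      // upstream allocations made by the pool
  std::size_t live_after_release;  // upstream blocks still held after the pool is gone
};

// Performs `count` allocate/deallocate pairs of `bytes` each through a pool
// layered on the counting upstream, then reports what the upstream observed.
pool_stats run_short_lived_allocations(std::size_t count, std::size_t bytes) {
  counting_host_resource upstream;
  {
    std::pmr::unsynchronized_pool_resource pool{&upstream};
    for (std::size_t i = 0; i < count; ++i) {
      void* p = pool.allocate(bytes);
      pool.deallocate(p, bytes);  // goes back to the pool, not to the upstream
    }
  }  // the pool's destructor returns all of its memory to the upstream
  return {upstream.allocate_calls, upstream.live_blocks};
}
```

Calling `run_short_lived_allocations(1000, 256)` should report far fewer than 1000 upstream calls (the exact number depends on the standard library implementation) and no live upstream blocks once the pool is destroyed. Unlike the stream-ordered resources discussed in this thread, `std::pmr` has no notion of CUDA streams, so this only illustrates the upstream/pool composition.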
Is your feature request related to a problem? Please describe.
I wish I could allocate a pool of pinned host memory for allocations that will often be short-lived. An example of when we would use this is when sending information between two nodes over TCP. We have to copy the data to host memory in order to send it over the wire. Having a pinned memory buffer would make this more efficient.
Describe the solution you'd like
Allow the memory providers to specify the low-level allocators they use, or provide some other way of doing this for just the specific case of pinned memory.
Describe alternatives you've considered
We currently have a fixed-size pinned memory pool that we use for this situation.
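For readers unfamiliar with the alternative mentioned above, a fixed-size pool might look roughly like the sketch below. This is not the requester's actual implementation: `fixed_block_pool` and its members are hypothetical, and `std::malloc` stands in for a single pinned allocation (for example via `cudaHostAlloc`) so that the example compiles without CUDA. It shows one up-front allocation carved into equal blocks, and the main limitation of the approach: once every block is in use, `acquire` has nothing left to hand out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical fixed-size pool: one up-front buffer split into equal blocks.
class fixed_block_pool {
 public:
  fixed_block_pool(std::size_t block_size, std::size_t block_count) {
    assert(block_size > 0 && block_count > 0);
    // Assumption: std::malloc stands in for one pinned host allocation.
    base_ = static_cast<std::byte*>(std::malloc(block_size * block_count));
    if (base_ == nullptr) { throw std::bad_alloc{}; }
    free_.reserve(block_count);
    // Push blocks in reverse so the lowest address is handed out first.
    for (std::size_t i = block_count; i > 0; --i) {
      free_.push_back(base_ + (i - 1) * block_size);
    }
  }

  ~fixed_block_pool() { std::free(base_); }

  fixed_block_pool(const fixed_block_pool&) = delete;
  fixed_block_pool& operator=(const fixed_block_pool&) = delete;

  // Returns a free block, or nullptr when the pool is exhausted.
  void* acquire() {
    if (free_.empty()) { return nullptr; }
    std::byte* block = free_.back();
    free_.pop_back();
    return block;
  }

  // Returns a block previously obtained from acquire() to the free list.
  void release(void* block) { free_.push_back(static_cast<std::byte*>(block)); }

  std::size_t available() const { return free_.size(); }

 private:
  std::byte* base_ = nullptr;
  std::vector<std::byte*> free_;  // blocks currently available
};
```

A pool like this is cheap and predictable, but every request gets the same block size and the capacity cannot grow, which is part of why a general pinned host memory resource that works with the existing allocation strategies is attractive.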