-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API flag for whether an AQL queue should be allocated in device memory. #269
Comments
yeah, I got dragged elsewhere and sort of paused on this end. Yes, we would need a new API that takes flags. I had a conversation with ROCr owner about this, but need to start that over again |
That'd be wonderful! I haven't yet gotten my code working with it as I'm getting GPU faults when HSA_ALLOCATE_QUEUE_DEV_MEM=1, but that's likely my issue (or an issue with some of my queues needing to be in host memory and some in device memory). Thanks for pushing on this: I'm round-tripping through system memory for queue operations and it's taking an eternity :) |
i've looped you in on an internal chat |
@saleelk or @dayatsin-amd are either of you taking a look at this? If not, I can create a PR for this. |
Related to this, currently the |
@saleelk I'm taking a look at this and it seems fairly straightforward to add an API-based queue allocation mechanism for getting the queue buf in dev mem, additionally, using the same finegrain allocation method seems to work for the queue struct itself. If I create a PR to support this that removes the ENV variable would that be ok? Or are you already relying on the ENV variable? If so I can just mark it deprecated so future users rely on the API and not the ENV var. |
In 3baaa6e @saleelk added the ability to allocate an AQL queue in device memory via the
HSA_ALLOCATE_QUEUE_DEV_MEM
environment variable. I'd like to be able to use this programmatically with a queue creation flag as some of my queues are better in system memory and others in device memory (particularly those where I'm producing for the queue on the same device consuming it). In addition, as a middleware layer I don't control the environment and options that are set via them are generally not accessible.hsa_queue_create doesn't seem to have a flags arg and I'm not sure of the best route for adding one such that creation could be controlled. Maybe an hsa_amd_queue_create?
The text was updated successfully, but these errors were encountered: