[Issue]: assert doesn't appear to be implemented on CUDA backend #3719
Comments
@neworderofjamie the HIP implementation is operating as designed: an assert on the device communicates with the host at the point of the problem and then traps, much as assert works in host code.
I cannot believe that that is true as:
I just found ROCm/hipother#1. I think this is the cause of the bug.
Hi @neworderofjamie, you are right: if you don't include any assertion library, asserts will be expanded according to the PR that you linked. Using

    $ hipcc -std=c++11 -E hip.cc | grep -n -C4 "rowLength < 4"
    56747-
    56748-# 8 "hip.cc" 3 4
    56749- { if (!
    56750-# 8 "hip.cc"
    56751: rowLength < 4
    56752-# 8 "hip.cc" 3 4
    56753- ) { { asm("trap;"); }; } }
    56754-# 8 "hip.cc"
    56755- ;

The condition

If you add

    $ hipcc -std=c++11 -E hip.cc | grep -n -C4 "rowLength < 4"
    56755-
    56756-# 8 "hip.cc" 3 4
    56757- (static_cast <bool> (
    56758-# 8 "hip.cc"
    56759: rowLength < 4
    56760-# 8 "hip.cc" 3 4
    56761- ) ? void (0) : __assert_fail (
    56762-# 8 "hip.cc"
    56763: "rowLength < 4"
    56764-# 8 "hip.cc" 3 4
    56765- , __builtin_FILE (), __builtin_LINE (), __extension__ __PRETTY_FUNCTION__))
    56766-# 8 "hip.cc"
    56767- ;

then it works the same way as your CUDA code. I'll work with the HIP team to get the PR merged.
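For anyone following along, here is a minimal host-only sketch of why the first expansion quoted above traps regardless of the condition. In that expansion the argument is pasted directly after `!`, so under C++ precedence `!rowLength < 4` parses as `(!rowLength) < 4`, which is true for every `rowLength`. The macro names, the counter, and the helper functions are hypothetical stand-ins for illustration. They are not the actual HIP or glibc definitions, and a counter replaces `asm("trap;")` so the code can run on the host.

```cpp
#include <cassert>

// Hypothetical stand-in for asm("trap;"): count "traps" instead of aborting.
static int g_trap_count = 0;
#define TRAP_STAND_IN() (++g_trap_count)

// Shape of the first expansion: the argument follows '!' with no parentheses.
#define ASSERT_BARE(cond) { if (!cond) { TRAP_STAND_IN(); } }

// Shape of the second expansion: the argument is wrapped before it is tested.
#define ASSERT_WRAPPED(cond) \
    (static_cast<bool>(cond) ? void(0) : void(TRAP_STAND_IN()))

// Returns true if the bare macro "trapped" for this rowLength.
bool bare_traps(int rowLength) {
    g_trap_count = 0;
    ASSERT_BARE(rowLength < 4);  // expands to: if (!rowLength < 4) ...
    return g_trap_count != 0;
}

// Returns true if the wrapped macro "trapped" for this rowLength.
bool wrapped_traps(int rowLength) {
    g_trap_count = 0;
    ASSERT_WRAPPED(rowLength < 4);  // tests !(rowLength < 4)
    return g_trap_count != 0;
}
```

With `rowLength = 2`, `bare_traps` returns true even though `2 < 4` holds, while `wrapped_traps` returns false. With `rowLength = 10`, both return true. GCC and Clang can flag the bare pattern via `-Wlogical-not-parentheses`.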
@neworderofjamie you're right, I misunderstood the question. Thanks for the patch; I'm not sure why this was not observed earlier. I do want to reiterate that the behavior of asserts (when correctly triggered) from HIP kernels running on ROCm will differ from that of CUDA kernels running on NVIDIA, and that difference is by design.
Thanks, @zichguan-amd,
Which I read to mean that you only need to include hip/hip_runtime.h for device asserts. @b-sumner, can you clarify what you mean about the behaviour of asserts? Unlike in the CUDA documentation, the behaviour is not described, so this would be good to know.
What is the intended purpose of the macro-based assert? I am now trying to apply the fix to our actual code and it turns out we are including |
If you trace down the include headers from
Device-side implementation implements I guess the question is why in |
Problem Description
Using assert in HIP doesn't work with the CUDA backend.
Operating System
Ubuntu 20.04.6 LTS (Focal Fossa)
CPU
Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz
GPU
NVIDIA RTX A5000
ROCm Version
ROCm 6.2.4
ROCm Component
HIPCC
Steps to Reproduce
compile and run the following with CUDA:
using
Observe that the kernel runs correctly and that it prints "CUDA 2 < 4" 32 times.
compile and run the following with HIPCC using the CUDA backend:
using:
and observe that while "HIP 2 < 4" is printed 32 times, the kernel exits with an unspecified launch failure.
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
If you look at the generated PTX, the reason for this seems clear. When compiling with nvcc you see:
which appears to be a branch on the condition %r1 < 4 and some sort of function call to assert. However, when compiling with hipcc, there is no conditional code, just a trap: