You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
static inline __device__ uint64_t __match_any(int value) {
bool active = true;
uint64_t result = 0;
while (active) {
// determine what threads have the same value as the currently first active thread
int first_active_value = __builtin_amdgcn_readfirstlane(value);
int predicate = (value == first_active_value);
uint64_t m = __ballot(predicate); // THIS LINE IS PROBLEMATIC
// if the current thread has the same value, set its result mask to the current one
if (predicate) {
result |= m;
active = false;
}
}
return result;
}
There used to be compiler bugs making it hard to implement them as with the code above, but they have been fixed.
Feel free to use that code if you want to.
Best regards,
Epliz
The text was updated successfully, but these errors were encountered:
@cjatin , you were kind enough to have a look at #2 ; this one is quite similar, so I would appreciate if you could have a look at this one as well :) .
Epliz
changed the title
Missing warp match functions
Missing warp match functions in HIP
Sep 1, 2023
Hi,
As pointed out at ROCm/hipamd#65 , match_any/match_all are not available in HIP.
These are available in CUDA (cf. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-match-functions ), and can be implemented on AMD GPUs on Vega+ architectures (such intrinsic corresponds to "WaveMatch" in HLSL shader model 6.5 https://microsoft.github.io/DirectX-Specs/d3d/HLSL_ShaderModel6_5.html#wavematch-function which is supported by Vega+).
Therefore it seems like they can and should be added.
match_any
can for example be implemented as seen at llvm/llvm-project#62477 :There used to be compiler bugs making it hard to implement them as with the code above, but they have been fixed.
Feel free to use that code if you want to.
Best regards,
Epliz
The text was updated successfully, but these errors were encountered: