-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Issue]: standard math operations on vector types fail to build with GNU g++ #74
Comments
can you share you |
Version:
Machine:
This supports
A possible alternative to the |
Hi @Oblomov, internal ticket has been created to investigate your issue. Thanks! |
Hi @Oblomov thank you for reaching out! Is there a specific use case you have in mind that would require g++ instead of clang? The solution you provided would work, but it would treat the vectors as simple vectors instead of SIMD vectors, which can result in significant performance gaps. Hence, it is hard for us to justify supporting g++ at all. |
Hello @tcgu-amd, thanks for getting back to me on this issue. I first came across the issue while adding support for AMD HIP to GPUSPH (you can find some of the preliminary work in that direction in the GPUSH is a very complex project with several optional dependencies that aren't always well-supported by device compilers, so the host and device side of the software are normally built with separate compilers (the default host compiler, which is usually Some vector types are used on host too (even if not for computationally intensive operations), so the header needs to compile successfully regardless of compiler usage. I would argue that this should be the case regardless of my specific use case, and even if this leads to suboptimal performance on host. I would like to stress that I'm referring here specifically to the host side, since the device side would still be compiled with clang-based hipcc and thus still use the That being said, I have just run some simple tests and A trivial example is this: #include <hip/hip_vector_types.h>
#include <iostream>
__attribute__((noinline))
float4 doit(float4 const& v1, float4 const& v2)
{
return v1*v2;
}
int main(int argc, char* argv[]) {
const float c1 = argc > 1 ? atof(argv[1]) : 1.0f;
const float c2 = argc > 2 ? atof(argv[2]) : 2.0f;
const float c3 = argc > 3 ? atof(argv[3]) : 3.0f;
const float c4 = argc > 4 ? atof(argv[4]) : 4.0f;
const float4 v1{c1, c2, c3, c4};
const float4 v2{c4, c3, c2, c1};
float4 mul = doit(v1,v2);
std::cout << mul.x << " " << mul.y << " " << mul.z << " " << mul.w << std::endl;
} with the header patched as suggested in the issue description. Looking at the disassembly of the 00000000000013e0 <_Z4doitRK15HIP_vector_typeIfLj4EES2_>:
13e0: 66 0f 6f 07 movdqa (%rdi),%xmm0
13e4: 0f 28 16 movaps (%rsi),%xmm2
13e7: 0f 59 d0 mulps %xmm0,%xmm2
13ea: 0f 29 54 24 e8 movaps %xmm2,-0x18(%rsp)
13ef: f3 0f 7e 4c 24 f0 movq -0x10(%rsp),%xmm1
13f5: f3 0f 7e 44 24 e8 movq -0x18(%rsp),%xmm0
13fb: c3 ret
13fc: 0f 1f 40 00 nopl 0x0(%rax) |
@Oblomov Thanks for adding the context! I was not aware that gcc autovectorizes loops; this is neat. I will create an internal ticket to investigate adding this patch to the source code. |
@Oblomov Good news! Seems like the functionality you suggested is already implemented! It was part of this commit d4799b2 and has been present since rocm 6.1.x https://github.com/ROCm/clr/blob/rocm-6.1.x/hipamd/include/hip/amd_detail/amd_hip_vector_types.h#L553. I just ran your reproducer on 6.2 and it was working. Would you mind trying on 6.2 (or 6.1) and let us know the result? Thanks! |
Hello @tcgu-amd thank you very much, I have tested 6.1.0 and 6.2.1 and the issue seems to be fixed indeed. Thank you very much, I'm closing the issue since I consider it solved for me. |
Problem Description
The standard math operators don't work on vector types as defined by
amd_hip_vector_types.h
when usingg++
as host compiler.Operating System
Debian trixie/sid
CPU
AMD Ryzen 7 PRO 6850H with Radeon Graphics
GPU
AMD Instinct MI300X, AMD Instinct MI300A, AMD Instinct MI250X, AMD Instinct MI250, AMD Instinct MI210, AMD Instinct MI100, AMD Radeon Pro W7900, AMD Radeon Pro W6800, AMD Radeon Pro V620, AMD Radeon Pro VII, AMD Radeon RX 7900 XTX, AMD Radeon RX 7900 XT, AMD Radeon VII
ROCm Version
ROCm 6.0.0
ROCm Component
HIP
Steps to Reproduce
A simple test case reproducing the issue is:
This builds correctly with
hipcc
orclang++
, but not withg++
, where the following error is shown instead:(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
This is most likely due to the way
__NATIVE_VECTOR__
gets defined, which in this case becomesThe operators need to be redefined to something like
and similarly for all other operators.
Some notes:
this issue has been present since HIP version 4.2 at least and is still present in 6.0.3
I had previously submitted a patch to fix this and a number of other issues to the old
ROCm/hipamd
repository.this is independent from the specific GPU, it's a header-level issue.
The text was updated successfully, but these errors were encountered: