Int128 support on Windows [llvm bug] #3081
Attached is C code demonstrating the issue. Observe that gcc (correctly) passes the argument to `test` by reference, while clang passes it in registers:

```c
// clang -S test.c -O2 -m64 -o test-clang.s
// x86_64-w64-mingw-gcc -S test.c -O2 -m64 -o test-gcc.s
int __attribute__((noinline)) test(__int128 x) {
    return x + 19432;
}

int main() {
    return test(23521);
}
```

gcc generated assembly (correct):

```asm
	.file	"test.c"
	.text
	.p2align 4,,15
	.globl	test
	.def	test;	.scl	2;	.type	32;	.endef
	.seh_proc	test
test:
	.seh_endprologue
	movq	(%rcx), %rax
	addl	$19432, %eax
	ret
	.seh_endproc
	.def	__main;	.scl	2;	.type	32;	.endef
	.section	.text.startup,"x"
	.p2align	4,,15
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
	subq	$56, %rsp
	.seh_stackalloc	56
	.seh_endprologue
	call	__main
	leaq	32(%rsp), %rcx
	movq	$23521, 32(%rsp)
	movq	$0, 40(%rsp)
	call	test
	addq	$56, %rsp
	ret
	.seh_endproc
	.ident	"GCC: (GNU) 4.8.1 20130329 (prerelease)"
```

clang generated assembly (incorrect):

```asm
	.def	test;
	.scl	2;
	.type	32;
	.endef
	.text
	.globl	test
	.align	16, 0x90
test:
	.cfi_startproc
	leal	19432(%rcx), %eax
	ret
	.cfi_endproc

	.def	main;
	.scl	2;
	.type	32;
	.endef
	.globl	main
	.align	16, 0x90
main:
	.cfi_startproc
	pushq	%rbp
.Ltmp2:
	.cfi_def_cfa_offset 16
.Ltmp3:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
.Ltmp4:
	.cfi_def_cfa_register %rbp
	callq	__main
	movl	$23521, %ecx
	xorl	%edx, %edx
	popq	%rbp
	jmp	test
	.cfi_endproc
```
It would be good to have tests to catch all these issues that we keep discovering on Windows. With the effort to use Intel C++ Composer on Windows, it would be nice not to fall into the same trap again.
@vtjnash I am not sure that your assembler listings confirm the issue. Windows 64-bit and Linux 64-bit use different calling conventions, so comparing the output of Clang and MinGW is not fair. If you check with GCC (not MinGW) you will get the same result as Clang (I just checked). We need to somehow verify that LLVM generates code incorrectly. Also, it may be related to this LLVM issue
@vharavy What makes you think I'm not running clang for a Windows 64-bit target? This should be a completely fair comparison. To prove my point: if you look more carefully, the assembly generated by clang (and gcc) is PE, not ELF -- meaning it was generated to run on 64-bit Windows (just not correctly). I am well aware of the difference in calling conventions and looked up the Windows calling convention before declaring gcc to be correct (e.g. you should get the same result if you test with the Intel or Microsoft compilers). Demonstrating that this bug does not exist on your Linux box is not productive. I could have used gcc without the mingw prefix on my Windows box in the same way I used clang. This does demonstrate that LLVM generates incorrect code on Windows. I'm also not sure how the inability of LLVM to generate a certain assembly-instruction optimization would affect its choice of calling convention (although I had already read through the list of missing and broken features in the source code to check).
@vtjnash OK, I apologize. I did not understand your post clearly. LLVM is in its bug-fixing phase and the 3.3 release is scheduled in about 3 weeks. We may try to check this new version to see if the problem has been resolved. Here is the link to the LLVM 3.3 SVN branch.
@vharavy No need to apologize; I was simply making the observation that the outputs are not the same even beyond the superficial differences (including the use of different registers). It might be helpful if you could confirm that the Intel and Microsoft compilers are compliant with the Windows standard here. Out of curiosity, do those compilers have a 128-bit integer type? (I was hoping to avoid the need to compile clang 3.3 for Windows, and was just going to submit it as an issue after working out my thoughts here. But perhaps drangon's script https://code.google.com/p/mingw-w64-dgn/ will be able to help me.)
@vtjnash Unfortunately, neither the Microsoft compilers nor Intel C++ Composer for Windows defines a 128-bit integer type. However, the following code

```c
typedef struct _int128_t
{
    __int64 hi;
    __int64 lo;
} int128_t;

int __declspec(noinline) test(int128_t x)
{
    return x.lo + 19432;
}

int main()
{
    int128_t x;
    x.hi = 0;
    x.lo = 23521;
    return test(x);
}
```

produces the following result for the Microsoft compiler (VS 2012 Update 2):

```asm
test:
	mov	rax,qword ptr [rcx+8]
	add	rax,4BE8h
	ret
main:
	sub	rsp,38h
	mov	qword ptr [rsp+20h],0
	mov	qword ptr [rsp+28h],5BE1h
	movaps	xmm0,xmmword ptr [rsp+20h]
	lea	rcx,[rsp+20h]
	movdqa	xmmword ptr [rsp+20h],xmm0
	call	000000013F7A1000
	add	rsp,38h
	ret
```

and for Intel C++ Composer 2013 Update 2 for Windows:

```asm
main:
	sub	rsp,40h
	xor	ecx,ecx
	mov	rax,qword ptr [3F943000h]
	xor	rax,rsp
	mov	qword ptr [rsp+30h],rax
	call	000000013F9418A0
	xor	eax,eax
	lea	rcx,[rsp+20h]
	mov	edx,5BE1h
	mov	qword ptr [rcx],rax
	mov	qword ptr [rcx+8],rdx
	call	000000013F941050
	mov	rcx,qword ptr [rsp+30h]
	mov	r15d,eax
	xor	rcx,rsp
	call	000000013F941070
	mov	eax,r15d
	add	rsp,40h
	pop	r15
	ret
test:
	mov	eax,dword ptr [rcx+8]
	add	eax,4BE8h
	ret
```

As you may see, in both cases the 16-byte struct is passed by reference (a pointer in rcx), not in registers.

P.S. Intel C++ Composer for Windows integrates very tightly with the Microsoft compilers. It cannot be used without having the Microsoft SDK or some version of Visual Studio installed, and while compiling it defines both
I believe the typename is supposed to be _m128 http://msdn.microsoft.com/en-us/library/94z15h2c.aspx

x86_64 is little endian, so the lo member should come first. Taking that into account, the assembly seems to agree with GCC:

```c
typedef struct _int128_t
{
    __int64 lo;
    __int64 hi;
} int128_t;
```

Were you able to check whether the compiler defines _m128?
Nevermind, I was able to confirm that
@vtjnash It seems not. There are some x64 intrinsics like
In regard of
The spec is very clear and consistent (although it's a bit hard to find all of the pieces). _m128 is the same as the struct you created above. http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx
Yes, I can see that now. It is defined as follows:

```c
typedef union __declspec(intrin_type) _CRT_ALIGN(16) __m128 {
    float            m128_f32[4];
    unsigned __int64 m128_u64[2];
    __int8           m128_i8[16];
    __int16          m128_i16[8];
    __int32          m128_i32[4];
    __int64          m128_i64[2];
    unsigned __int8  m128_u8[16];
    unsigned __int16 m128_u16[8];
    unsigned __int32 m128_u32[4];
} __m128;
```

P.S. By the way, I don't really see why Julia needs support for 128-bit integers. My opinion is that everything except basic integer and floating-point types, arrays, and strings should be moved into separate packages. I like the way R does it - you can specify which packages you need during start-up and they will be loaded automatically; other packages can always be loaded manually. Julia defines almost everything in Base and thus has a 6-megabyte
Complaining about our documentation in this context is totally gratuitous.
@JeffBezanson You may agree that there are a lot of things to worry about (and to work on) besides 128-bit integers. Anyway, it was just a cry from the heart - it needs to be done, and we will get it done.
submitted upstream as http://llvm.org/bugs/show_bug.cgi?id=16168
This is actually not terribly hard to fix. I'll have a go at it.
@loladiro Fixing arbitrary int128 functions is easy (I assume you already have that on your branch). The annoying part is that all of the LLVM math intrinsics are calling the functions with the wrong ABI, and we don't have a way to control that, short of reimplementing them.
I know what's involved and where to fix the llvm intrinsics.
This patch fixes the tests. However, this would also need to be in
By some cruel trick, my copy of GCC had managed to exactly copy the return values where LLVM thought they should be, making this patch succeed only on my machine. But in fact, LLVM is unable to model the int128 return type on windows and so will typically fail even with this patch -- another patch will be forthcoming.
I'm still seeing failing numbers tests on Windows. Should this have fixed it?
Did you force llvm to rebuild? (`cd deps/llvm-3.3/build_Release && make install`)
I had, but somehow it didn't propagate. I just rebuilt everything from scratch, though, and it worked fine.
Oh, and you need to delete libjulia.
Windows 64-bit segfaults when attempting `div`, `mod`, or `rem` on 128-bit numbers (it tries to call the gcc intrinsic `__umodti3`). This is because LLVM does not use the correct calling conventions for 128-bit numbers on Windows.