-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM assertion when using i128
with the NVPTX
#38824
Comments
This may have been caused by the i128 PR ( #38482 ). No LLVM assertion is raised if I compile We could try to fix this or we could cfg away the i128 stuff for the nvptx targets in the meantine. The LLVM assertions render the NVPTX backend unusable in the nightlies. cc @est31 @nagisa Have you seen any LLVM assertion like the ones above while working on implementing i128? |
@japaric no, I can't remember seeing such an assertion. It seems to be in PTX related code. but its very likely that i128 broke some of the non tier1 platforms. You can only tell whether they break once they actually do :/ Can you try to generate a minimum reproducible example? Then we might know more... |
Seems like the msp430 target broke too:
with {
"arch": "msp430",
"asm-args": ["-mcpu=msp430"],
"data-layout": "e-m:e-p:16:16-i32:16:32-a:16-n8:16",
"executables": true,
"linker": "msp430-elf-gcc",
"llvm-target": "msp430",
"max-atomic-width": 0,
"no-integrated-as": true,
"os": "none",
"panic-strategy": "abort",
"relocation-model": "static",
"target-endian": "little",
"target-pointer-width": "16",
"vendor": "unknown"
} Using cc @pftbest |
core
to PTXcore
for NVPTX / MSP430
@est31 Examples #![crate_type = "lib"]
#![feature(i128_type)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]
fn foo() -> i128 {
0
}
#[lang = "copy"]
trait Copy {}
#[lang = "sized"]
trait Sized {}
#![crate_type = "lib"]
#![feature(i128_type)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]
fn foo(x: i128) {}
#[lang = "copy"]
trait Copy {}
#[lang = "sized"]
trait Sized {}
|
core
for NVPTX / MSP430i128
with the NVPTX and MSP430 targets
msp430 EABI doesn't specify how to return 128 bit values, and LLVM does not support them on msp430, so I don't see an easy fix for this. |
These are tier 3 platforms, so basically anything goes IMO in terms of language guarantees. If they don't support i128 at a fundamental level then it seems like if we want them to compile we'll have to cfg out support. |
I'm trying to understand this better:
I'm wondering if we should:
I personally don't like either option. To me this seems like a limitation in what LLVM can do today but that can be fixed in the future. Also, if we had pure MIR rlibs this wouldn't be a problem at all because the i128 stuff would never be passed to LLVM if it wasn't being used by the final application. To sum it up, I'm not sure what to do here ... |
Right now on MSP430 values from the functions are returned using 4 16bit registers r12-r15. So in total you can return only 64 bits of data. There is an exception though, for returning large structs, the data is placed on the stack and reference is placed in the registers. So I guess it will be possible to bodge some hack that will treat i128 like a struct, but I don't know if anyone tried to implement 128 bit values on 16bit cpu, so it will be a new territory and potentially a lot of work for a feature that may not be useful at all on a small MCU. |
@pftbest yeah, but if LLVM can generate code for a function that returns
How about |
AFAIK, if Rust code calls Rust code, the ABI is totally irrelevant. We can do whatever we want. However, it starts getting a problem once we start talking to non Rust places, e.g. for FFI, or when LLVM emits a call to i128 intrinsics.
Real need for I'm okay with any of the three options (flag in target.json, feature for |
No.
No. Namely the x64 windows ABI is kinda fuzzy about how i128 values are handled so carefully reading it we arrived at a conclusion that they ought to be handled the same way as regular 16-byte structs on 64 bit window platforms.
A target cannot not support i128 at a fundamental level the same way a target cannot not support writing turing complete software.
IMO what we should do here is to fudge the ABIs somewhat to take and return arguments via pointers (or the same way C ABI will be more complex; as @est31 already said there will always be at least one user of the C ABI with i128 arguments and stuff – LLVM with their intrinsics. The question here is how does LLVM lower to the intrinsics on these targets. If the backend does not lower and asserts for whatever reason, the targets have what’s essentially a broken backend implementation. Which seems to be broken for at least NVPTX (test.ll here) in exactly the place where I’d expect it to be broken if it was broken:
That being said, since the C compilers do not support i128 for those targets, the actual All that’s really needs to be fixed is LLVM’s lowering of instructions to intrinsics using whatever ABI that works, even if its passing in arguments and taking return value via pointers. I, myself, am very strongly against splitting the language feature sets depending on the target. |
How do we break the stalemate? If we submitted changes upstream for each of the smaller targets w/o support for i128 (providing indirection like large structs) it seems like it would meet resistance. As @pftbest says, it's likely not considered useful since there's many targets without any native registers or support for such large values. But we also also have reluctance to limit this type to only appropriate targets. Does the "vision for 2017 cycle" provide any guidance? Does "provide a more seamless FFI story" help push us in one direction or the other? |
@androm3da regardless whether useful or not, we do need to support i128 on all targets, otherwise people will start using libraries that provide it across platforms. Even if it needs to be "virtualized" in some way, and not wholly put into registers. In the worst case we'd have to provide some translation About FFI, I guess we can't really provide any FFI for i128 on targets where there is no C support for it, as generally the calling conventions only specify things that are part of C. |
None of the targets Rust supports have native support for 128-bit integers, other than the fact that some can implement unchecked division and multiplication involving 128-bit integer operand with a single insn.
Why? All that needs to be done here is to have the backends fixed. LLVM is supposed to be agnostic over bitwidth of integer, however useful or not somebody on the internet perceives such functionality to be. In other words, its a backend bug.
So, its probably pretty clear that all it takes is to write a patch to fix bugs in these backends wrt operations on large values. I have no knowledge of these targets, am not familiar with the backends and haven’t the hardware to test any fixes on, so these patches will have to be written by somebody else.
Late answer, but I’ve noticed this only now. This observation is wrong in multiple places. Firstly, ¹: For a less obscure example, x64 ABI used by code targetting Windows mandates |
The MSP430 ABI specifies that aggregates larger than 32 bits are passed and returned via pointer. The backend just doesn't implement sret correctly yet (and in fact is implemented targeting an old and outdated ABI in any case). I'm working on fixing this in LLVM. I'd bet that similar problems explain the NVPTX problems but I know nothing about that platform so I can't say that for sure. |
LLVM: Update submodule to include SRet support patch for MSP430. This patch is needed to fix rust-lang#38824 on MSP430. I know that LLVM 4 is coming soon, but it would be great to have at least one working nightly before the update. cc @awygle r? @alexcrichton
LLVM: Update submodule to include SRet support patch for MSP430. This patch is needed to fix rust-lang#38824 on MSP430. I know that LLVM 4 is coming soon, but it would be great to have at least one working nightly before the update. cc @awygle r? @alexcrichton
LLVM: Update submodule to include SRet support patch for MSP430. This patch is needed to fix #38824 on MSP430. I know that LLVM 4 is coming soon, but it would be great to have at least one working nightly before the update. cc @awygle r? @alexcrichton
@pftbest does that LLVM patch really fix NVPTX as well? |
@nagisa No, it doesn't. this should be reopened for NVPTX. |
BTW, thanks to @awygle for fixing this in LLVM upstream. |
i128
with the NVPTX and MSP430 targetsi128
with the NVPTX
Triage: MSP430 has been fixed. No change in the NVPTX backend. |
What needs to be done to get NVPTX working again? |
You can see the patch I did for the MSP430 above - if it's the same problem you basically just have to add sret (structure return) support to the LLVM backend. I don't know anything about NVPTX but I'd bet it's pretty much the same procedure as what I did for the MSP. |
@nathanaeljones Note that you can use the NVPTX backend today with the nightly compiler if you use a fork of the libcore crate that doesn't include 128 bit integers. Instructions here. |
I have some progress on the issue. First, I tried a hacky workaround with replacing define i128 @internal(i128, i128) unnamed_addr #0 {
%a = mul i128 %0, %1
ret i128 %a
} Can be transformed to: define <2 x i64> @internal(<2 x i64>, <2 x i64>) unnamed_addr #0 {
%3 = bitcast <2 x i64> %0 to i128
%4 = bitcast <2 x i64> %1 to i128
%5 = mul i128 %4, %3
%6 = bitcast i128 %5 to <2 x i64>
ret <2 x i64> %6
} And compiled to PTX with vanilla LLVM (
Then, I made less hacky workaround patching LLVM NVPTX backend so it became possible to compile the code with target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
define i128 @internal(i128, i128) unnamed_addr #0 {
start:
%a = mul i128 %0, %1
ret i128 %a
}
define ptx_kernel void @foo(i128, i128, i128*) unnamed_addr #0 {
start:
%a = call i128 @internal(i128 %0, i128 %1)
store i128 %a, i128* %2
ret void
}
attributes #0 = { norecurse nounwind readnone } Is compiled with patched
But another %a = mul i128 %0, %1 I write: %a = sdiv i128 %0, %1 I receive an error:
And similar error when I try to compile
Does anybody have any suggestions about the problems? Looks like PTX doesn't have an instructions for the functions, and LLVM can't expand them. |
Looks like the expansion from i128 to i64 is going OK but the i64 sdiv/mulo instructions aren't properly implemented. Not sure why that might be... it kind of looks like it's trying to assign the address of a library call and that's not working. Possibly the way you implemented the i128 expansion prevented a subsequent i64 expansion from running but I can't tell without seeing the code. Run llc with -print-before-all and/or -print-machineinstrs to see where things diverge from what you expect. Look up in your platform ABI what the right action to take for 64-bit divides actually is. Post a link to your changes and someone may be able to take a look at them and see the problem. |
Traditionally, in LLVM, |
@awygle Here is an output of the next commands:
I tried to make changes over LLVM as minimal as possible: diff. @est31 It still require some more changes for LLVM to be able to threat libcalls as normal calls. But then, all we would need is to link against |
For the target. At least its that way on other platforms. What is LLVM doing on 32 bit NVPTX when it encounters a 64 bit operation? On normal targets, this is one of the places compiler-rt gets used. Or idk, to cast a float to an int and back. |
@denzp by the way, I didn't congratulate you yet for your progress. Really great work, loving it! |
I'd like to see this move forward, so I'll summarize the above. Please correct me if I'm wrong:
Is this problem still happening with more recent builds? Perhaps it's been fixed by upstream in the meantime. If it is still a problem, what would have to be done to fix it? |
Well, my patch related to i128 lowering landed in LLVM about year ago. Now we can compile simple examples without any problems: #![feature(lang_items)]
#![feature(no_core)]
#![crate_type = "lib"]
#![no_core]
#[no_mangle]
pub fn foo() -> i128 {
1
}
#[lang = "copy"]
trait Copy {}
#[lang = "sized"]
trait Sized {} with which produces the PTX assembly:
And as for me, the issue can be closed. I was able to compile a kernel from |
This effectively means you can't compile
core
for these targets.STR
Meta
Already Fixed
The text was updated successfully, but these errors were encountered: