-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ill-typed unused FFI declarations can cause UB #46188
Comments
I haven't actually seen this in practice, but in brendanzab/gl-rs#438 I think someone has (erroneously) declared |
Some questions:
|
triage: P-medium We should fix this, but not a burning issue. |
Note that this is not restricted to I wondered whether, in some cases, LLVM could merge the attributes of the declarations with the definition, introducing UB in safe Rust code like this: fn foo(x: *mut u8) { ... }
extern "Rust" {
fn foos_mangled_name(x: &mut u8);
} The above link shows that, currently, when merging It is unclear to me whether making the definition |
@nikomatsakis if the incorrect declaration is in a different crate, could it become visible here due to LTO ? If so, then warning against this would need some kind of whole-program analysis. |
my review of current status: running the initial example in the playground, it looks like a warning is emitted. running the example through MIRI also in the playground does not produce any additional warnings besides that the allocation is too large. I see that in the rust-lang/unsafe-code-guidelines issue which mentions this one there is interesting discussion, but neither the docs in unsafe-code-guidelines nor the rustonomicon ffi page appear to mention this in the currently built versions of their docs. it doesn't appear that there's a ready-to-implement suggested fix in either this thread or that thread. |
Miri checks the UB listed here, and so far there is no Rust-level UB related to these invalid declarations. So arguably, right now, the code is fine but the LLVM backend handles it incorrectly. However, I am not sure if there is any way for the LLVM backend to prevent LLVM from "merging" these prototypes, or if there is any hope of adjusting LLVM to not do that any more. Ideally, if "merging" has to happen, the merged definition would have the minimal assumptions of both inputs, and if the inputs are incompatible, that should just be an error. |
Is this still an issue? It looks like since 1.63, rust no longer removes the branch for the shown example https://godbolt.org/z/n9PYEKexh |
I don't see any reason to assume that this changed in any fundamental way. Examples that show UB are always fragile, but LLVM to my knowledge still assumes the one-definition-rule. |
rust-lang/unsafe-code-guidelines#198 discusses the issue. There isn't a "one definition rule" applicable here; the issue is about compatible/composite declarations and comes from LLVM's C language semantics. I'm not going to repeat everything I said there, but it's all relevant to this. It is surprising that adding an unused declaration affects the compilation of a Rust program. However, unused declarations affecting the compilation are explicitly allowed by ISO C compatible declaration rules. It seems like rustc could avoid telling LLVM about unused declarations to avoid subjecting Rust programs to those rules. Or, rustc could just emit a different stronger warning for seemingly-unused |
How do you define unused declarations? Are they unused if there are no references, if all referencing functions are dead code, or if all references aren't used during the current execution (but may be during another execution)? |
Focusing on unused declarations seems very fragile. The case of two used declarations with different attributes is just as dangerous, where one the attributes of one declaration might be applied to the unsuspecting other declaration. |
rust-lang/unsafe-code-guidelines#198 (comment) explains why this is especially dangerous if things are really operating as described, using an example of how one can publish a crate with no |
There's a lang team thing about handling this by having people put |
Cranelift doesn't allow multiple function declarations in the same module with different signatures (at clif ir level) at all. It returns an |
@bjorn3 I think "same module" matters for non-LTO builds but for LTO builds then all the declarations in the whole program matter. |
What about this? rust/library/std/src/sys/unix/args.rs Lines 207 to 212 in 6c9c2d8
Or does that not count because they have different names (it would be pretty easy to imagine code that didn't use different names for this, though...) |
I opened a PR for this: #128247. However, I think it's only a partial solution... I have no idea what to do for the general case, where there are two imports of the same function with arbitrarily different signature. See the PR description. |
Turns out this might be too hard for me to fix. I am also increasingly skeptical whether LLVM will cope well with functions declared as Nominating for t-compiler discussion to see what they are thinking about how to best resolve this issue. We may also want to re-consider the priority of this, as 4 years ago when the priority was set, it wasn't yet clear what the actual issue is. |
@rustbot label: +I-lang-nominated The proposed fixes for this issue might not practical, at least not without some help from the LLVM side, so we should consider what we want to do about this as a language. Members of T-compiler in particular suspect that merely issuing a hard error in response to a pair of FFI declarations with distinct types may not suffice; we suspect there are users who are relying on being able to do that, and there is some underlying notion of "compatible" pairs of types that is being used to justify such code. |
The next plan for #128247 is to try declaring functions with an "empty" signature (no arguments, The main concern with that approach seems to be that LLVM's special treatment for specific library functions relies on having the signature in the declaration. |
So... what happens when I import the same symbol name as an extern static and as a function? Right now, it seems like the static "wins":
This code is obviously UB, but if I make the static a 1-ZST there's not actually going to be any memory access there. |
Note that the cranelift backend fundamentally cannot represent such different signatures. I am not sure if there's any way to use a scheme like "declare all functions as taking no arguments and then define the rest for each call site" there -- @bjorn3? On the other hand, given cross-crate monomorphization, it is really hard to ensure that we'll never get this cranelift error -- we have to ensure that the signature is consistent (at least insofar as that would affect cranelift function types) across the entire crate graph. If two crates use signatures that are "meaningfully" different (whatever exactly that means), they may work fine in isolation but the program may then fail to build when they are both used in the same dependency tree. That's pretty bad... |
For functions it would be possible to declare them with an empty signature and then after referencing it in a function overwrite the signature for this reference, but that would require knowing all function definitions in the codegen unit in advance (as the definition still needs to match the declaration), which is incompatible with anything like LTO as there you did have to make the choice for function signature in one module and when you merge it with another module containing the definition the function signature has to match. For declaring a function and and a static with the same name, this hack wouldn't work either. You can't reference a function/static as the other type from a module. |
I am confused. What does "overwrite" mean here? What I meant above (what #128247 does in LLVM) is to always keep the declaration with an empty signature, it never gets changed or "overwritten" in that sense. But when we call the function we give it some arguments anyway and that works. Maybe this could be encoded in cranelift by taking the global function, casting the function pointer to a different type that matches what this call site uses, and then calling that? |
In Cranelift IR a You can't use an empty signature for a declaration of a function that will be locally defined however. There is a check that the function signature of the |
That seems to contradict your previous statement where you said "after referencing it in a function overwrite the signature for this reference", so I am still confused. Can the per-function import of the per-module table have a different type than the per-module table or not? If no then this entire thing doesn't seem relevant here as anyway it can't be used. And even if it can, one could have calls of two different imports of the same external function inside one function so a per-caller type signature doesn't help -- we need a per-call-site signature. (Though it would help insofar as it would make the signature consistency check more local, at least if we ignore inlining.) But I take it the upshot is that cranelift tries to be somewhat typesafe and if a function is declared with a given signature, you have to call it like that. However, presumably it is also possible to create function pointers and cast them to a different type, so that way it'd still be possible to declare them with one signature but call them with another? |
Incorrect static declarations can also introduce immediate undefined behaviour, even when they are never used at runtime. From LLVM reference:
An example where incorrect size of an unused static introduces undefined behaviourextern {
// Suppose that the actual definition is zero sized.
static S: usize;
}
// Requires: a == 0
#[inline(never)]
pub unsafe fn f(mut a: usize) -> usize {
let mut s = 0;
while a != 0 {
s += unsafe { S };
a -= 1;
}
s
} Is optimized to: define noundef i64 @f(i64 noundef %0) unnamed_addr #0 {
start:
%_4 = load i64, ptr @S, align 8
%_4.fr = freeze i64 %_4
%1 = mul i64 %_4.fr, %0
ret i64 %1
} |
Does this affect linkme? The In other words, is this still a problem if the static is never accessed, even in dead code? (It only has its address taken to find out where the section ends). |
Yes, such a static is UB even if never accessed. Statics act basically like globally available references that must be always-dereferenceable.
|
Functions would be represented something like the following in clif ir:
Here
However if you want to also import "imported_function" as static, you did get an error as "imported_function" is already declared as function. In addition if you were to merge the above module with a module containing a definition of "imported_function":
that wouldn't be possible as you get two different declarations of "imported_function". This is not much of an issue currently as Cranelift currently doesn't benefit from LTO (which would be the main reason to want to merge modules) at all due to all optimizations being strictly within a single function, but it might in the future and I don't want to exclude support for it.
Cranelift has separate instruction for calling a function directly and calling a function pointer. The latter is less efficient as it doesn't allow directly encoding the target address in the call instruction. (By the way cranelift doesn't have typed function pointers, it only has pointer-sized integers. Every use of the call_indirect instruction has to specify the function signature.) |
Okay, so encoding calls with the "wrong" signatures via indirect calls works but is inefficient. Makes sense. Maybe it could still be done just for the very rare case where there's a signature mismatch, but it's not pretty. It also seems likely that cranelift will outright reject merging such modules if they have a signature mismatch rather than introducing such casts. From a Rust perspective, the cranelift signature captures only a small part of what makes up a type, so we could allow some mismatches but not arbitrary mismatches. It would probably have to boil down to something like, the signatures have to be ABI-compatible. The other point regarding the "signature check" you mention is something I haven't even considered yet for the LLVM case -- merging two modules where one module defines a function and the other one uses it. In that case we'll always want the definition to win. I assume the LLVM encoding via But IIUC the upshot is, non-ABI-compatible imports of the same symbol anywhere in the crate graph are likely never going to be fully supported in cranelift (as that would require cranelift merging modules with incompatible signatures, and doing the required adjustments to make the result a well-defined module). |
…try> codegen: do not set attributes on foreign function imports Currently we are setting all the usual attributes when translating an FFI import, like ```rust extern { pub fn myfun(x: NonZeroI32) -> &'static mut (); } ``` into LLVM. However, if the same function is imported multiple times as different items in Rust, that all becomes a single LLVM declaration. The attributes set on any one of them may then apply to calls done via other imported items. For instance, the import above, *even if it is never called*, can introduce UB if there is another part of the program that also imports `myfun`, calls it via that import, and passes 0 or `myfun` returns NULL. To fix that, this PR changes the function declarations we emit to all look like `declare `@fn()`.` The one exception are Rust's own allocator functions, which have a bunch of attributes that LLVM currently only honors in the declaration, not the call site -- though with llvm/llvm-project@1953629 we can maybe avoid that in the future. The hope is that this fixes rust-lang#46188. The only cases I can still think of where we emit different declarations with the same name and one of them "wins" are: - Rust's native allocation functions. Even then, the "wrong" declarations will just have no argument and return value so I don't think this should cause problems. - Two `extern static` with the same name but different type, or an `extern static` with the same name as an imported function. Again I doubt the "type of the static" will lead LLVM to deduce things about the function or vice versa, so probably the worst that can happen is LLVM crashes. (And of course if this static actually ends up resolving to a function, that's UB unless the static has size 0. And conversely, if it resolves to a static instead of code then calling the function is UB.) Fixes rust-lang/miri#3581 by making this not UB. Cc `@nikic`
…try> codegen: do not set attributes on foreign function imports Currently we are setting all the usual attributes when translating an FFI import, like ```rust extern { pub fn myfun(x: NonZeroI32) -> &'static mut (); } ``` into LLVM. However, if the same function is imported multiple times as different items in Rust, that all becomes a single LLVM declaration. The attributes set on any one of them may then apply to calls done via other imported items. For instance, the import above, *even if it is never called*, can introduce UB if there is another part of the program that also imports `myfun`, calls it via that import, and passes 0 or `myfun` returns NULL. To fix that, this PR changes the function declarations we emit to all look like `declare `@fn()`.` The one exception are Rust's own allocator functions, which have a bunch of attributes that LLVM currently only honors in the declaration, not the call site -- though with llvm/llvm-project@1953629 we can maybe avoid that in the future. The hope is that this fixes rust-lang#46188. The only cases I can still think of where we emit different declarations with the same name and one of them "wins" are: - Rust's native allocation functions. Even then, the "wrong" declarations will just have no argument and return value so I don't think this should cause problems. - Two `extern static` with the same name but different type, or an `extern static` with the same name as an imported function. Again I doubt the "type of the static" will lead LLVM to deduce things about the function or vice versa, so probably the worst that can happen is LLVM crashes. (And of course if this static actually ends up resolving to a function, that's UB unless the static has size 0. And conversely, if it resolves to a static instead of code then calling the function is UB.) Fixes rust-lang/miri#3581 by making this not UB. Cc `@nikic`
…try> codegen: do not set attributes on foreign function imports Currently we are setting all the usual attributes when translating an FFI import, like ```rust extern { pub fn myfun(x: NonZeroI32) -> &'static mut (); } ``` into LLVM. However, if the same function is imported multiple times as different items in Rust, that all becomes a single LLVM declaration. The attributes set on any one of them may then apply to calls done via other imported items. For instance, the import above, *even if it is never called*, can introduce UB if there is another part of the program that also imports `myfun`, calls it via that import, and passes 0 or `myfun` returns NULL. To fix that, this PR changes the function declarations we emit to all look like `declare `@fn()`.` The one exception are Rust's own allocator functions, which have a bunch of attributes that LLVM currently only honors in the declaration, not the call site -- though with llvm/llvm-project@1953629 we can maybe avoid that in the future. The hope is that this fixes rust-lang#46188. The only cases I can still think of where we emit different declarations with the same name and one of them "wins" are: - Rust's native allocation functions. Even then, the "wrong" declarations will just have no argument and return value so I don't think this should cause problems. - Two `extern static` with the same name but different type, or an `extern static` with the same name as an imported function. Again I doubt the "type of the static" will lead LLVM to deduce things about the function or vice versa, so probably the worst that can happen is LLVM crashes. (And of course if this static actually ends up resolving to a function, that's UB unless the static has size 0. And conversely, if it resolves to a static instead of code then calling the function is UB.) Fixes rust-lang/miri#3581 by making this not UB. Cc `@nikic` try-job: x86_64-msvc try-job: i686-msvc
Turns out this is more tricky than expected -- some calling conventions put the number of arguments into the function name, so LLVM has to know the actual signature. Not sure if LLVM can be fixed to use the arguments at the call site for this, rather than the arguments at declaration site. But it does look increasingly like there's a lot of technical hurdles for importing the same name with different signatures into one module -- this isn't fundamentally anything that should be hard, it's just that LLVM (and cranelift) are making assumptions that make this hard. |
I intend to propose a t-lang design meeting about this problem. Here's the writeup for that: https://hackmd.io/cEi0aL6XSk6DDDASwpVH-w |
Thanks for writing that doc @RalfJung. We discussed during the lang team meeting and my proposal to make incremental progress (which the rest of the team seemed roughly aligned with) is the following:
Does this seem like a reasonable path forward? If so, maybe we can put off using a design meeting for this. The details of the compatibility rules are of course the difficult part, but I'm not sure if T-lang will be very helpful in working those out. I think that has to be informed by a combination of implementation details, the backends we use, the targets we support, and the use cases we discover in the wild. It seems to me that this is a complex enough space that it has to be worked out over time instead of designed all at once. Put another way, if the question is "what definitions of compatibility are we prepared to accept", my answer would be roughly "whatever is most likely to catch actual bugs while staying out of the way in places it isn't helping". I see some value in getting the lang team to sign off on that as a general direction, but doubt we need a full design meeting to do so. |
@tmandry I agree the exact compatibility rules need to be figured out "on the ground". I was mostly asking for t-lang input on the bigger conceptual questions, like -- should we lint/error against the cross-crate case of this? We generally spend a lot of effort to ensure that two different crates that build independently can always be linked together further downstream (e.g., the orphan rule). But here we'd introduce exactly the kind of thing the orphan rule intends to prevent -- two crates work fine in isolation but cannot be used together. OTOH, that is the nature of this UB, so our hands a bit tied. But this seems like a fundamental enough point that I wanted to get some t-lang input. If it doesn't need a design meeting that's fine for me. :) |
Good point. I do agree with what you said here..
And specifically it's because there's a global namespace shared by all (public) symbols linked into a binary. I still think the best thing we can do for now is lint on it, letting the lints be allowed anywhere the conflicting crates are brought together, or at the final linked crate. Since it's allowable we can do this without thinking too hard about the implications, and consider them again if we decide to move to a hard error. |
Okay. :) So the next steps would be:
I don't know how to do the last step -- is there a way to iterate all imports from all crates in the current crate graph? |
This compiles and prints "p is not null and 0x0":
The problem is that we have two declarations of the "malloc" symbol, but LLVM uses a global namespace for these. So during codegen, the 2nd declaration we generate overwrites the first. In this case, the "ill-typed" malloc declaration (
bad::malloc
) comes last, up putting anonnull
attribute onmalloc
, which causesmod good
to be miscompiled.Here's another example that does not involve
malloc
. It does not get miscompiled currently, but it demonstrates the issue.See here for a document that summarizes why this happens, and what our options are.
The text was updated successfully, but these errors were encountered: