Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Never return uninhabited values at all #59639

Merged
merged 1 commit into from
Apr 4, 2019

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Apr 2, 2019

Functions with uninhabited return values are already marked noreturn,
but we were still generating return instructions for this. When running
with -C passes=lint, LLVM prints:

Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about noreturn though:

This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an abort anywhere that would have tried to return an
uninhabited value.

Fixes #48227
cc #7463 #48229

r? @eddyb

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 2, 2019
@sanxiyn
Copy link
Member

sanxiyn commented Apr 3, 2019

@bors r+

@bors
Copy link
Contributor

bors commented Apr 3, 2019

📌 Commit fb575c0 has been approved by sanxiyn

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 3, 2019
Centril added a commit to Centril/rust that referenced this pull request Apr 3, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now mark such return values with a new `IgnoreMode::Uninhabited`, and
emit an `abort` anywhere that would have returned.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
bors added a commit that referenced this pull request Apr 3, 2019
Rollup of 5 pull requests

Successful merges:

 - #59076 (Include trailing comma in multiline Debug representation)
 - #59619 (wasi: Implement more of the standard library)
 - #59639 (Never return uninhabited values at all)
 - #59643 (std: Upgrade `compiler_builtins` to fix wasi linkage)
 - #59664 (Updated the documentation of spin_loop and spin_loop_hint)

Failed merges:

r? @ghost
Centril added a commit to Centril/rust that referenced this pull request Apr 3, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now mark such return values with a new `IgnoreMode::Uninhabited`, and
emit an `abort` anywhere that would have returned.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
@eddyb
Copy link
Member

eddyb commented Apr 3, 2019

@bors r- (please do not r+ PRs assigned to me without first trying to ping me on IRC or Discord PM)

@bors
Copy link
Contributor

bors commented Apr 3, 2019

😪 I'm awake I'm awake

@bors
Copy link
Contributor

bors commented Apr 3, 2019

💡 This pull request was already approved, no need to approve it again.

@bors
Copy link
Contributor

bors commented Apr 3, 2019

📌 Commit fb575c0 has been approved by eddyb

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 3, 2019
@eddyb
Copy link
Member

eddyb commented Apr 3, 2019

Oh bors is just extremely confusable.

@bors r-

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.
@cuviper cuviper force-pushed the ignore-uninhabited branch from fb575c0 to c2e0d7f Compare April 3, 2019 22:58
@cuviper
Copy link
Member Author

cuviper commented Apr 3, 2019

OK, I updated codegen_return_terminator to just check is_uninhabited() directly, instead of using a new IgnoreMode, as @eddyb suggested.

@cuviper cuviper mentioned this pull request Apr 4, 2019
@sanxiyn sanxiyn added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 4, 2019
@eddyb
Copy link
Member

eddyb commented Apr 4, 2019

Thanks!

@bors r+

@bors
Copy link
Contributor

bors commented Apr 4, 2019

📌 Commit c2e0d7f has been approved by eddyb

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 4, 2019
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Apr 4, 2019
@RalfJung
Copy link
Member

RalfJung commented Apr 4, 2019

So we are fine with not letting LLVM remove these branches entirely as being dead code? I recall @cramertj asking for more aggressive dead code removal around uninhabited types.

@sanxiyn
Copy link
Member

sanxiyn commented Apr 4, 2019

As can be seen on codegen tests, we generate unreachable for uninhabited types after this PR. This should allow LLVM to optimize. (Whether LLVM in fact optimizes is a different question though.)

Centril added a commit to Centril/rust that referenced this pull request Apr 4, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
Centril added a commit to Centril/rust that referenced this pull request Apr 4, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
@RalfJung
Copy link
Member

RalfJung commented Apr 4, 2019

We generate unreachable after llvm.trap. Since llvm.trap never returns and aborts the program in a defined way, I don't think the unreachable has any effect.

Centril added a commit to Centril/rust that referenced this pull request Apr 4, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
Centril added a commit to Centril/rust that referenced this pull request Apr 4, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
Centril added a commit to Centril/rust that referenced this pull request Apr 4, 2019
Never return uninhabited values at all

Functions with uninhabited return values are already marked `noreturn`,
but we were still generating return instructions for this. When running
with `-C passes=lint`, LLVM prints:

    Unusual: Return statement in function with noreturn attribute

The LLVM manual makes a stronger statement about `noreturn` though:

> This produces undefined behavior at runtime if the function ever does
dynamically return.

We now emit an `abort` anywhere that would have tried to return an
uninhabited value.

Fixes rust-lang#48227
cc rust-lang#7463 rust-lang#48229

r? @eddyb
@eddyb
Copy link
Member

eddyb commented Apr 4, 2019

@RalfJung Can you open an issue? It's not clear to me what effect this has, AFAIK we produce an unreachable instead of return (at the MIR level) in "monomorphically noreturn" cases.

bors added a commit that referenced this pull request Apr 4, 2019
Rollup of 8 pull requests

Successful merges:

 - #59470 (Document std::fs::File close behavior ignoring errors)
 - #59555 (update miri)
 - #59556 (update stdsimd)
 - #59596 (Forward formatter settings to bounds of `Range<T>` in `fmt::Debug` impl)
 - #59639 (Never return uninhabited values at all)
 - #59671 (Make some of lexer's API private)
 - #59685 (Add description for -Os and -Oz in rustc.1)
 - #59686 (Temporarily disable stack probing for gnux32.)

Failed merges:

r? @ghost
@bors bors merged commit c2e0d7f into rust-lang:master Apr 4, 2019
@cuviper
Copy link
Member Author

cuviper commented Apr 4, 2019

I'll happily investigate any bad codegen from this. I would think that any paths that would reach one of these really ought to be statically unreachable anyway, such that LLVM will still know that even the llvm.trap is dead code, but I'm curious to see real examples.

@cuviper
Copy link
Member Author

cuviper commented Apr 5, 2019

The nightlies before and after this change are:

$ rustc +nightly-2019-04-04 -V
rustc 1.35.0-nightly (f8673e0ad 2019-04-03)

$ rustc +nightly-2019-04-05 -V
rustc 1.35.0-nightly (53f2165c5 2019-04-04)

I looked at the noreturn examples from #48227 and #48229, and there is no difference in the optimized LLVM IR between the two nightlies -- the new nightly does prune the llvm.traps.

I also did a crude search for ud2 traps in the compiler's own libraries:

$ objdump -d nightly-2019-04-04-x86_64-unknown-linux-gnu/lib/*.so | grep -c 'ud2'
77021

$ objdump -d nightly-2019-04-05-x86_64-unknown-linux-gnu/lib/*.so | grep -c 'ud2'
77636

That's a fair increase, although we can't immediately tell if those are in performance sensitive areas, and they may just be from normal new code. (For more, 2019-04-03 had 76947, and beta 70510.)

Here is the perf report before and after the #59695 rollup PR that included this.

@cuviper cuviper mentioned this pull request Apr 5, 2019
@cuviper
Copy link
Member Author

cuviper commented Apr 5, 2019

@RalfJung

We generate unreachable after llvm.trap. Since llvm.trap never returns and aborts the program in a defined way, I don't think the unreachable has any effect.

It serves as the block terminator, if nothing else.

https://llvm.org/docs/LangRef.html#unreachable-instruction

This can be used to indicate that the code after a no-return function cannot be reached, and other facts.

@RalfJung
Copy link
Member

RalfJung commented Apr 8, 2019

The unreachable is never reached though (heh ;), so it has no effect. You are first calling trap, which is a well-defined function that never returns. Everything after it is dead code and hence does not matter.

I opened an issue at #59793 where I also demonstrate the effect of this.

@RalfJung
Copy link
Member

RalfJung commented Apr 8, 2019

the new nightly does prune the llvm.traps.

Which exact example did you use for this? I cannot reproduce.

@cuviper
Copy link
Member Author

cuviper commented Apr 8, 2019

The unreachable is never reached though (heh ;), so it has no effect. You are first calling trap, which is a well-defined function that never returns. Everything after it is dead code and hence does not matter.

Yes, but a call is not a terminator, so you still need something. As I quoted from the reference, this is a normal use case to have unreachable following a no-return function. If you end a function on a call void @llvm.trap, as I just tried with one of the examples below, you get an error, and even adding tail or musttail is not enough. So unreachable is sort of a no-op terminator in this case.

$ opt -lint -disable-output issue48227.ll
opt: issue48227.ll:2525:1: error: expected instruction opcode
}
^

Now, we could change this to just unreachable, without the trap, if that helps #59793.

Which exact example did you use for this? I cannot reproduce.

From #48227:

#![crate_type = "lib"]

pub fn lines<'a>(left: &'a str) {
    iter(left.lines());
}

fn iter<I, T>(left: I)
where
    I: Clone + Iterator<Item = T> + DoubleEndedIterator,
    T: PartialEq,
{
    let _left_count = left.clone().count();
}
$ rustc +nightly-2019-04-05 --emit=llvm-ir issue48227.rs && grep -C3 trap issue48227.ll
; Function Attrs: noreturn nonlazybind uwtable
define void @"_ZN50_$LT$T$u20$as$u20$core..convert..From$LT$T$GT$$GT$4from17h8496458b8d57254eE"() unnamed_addr #3 {
start:
  call void @llvm.trap()
  unreachable
}

--
declare zeroext i1 @"_ZN42_$LT$$u21$$u20$as$u20$core..fmt..Debug$GT$3fmt17h48bee2ab31956489E"({ [0 x i8] }* noalias nonnull readonly align 1, %"core::fmt::Formatter"* align 8 dereferenceable(96)) unnamed_addr #1

; Function Attrs: cold noreturn nounwind
declare void @llvm.trap() #7

; Function Attrs: nounwind nonlazybind uwtable
declare i32 @memcmp(i8*, i8*, i64) unnamed_addr #5

Optimization removes it:

$ rustc +nightly-2019-04-05 --emit=llvm-ir issue48227.rs -O && grep -C3 trap issue48227.ll

From #48229:

fn main() {
    (1 .. 9).filter(|_| true).sum::<u32>();
}
$ rustc +nightly-2019-04-05 --emit=llvm-ir issue48229.rs && grep -C3 trap issue48229.ll
; Function Attrs: noreturn nonlazybind uwtable
define internal void @"_ZN50_$LT$T$u20$as$u20$core..convert..From$LT$T$GT$$GT$4from17h0f0e4c14c5a910afE"() unnamed_addr #3 {
start:
  call void @llvm.trap()
  unreachable
}

--
declare void @_ZN4core9panicking9panic_fmt17h48d5842cb1c97579E(%"core::fmt::Arguments"* noalias nocapture dereferenceable(48), { [0 x i64], { [0 x i8]*, i64 }, [0 x i32], i32, [0 x i32], i32, [0 x i32] }* noalias readonly align 8 dereferenceable(24)) unnamed_addr #2

; Function Attrs: cold noreturn nounwind
declare void @llvm.trap() #8

; Function Attrs: nonlazybind
define i32 @main(i32, i8**) unnamed_addr #9 {

Optimization removes it:

$ rustc +nightly-2019-04-05 --emit=llvm-ir issue48229.rs -O && grep -C3 trap issue48229.ll

@cuviper
Copy link
Member Author

cuviper commented Apr 8, 2019

FWIW, both of those example traps come from the blanket impl<T> From<T> for T for T = !, which comes from implementing Iterator::fold as an infallible try_fold:

fn fold<B, F>(mut self, init: B, mut f: F) -> B where
Self: Sized, F: FnMut(B, Self::Item) -> B,
{
self.try_fold(init, move |acc, x| Ok::<B, !>(f(acc, x))).unwrap()
}

@RalfJung
Copy link
Member

RalfJung commented Apr 8, 2019

Ah, those are truly unreachable traps get removed -- LLVM statically proves that that line cannot be reached. That has nothing to do with the unreachable that comes after them.

a call is not a terminator, so you still need something. As I quoted from the reference, this is a normal use case to have unreachable following a no-return function.

I see. That's a bit weird but whatever, if that's how LLVM works that's fine for me. ;) My point was that the unreachable here has no codegen effect, it could be replaced by any other terminator. In particular, this does not allow the optimizer to assume that the trap never happens, and use that to eliminate dead control flow branches or so. What you see happen in these examples (I think) is LLVM being able to use other information elsewhere to conclude that these are dead branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants