-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MaybeUninit seems to prevent RVO in even the most trivial cases. #90595
Comments
https://godbolt.org/z/eq45ejr8T via @Thinkofname observes that the |
Is this possibly the same as https://bugs.llvm.org/show_bug.cgi?id=47114 |
Another similar, albeit not identical, issue with |
This optimization is illegal on the LLVM level because |
Isn't unwinding in the foreign -> Rust direction UB? At the very least, Rust's unwind tables are set up (last time I checked, I could be wrong) so that unwinding through Rust jumps to a handler that aborts. Seems like this is a pessimization accounting for a case that cannot happen. ( Edit: hang on, how is this even disallowed at the IR level in the first place? Not only does it optimize correctly in the non-MaybeUninit case, but it also does so when using This is the IR in question but where I've just swapped out MaybeUninit for Foo, and then passed through I'm not sold on unwinding being the real culprit here, but it's not clear whether rustc is emitting poor IR or LLVM is doing something a posteriori dumb. (I spent a while fighting with Alive2 to decide whether or not I believed unwinding was the problem and decided to go to sleep instead.) |
It is UB, but currently only behind the
If the function first writes to its argument and then unwinds, without the call slot optimization the write would not be visible. With the optimization the write becomes visible. Now, in this particular case the destination is an |
Ah! That makes sense. I figured that because it was marked Still doesn't explain why the version without MaybeUninit doesn't have this problem, though. I didn't read the -O0 IR too closely, but perhaps rustc emits sufficiently different IR for that function that it winds up not mattering? Or is this just a matter that the |
Without MaybeUninit some MIR optimization takes care of this (try |
I started an llvm-dev thread on the topic: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154164.html Unfortunately, while looking into this I found a major bug in call slot optimization: The check introduced in llvm/llvm-project@703e488 does not look through pointer casts, and thus rarely triggers in practice. I'm afraid that once this is fixed (or once we have opaque pointers) this may commonly block the optimization for rust in practice. |
There is a regression with nightly compared to 1.64, the memcpy is back with |
Goes away with |
Upstream patch: https://reviews.llvm.org/D135886 |
The aforementioned regression (for panic=abort) is fixed with the LLVM 16 upgrade on nightly. |
…t, r=nikic Add codegen test for RVO on MaybeUninit Codegen test for rust-lang#90595. Currently, this only works with `-Cpanic=abort`, but hopefully in the [future](https://www.npopov.com/2024/01/01/This-year-in-LLVM-2023.html#writable-and-dead_on_unwind) it should also work in the presence of panics. r? `@nikic`
…t, r=nikic Add codegen test for RVO on MaybeUninit Codegen test for rust-lang#90595. Currently, this only works with `-Cpanic=abort`, but hopefully in the [future](https://www.npopov.com/2024/01/01/This-year-in-LLVM-2023.html#writable-and-dead_on_unwind) it should also work in the presence of panics. r? ``@nikic``
Rollup merge of rust-lang#119555 - Kobzol:maybeuninit-rvo-codegen-test, r=nikic Add codegen test for RVO on MaybeUninit Codegen test for rust-lang#90595. Currently, this only works with `-Cpanic=abort`, but hopefully in the [future](https://www.npopov.com/2024/01/01/This-year-in-LLVM-2023.html#writable-and-dead_on_unwind) it should also work in the presence of panics. r? ``@nikic``
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. Fixes rust-lang#90595.
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that: * The return slot is writable. * The return slot will not be read if the function unwinds. Fixes rust-lang#90595.
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that: * The return slot is writable. * The return slot will not be read if the function unwinds. Fixes rust-lang#90595.
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that: * The return slot is writable. * The return slot will not be read if the function unwinds. Fixes rust-lang/rust#90595.
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that: * The return slot is writable. * The return slot will not be read if the function unwinds. Fixes rust-lang/rust#90595.
I tried this code:
The generated assembly at -Oz on x86 is:
Observe that Rust (or LLVM, as the case may be) fails to RVO
x
when usingMaybeUninit
, ironically having potentially worse performance than the one that callsmemset
.Complete godbolt example: https://godbolt.org/z/c1ccf7WeK
Here is the pertinent optimized IR:
Skimming the -O0 IR doesn't help much, since Rust needs to generate a whole mess of code in the MaybeUninit version. The thing that's very mysterious to me is that, somehow, the
bitcast
in@_ZN7example3Foo15new_from_uninit17h20aebee91382058eE
is acting as a barrier for merging%x
with the return slot%0
??This seems like an LLVM bug but filing here first in case it's bad IR codegen on rustc's part.
The text was updated successfully, but these errors were encountered: