-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unnecessary copy when constructing arrays from returned arrays? #62446
Comments
What you're looking for is commonly known as "named return value optimization". It's currently not implemented in rust but progress is being tracked in #32966. In the meantime, if you are willing to take an out-param ( |
LLVM isn't very good at optimizing away memcpy's. I think it would be possible to eliminate one of the memcpy's by extending call slot optimization to work with a destination that is a GEP of alloca. |
Together with a few other changes, https://reviews.llvm.org/D89623 will remove the last memcpy in g1 and g4. It also removes one of the memcpys in g3. g2 would require mutable noalias to be reenabled. |
On beta we generate 1, 2, 2, 1 memcpys. On nightly we generate 1, 2, 1, 0 memcpys. With Not sure why the one memcpy in the first function is still there. It gets dropped when I run |
For
The problem is that we have a clobbering |
The memcpy in the first example should be eliminated by llvm/llvm-project@9080444 and llvm/llvm-project@2902bde (possibly the first one isn't needed) by side-stepping the phase ordering issue. MemCpyOpt happened to implement partial DSE for this particular case already. |
After #87570 we're now down to 0, 0, 1, 0 memcpys. |
I think this is about as fixed as it's going to get, but it may be worthwhile to add some codegen tests for it. |
This now regresses in first 3 cases: https://godbolt.org/z/W6ExP7hdc |
I believe the regression in g2 is actually a correctness fix in that we don't have consensus that the optimization would be valid (cc @RalfJung for another example where we need "spurious store" on |
What's the g2 you are referring to? Some comment above also mentions a g1 and a g4. But the issue description only has a Anyway I added this to rust-lang/unsafe-code-guidelines#133. |
@RalfJung It refers to the godbolt links, e.g. https://godbolt.org/z/W6ExP7hdc. |
Okay so the issue description is extremely outdated then. That's quite confusing when coming anew to this issue. |
I want to construct a big array by concatenating smaller arrays returned by other functions. As a simple example:
Rust nightly generates a call to memcpy.
Is there a way to prevent this memcpy? Am I missing obvious other way to write this function?
Of course one could rewrite the called function
f
to take a&mut [u64]
instead of returning the array, but that removes compile-time checks on the length and introduces bounds checks. Using&mut [u64;40]
as an "out" argument solves that problem, but then I don't see a safe way to get two&mut [u64;40]
into[u64;80]
without usingtransmute
.(Background: I'm implementing the XMSSMT hash-based signature in Rust, which involves concatenating lots of hashes. The usual Rust hash library returns an array (actually a
GenericArray
) instead of using a&mut [u64;...]
parameter which led me to believe that the copy could be optimised away.)The text was updated successfully, but these errors were encountered: