-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recycle storage after move #61849
Comments
There's further discussion in #59123 (comment), I wish we could just fork the thread. |
@eddyb writes
Yes, I'd find it rather strange to have this encoded in the Perhaps surprisingly, I have much less strong opinions about surface language semantics than about MIR semantics.^^ We'd still want to make sure of course to make that translation not too weird -- magic UB-inducing MIR instructions like |
Right, the confusing thing was that all along I've assumed you were against I consider the MIR more flexible and even its "place-oriented non-SSA CFG" nature to be subject to change, whereas the surface-level semantics to be the important thing to discuss, UB-wise. In the short term, to keep things simple, I wouldn't mind a Long-term, if we go in this direction, I would expect us to be shrinking storage ranges, although I'm not sure what to do about the simple notion that a |
Oh I see. No I am not necessarily opposed to forbid unsafe code from doing that. I just think the way we forbid it should be as clean and principled as we can make it.
That's fair. On the other hand, I think surface language semantics of a language as complex as Rust is best explained by elaboration into some simpler language. And I think an idealized version of MIR is a reasonable candidate for the target language of this elaboration. |
…, r=matthewjasper Don't store locals that have been moved from in generators This avoids reserving storage in generators for locals that are moved out of (and not re-initialized) prior to yield points. Fixes rust-lang#59123. This adds a new dataflow analysis, `RequiresStorage`, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are: 1. StorageLive(x) => mark x live 2. StorageDead(x) => mark x dead 3. If a local is moved from, _and has never had its address taken_, mark it dead 4. If (any part of) a local is initialized, mark it live' This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout. Here's the size in bytes of all testcases included in the change, before and after the change: async fn test |Size before |Size after -----------------|------------|---------- single | 1028 | 1028 single_with_noop | 2056 | 1032 joined | 5132 | 3084 joined_with_noop | 8208 | 3084 generator test |Size before |Size after ----------------------------|------------|---------- move_before_yield | 1028 | 1028 move_before_yield_with_noop | 2056 | 1032 overlap_move_points | 3080 | 2056 ## Future work Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, _**and either has never had its address taken, or is Freeze and has never been mutably borrowed**_, mark it dead." This was discussed at length in rust-lang#59123 and then rust-lang#61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully. A more immediate priority for me is inlining `std::mem::size_of_val(&x)` so it becomes apparent that the address of `x` is not taken. This way, using `size_of_val` to look at the size of your inner futures does not affect the size of your outer future. cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc
…jasper Don't store locals that have been moved from in generators This avoids reserving storage in generators for locals that are moved out of (and not re-initialized) prior to yield points. Fixes #59123. This adds a new dataflow analysis, `RequiresStorage`, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are: 1. StorageLive(x) => mark x live 2. StorageDead(x) => mark x dead 3. If a local is moved from, _and has never had its address taken_, mark it dead 4. If (any part of) a local is initialized, mark it live' This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout. Here's the size in bytes of all testcases included in the change, before and after the change: async fn test |Size before |Size after -----------------|------------|---------- single | 1028 | 1028 single_with_noop | 2056 | 1032 joined | 5132 | 3084 joined_with_noop | 8208 | 3084 generator test |Size before |Size after ----------------------------|------------|---------- move_before_yield | 1028 | 1028 move_before_yield_with_noop | 2056 | 1032 overlap_move_points | 3080 | 2056 ## Future work Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, _**and either has never had its address taken, or is Freeze and has never been mutably borrowed**_, mark it dead." This was discussed at length in #59123 and then #61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully. A more immediate priority for me is inlining `std::mem::size_of_val(&x)` so it becomes apparent that the address of `x` is not taken. This way, using `size_of_val` to look at the size of your inner futures does not affect the size of your outer future. cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc
To further elaborate on the kinds of optimizations we could do with this.. If a local
Then we could optimize away its storage after it's moved from. Rust will enforce that any borrow is dead after the move, of course, and transmuting
|
(1) follows from (2). But my concern is that we should specify (2) in an operational way -- that is, we can't just say "the address may change", we have to describe the operational behavior that causes it to change. Something like |
Yes, agreed. In theory, this should also help us answer the question of "who are we breaking with this change?" (Ideally no one, but probably someone.) In practice, though, it doesn't seem feasible to run miri on crater because of slowness. Maybe if we could produce a smaller list of crates likely to break. All crates that use
|
I'd like to propose adding something like
We can also go back and add optimizations for non-generator code, but that's a step we would want to be more careful about. First we would add these This would create a nice "ratcheting" of semantics by helping us keep our options open for generators today, and putting out a canary for non-generator code in the future. The only problem I see with it is, there are certain unsafe patterns (which should be rare in practice, if they exist at all) which will pass validation outside of generators but fail inside of them. We'd like to discourage any of these patterns anyway, so I don't see this as a big problem. @rust-lang/wg-async-await |
@tmandry the semantics would be the same as
So you plan to only insert this for generators? That's not great, but I can understand why. |
Sorry this was unclear. The main reason to introduce a new kind of statement it is so that it can be ignored in codegen, and any MIR passes which wish to be conservative. Then over time we can start using it in more passes. Specifically I don't want LLVM relying on this yet, since it could break unsafe code in the wild. Performance is a secondary benefit. |
What kind of code would be broken by this? A concrete example would be nice, ideally with some indication of whether anything like this has been seen in the wild. |
Any code which does something like this: let x = String::from("hello");
let x_ptr = &mut x as *mut String;
std::mem::drop(x);
unsafe { *x_ptr = String::from("goodbye"); } I suppose reading the bytes would be too, but I think that's already UB. I don't know of any code which does this in the wild, but it may well exist. I don't know of an easy way to check for it today.
After a local is moved from. Today, just in generators (both to be conservative, and because adding it elsewhere is likely to regress performance). This is basically your suggestion to emit |
I see, makes sense. Having Miri test for this (everywhere, not just in generators) would be a good first step towards finding at least some uses in the wild, I guess. |
I do think it'd be great to know whether this pattern exists in the wild, but I also feel pretty strongly that we are going to want to call it UB in the end. It seems like this comes up fairly regularly when discussing possible optimizations. In any case, I'm in favor of moving forward with steps towards allowing us to recycle memory, but I don't really think this has to be done before/after async-await stabilizes, as I think this kind of code falls pretty squarely into the "grey area" of unsafe code that could well become invalid. Do we know of even one real instance of such code? |
In case the question was directed at me, I am totally fine making that code UB. I'd just hope we can have it fail in Miri before we do optimizations based on it. ;) |
In terms of recycling storage, I learned some strange things about LLVM:
That would mean that even with |
Code generation of StorageRefresh as a pair of lifetime.end and lifetime.start in LLVM makes it possible to detect use after move with AddressSanitizer and MemorySanitizer. That would be quite useful to canvas the usage in the wild and detect bugs. AddressSanitizer detecting use after move
#[derive(Debug)]
struct Wrapper<T: ?Sized>(T);
fn main () {
let mut w = Wrapper("A");
let p: *const _ = &w as *const _;
println!("{:?}", unsafe { &*p});
drop(w);
println!("{:?}", unsafe { &*p});
} Running `target/x86_64-unknown-linux-gnu/debug/use-after-move`
Wrapper("A")
=================================================================
==184053==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe619d05c0 at pc 0x563fd541589f bp 0x7ffe619ce700 sp 0x7ffe619ce6f8
READ of size 8 at 0x7ffe619d05c0 thread T0
#0 0x563fd541589e in <&str as core::fmt::Debug>::fmt library/core/src/fmt/mod.rs:2033:71
...
#13 0x563fd4f65f9d in use_after_move::main src/main.rs:9:5
...
Address 0x7ffe619d05c0 is located in stack of thread T0 at offset 384 in frame
#0 0x563fd4f65a0f in use_after_move::main src/main.rs:4
This frame has 9 object(s):
[32, 40) '_34' (line 9)
[64, 72) '_32' (line 9)
[96, 112) '_31' (line 9)
[128, 176) '_24' (line 9)
[208, 216) '_16' (line 7)
[240, 248) '_14' (line 7)
[272, 288) '_13' (line 7)
[304, 352) '_6' (line 7)
[384, 400) 'w' (line 5) <== Memory access at offset 384 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope library/core/src/fmt/mod.rs:2033:71 in <&str as core::fmt::Debug>::fmt Potential utility for LLVM stack coloring to reduce stack usage is a different question. As far as I can see, in the current implementation of stack coloring, such a pair of intrinsics doesn't shorten a live range. In fact, it could do the opposite. If an alloca is used with more than a single pair of lifetime intrinsics, the alloca is considered to be "degenerate" and processed in more conservative fashion. The effective live range of "degenerate" alloca starts with lifetime.start, while for non-"degenerate" allocas it is shortened towards the first use (which itself seems hard to explain operationally, but that's for another discussion). |
We should experiment with "re-allocating" storage for a local after it's moved from if it gets re-initialized. This would enable more optimizations, but could have some potential fallout.
EDIT: See this comment for further explanation of what kinds of optimizations this would enable.
Quoth @RalfJung, from #59123 (comment):
It is possible to do a subset of the optimization discussed in #59123 without this, but this would help cover more cases, in this optimization and others.
cc @cramertj @eddyb @nikomatsakis @RalfJung
The text was updated successfully, but these errors were encountered: