-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FakeRead is semantically meaningful to miri (but gets optimized away) #97092
Comments
It should be fine for optimizations to remove UB, as long as Miri runs without said optimization. The UB will still occur in the AM version of the code, and Miri's ability to catch this UB depends on whether it runs before or after the optimization. |
Miri would (I think) currently ICE on |
After opening this issue, I was thinking that -- indeed -- I overstated the case, but indeed the point stands that a (I would also like to have some kind of sanitizer mode where we do UB detection, and in that mode, FakeRead wouldn't really be fake, but since that mode doesn't exist outside of miri, it's ok to optimize them away.) |
Updated title / OP. |
The issue title makes it sound like you're trying to say that optimizations which remove FakeRead are buggy, but I don't think that's the case based on the other comments here? Then your latest comment makes it sound like this issue is actually a call for separating MIR for analysis from MIR for code generation. I guess I'm just confused because in an issue with only 5 comments it seems like the consensus on what change would close the issue has taken a huge turn. Are you asking for a wholly separate IR? Or a compiler API to get an IR which can't be combined with optimizations? (because while Miri sets |
I think we should remove optimizations that hide UB at opt level 0. I assume that the only optimizations that run at opt level 0 are those that must run for the latter stages to not ICE, but ideally anything that hides UB should not be on that list. |
I think a user overriding the opt level in Miri is fine, as long as the user is aware that this might miss UB. But indeed by default Miri should be run on MIR without any UB-removing optimizations? |
So, I do agree that there is an issue here, but I don't think the issue is the one that is described. I for some reason forgot to include it in the MIR docs, but my working model of
|
At first I thought that perhaps there was confusion because of miri potentially processing things after optimization. But I think you are saying: what should the official operational semantics of a FakeRead be?
If so, it is *meant* to be equivalent to `&x` -- i.e., a logical read of all reachable content -- at least in terms of its borrow check equivalent (I'm not totally sure how stacked borrows manages that kind of thing at present).
There is also, I believe, a "shallow" variant of fake read too.
…On Sun, Jun 5, 2022, at 6:45 PM, Jakob Degen wrote:
So, I do agree that there is an issue here, but I don't think the issue is the one that is described. I for some reason forgot to include it in the MIR docs, but my working model of `FakeRead` has been that it is a nop at AM runtime. If that is the case, it is effectively just an analysis artifact and certainly sound to remove. However, this issue makes a good point - it seems like we want match exhaustiveness checking to be "sound" in the sense that in all match statements (in non-UB programs) accepted by the compiler, one of the arms must actually be taken. This leaves us with two problems:
1. `FakeRead` does need to have an actual behavior at runtime, but it's not clear to me what that behavior should be. It certainly can't be an actual read, because then we might go reading into an `UnsafeCell` and `Option<Mutex<u8>>::is_some` would cause a data race if another thread currently holds the lock. What we actually want is to somehow assert that the reference is still valid at that point, without doing any reading. Possibly a regular old SB retag would suffice, maybe we need to insert a protector?
2. As Ralf and I discovered on Zulip <https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/SB.20UnsafeCell.20.2B.20exhaustiveness.20check.20.2B.20layout.20opts> yesterday, even if we fix the above problem, shared references to enums are actually enough to change the active variant in some cases under current SB. This means that the above soundness theorem would be false even if `FakeRead` worked the way we wanted it to. This is proving to be somewhat hard to fix too. In any case, that should not necessarily block this issue, I did feel it was worth mentioning though.
—
Reply to this email directly, view it on GitHub <#97092 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABF4ZSFHHPJ6UP7CYKBMCDVNUUZ3ANCNFSM5WC5UE2A>.
You are receiving this because you authored the thread.Message ID: ***@***.***>
|
|
Yes, that is what I am saying.
So, straying a little bit off-topic, but what I was getting at above is that under SB a "logical read" and a I would like to add an additional note that I mentioned on Zulip once but I think deserves a little more permanance: I mentioned above that we will likely want to prove a theorem that says that the match exhaustiveness checking that we do, together with the aliasing model and match guards that we insert, are enough to guarantee that all matches are indeed exhaustive. If we want to prove such a theorem though (and have our language semantics rely on it), that means we are now imposing lower bounds on the strength of the aliasing model. In the conversation about allowing the 2PB future incompatibility lint, if I recall correctly it was decided that SB's inability to model it was non-blocking, because SB could always just do less strict alias analysis. However, if it is true that we require lower bounds on the strength of the aliasing model for the reasons described above, that means just doing less strict aliasing analysis is not always an option. I unfortunately cannot come up with a practical example here, and it may not be possible, but still I think this is a non-trivial concern. |
I am inclined to close this as being a part of rust-lang/unsafe-code-guidelines#399 -- it's less about FakeRead specifically and more about the op.sem of all the operations that |
Given this sample program:
the desugaring of the match intentionally adds some borrows when executing the guard:
The purposeof the
FakeRead
of_4
and_6
is to ensure that the discriminant does not change while the guard executes -- i.e., that the guard doesn't (via unsafe code, say) mutatex
to beNone
. But after optimizations those FakeReads are removed:this means that
foo()
could trigger writes without causing UB. This seems bad!UPDATE: This is overstating the case. It's ok for us to optimize FakeRead away, but tools like miri or some future sanitizers would still want to see them (for the reasons give above).
The text was updated successfully, but these errors were encountered: