-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix weak memory emulation to avoid generating behaviors that are forbidden under C++ 20 #4057
Conversation
093d86f
to
55a7f3e
Compare
src/concurrency/weak_memory.rs
Outdated
//! that C++20 intended to copy (<https://plv.mpi-sws.org/scfix/paper.pdf>); a change was | ||
//! introduced when translating the math to English. According to Viktor Vafeiadis, this | ||
//! difference is harmless. So we stick to what the standard says, and allow fewer behaviors.) | ||
//! - SC fences are treated like RMWs to a global clock, to ensure they induce enough |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More specifically, they are treated as AcqRel RMWs right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point, I should say that.
My main reservation here is that I believe this is effectively the same SC fence treatment as proposed by Promising Semantics. And previously (when I remembered all of these stuff better than I do now) I said #2301 (comment)
There could be an example involving both SC fences and SC accesses that exhibits C++20-forbidden behaviour. I think it's worth asking Viktor and Michalis what they think. I'll try to come up with such an example as well. Though in any case, I'm happy for this to be merged, but without promising that we're fully C++20-compatible for programs that have both SC fences and SC accesses |
Thanks for taking a look!
When I asked them, they said this should be enough, but I didn't ask specifically about the interactions. If a thread does first an SC fence and then an SC access, they do interact via the local clock after all, so in my understanding that should be enough. We're generally not promising to be bug-free so I think this should suffice. I can ask Michalis specifically about this again next time I see him. |
Note that SC fence also serve as a release+acquire fence, so you might need a bit more. |
We do have this: miri/src/concurrency/data_race.rs Lines 870 to 893 in 4e7f53a
But it doesn't quite have the right order, compared to the paper. I guess we have to move the release fence to after the SC logic. Is it possible to construct an example that is broken due to this? |
perhaps this one:
with preceding F-rel the behavior is allowed |
To my great surprise Miri does indeed produce that behavior. :) I didn't think we could be that weak. And #4076 fixes it. Thanks a lot! While you are here @orilahav , could you quickly confirm whether there is anything particular about the interaction of SC fences and SC accesses that we need to be worried about? Like, are they in some sort of global order whose effect changed with C++20? |
I'm not sure what you mean exactly.
Their axioms are similar to what we mentioned in the RC11 paper to be too strong Hans Boehm knows about this issue, and I belive it is planned to be fixed at some point.
disallowed in RC11 But:
racy in RC11
allowed in RC11 I hope my examples are correct. If you draw conclusions from them, we need to check more carefully :) |
Oh gosh I didn't realize it was so bad.^^
What I mean is this: we are looking for a model (supporting all fences and all accesses) with the following properties:
Our starting point is this paper. However, C++20 has forbidden some behaviors that were previously allowed, so this violates the first property above. To fix that we have done two changes:
The question is, is this enough? We changed SC accesses and SC fences, but maybe the interplay of the two changed in more subtle ways in C++20 and that could still lead to us generating behaviors that C++20 does not allow. |
I suspect the first example in my previous comment shows a problem in your approach, right? (More generally speaking, I don't really understand why one should care about C++20 when we know it is broken.) |
I have a hard time understanding bare examples like that, so the risk that I may be able to draw my own conclusions from them is rather low. ;)
That looks exactly like the kind of example @cbeuw was worried about, thanks! I can this test 1000 times and never saw the bad behavior. So I think we are fine. I'll have to figure out why, though...
Because it's the model Rust officially subscribes to... because developing our own model is hard and also we need to be able to share memory with C/C++ programs. |
The first example is prevented in our implementation: miri/src/concurrency/weak_memory.rs Lines 346 to 352 in 0603597
|
So here's the explanation:
|
IIUC, this check was already present in the original POPL paper? #2512 only changed the comments here, not the logic, as far as I can tell. |
This is a strange reason in my mind. C++20 compilation to x86 is currently wrong!
Then, you do more than the RMW encoding you mentioned, bcs an RMW (+ rel/acq fences) won't do that (right?). |
Here are more complicated examples to try (both disallowed in RC11 IIUC):
|
This is a strange reason in my mind. C++20 compilation to x86 is currently wrong!
That's much more important than unsoundness in Miri, and there is no reason to copy wrong C++20 to Miri.
You can use RC11 instead of C++20.
Sorry, I should have clarified: this is not a Miri decision. Rust documents that we use the C++20 memory model. Miri should match that.
Maybe we should track this x86 problem somewhere... Is there a document or blog post or so that describes the problem?
Then, you do more than the RMW encoding you mentioned, bcs an RMW (+ rel/acq fences) won't do that (right?).
In that case it's hard for me to judge correctness, bcs it is not clear to me what is the declarative model your algorithm follows.
Yeah, we also do whatever the POPL17 paper mentioned above does.
Sadly I do not know of a well-documented operational model we could just use, so we are patching that POPL17 model instead. And until the C++ model situation is resolved I'd rather not spook crate authors with alerts I cannot explain...
|
This is what I was able to track: |
Miri seems to not generate either of them. I think I understand the first of these tests (it checks the combination of SC accesses/fences being ordered with a release-acquire pair), but I don't get the second one. The lonely relaxed store in thread 3 is particularly confusing. Could you explain a bit what kind of a situation this test is targeting? |
Okay I think I understand why that execution is impossible, I added some comments in #4090 (the test name there is |
After discussion with Viktor Vafeiadis and Michalis Kokologiannakis, we concluded that the only thing Miri still needs to do to be sound wrt C++20 is treat SC fences as RMWs to a global location -- i.e., have a global unique clock that each RMW releases to and acquires from. This is unnecessarily strong (it rules out some behaviors that would be legal), but it is currently unclear how to do better, and we still pass all our tests for generating weak behaviors so hopefully we'll still get good test coverage for other cases.
@cbeuw does this make sense to you?
Fixes #2301