-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-117657: Fix small issues with instrumentation and TSAN #118064
Conversation
78f1b64
to
1288090
Compare
@@ -588,9 +589,10 @@ de_instrument(PyCodeObject *code, int i, int event) | |||
return; | |||
} | |||
CHECK(_PyOpcode_Deopt[deinstrumented] == deinstrumented); | |||
*opcode_ptr = deinstrumented; | |||
FT_ATOMIC_STORE_UINT8_RELAXED(*opcode_ptr, deinstrumented); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is relaxed
strong enough? Do any consumers need to see the updates that are sequenced-before this store? Same comment applies to the below instances of using relaxed stores to update bytecode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the deinstrumented case if we race and still see the instrumented opcode the data will still be there. In the instrumented case I think we do need some more synchronization around the tooling metadata, if we see the instrumented opcode before seeing the writes to the data that'll be bad.
I had originally had synchronizeds reads/writes around that in d8ea6bb#diff-adaefb7da847260ef7aff6e66007bc83f5b226c5389a23e1c9ea622c3b02c419 but the thought was that the synchronization around _PyFrame_GetCode(frame)->_co_instrumentation_version
would be sufficient (which is in the latest version of #116775).
But that doesn't actually seem to be good enough - there's still some rare TSAN failures around the instrumentation data. But my plan is to push the synchronization there instead of on the opcode because I don't think we want to do acquire loads on the byte code which would be required to make a release write here meaningful.
1288090
to
d715d18
Compare
Apologies for my cluelessness. Does this have any effect on a non-free-threading build? Or do those macros do nothing in that case? |
The macros do nothing in the non-free-threaded build (i.e., they are just plain C load/stores). |
d715d18
to
5493306
Compare
5493306
to
2429aaf
Compare
…on#118064) Small TSAN fixups for instrumentation
- Fix a few places where we were not using atomics to (de)instrument opcodes. - Fix a few places where we weren't using atomics to reset adaptive counters. - Remove some redundant non-atomic resets of adaptive counters that presumably snuck as merge artifacts of python#118064 and python#117144 landing close together.
There's some races around reading and updating bytecodes with instrumentation. This switches to using atomic operations as these races are all benign.