-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-124715: trashcan protection for all GC objects #124716
base: main
Are you sure you want to change the base?
Conversation
These calls in tp_dealloc methods now need to be removed since _Py_Dealloc always does the untrack for GC objects
Since PyObject_CallFinalizerFromDealloc() re-tracks, this is mostly not needed.
_PyTrash_thread_destroy_chain(tstate); \ | ||
} \ | ||
} while (0); | ||
assert(1); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this assert(1)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the resulting code and not the diff, there is a comment right above it:
The assert() is present to handle the case that a 'goto' statement precedes Py_TRASHCAN_END.
I think maybe only some compilers complain about it but assert(1)
seemed like an easy fix.
if (!_PyObject_GC_IS_TRACKED(self)) { | ||
_PyObject_GC_TRACK(self); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to rely on this? The Py_DECREF
used in the interpreter does not call _Py_Dealloc
:
Lines 87 to 88 in 35541c4
destructor dealloc = Py_TYPE(op)->tp_dealloc; \ | |
(*dealloc)(op); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good catch. I'd say it's not safe given that. I believe Py_DECREF
used to always call _Py_Dealloc
but that's no longer the case. Adding my trashcan logic to this code path is going to be a bit of a perf hit (mostly due to the IS_GC check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've fixed this by having the Py_DECREF macro in ceval.c call _Py_Dealloc
. That might be too much of a perf hit though. A bit cheaper would be to do the GC untrack if required but not check the trashcan depth from that macro.
Calling tp_dealloc directly will bypass the trashcan protection and will not preserve the requirement that GC objects are untracked before tp_dealloc is called.
This is an update of my GH-89060 PR, ported to current 'main' branch. The issue and fix are still the same as when the original Roundup bug was created: https://bugs.python.org/issue44897
Integrate the functionality of Py_TRASHCAN_BEGIN and Py_TRASHCAN_END into
_Py_Dealloc
. There are a few advantages:all container objects have the risk of overflowing the C stack if a long reference chain of them is created and then deallocated. So, to be safe, the tp_dealloc methods for those objects should be protected from overflowing the stack.
the Py_TRASHCAN_BEGIN and Py_TRASHCAN_END macros are hard to understand and a bit hard to use correctly. Making the mechanism internal avoids confusion. The code can be slightly simplified as well.
This is a conceptually simple change but there are some subtle issues to take care with. Previous to this change, tp_dealloc was called with the object tracked. After this change, _Py_Dealloc will automatically call untrack for you.
For types using the internal API, they are changed to not call
_PyObject_GC_UNTRACK()
. It's not safe to call that twice. For the public API, callingPyObject_GC_UnTrack()
multiple times is safe and so types that use that API in their tp_dealloc should work without change.Another subtle issue is that if an object ends up resurrected inside tp_dealloc, the object should end up being tracked again. I've changed
PyObject_CallFinalizerFromDealloc()
to do that.This still has the risk of breaking 3rd party extensions. E.g. if they assert that the object is tracked when entering their tp_dealloc, that will now fail. I think that risk is worth the extra robustness Python will gain by having this protection for all GC types.