Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-124715: trashcan protection for all GC objects #124716

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

nascheme
Copy link
Member

@nascheme nascheme commented Sep 27, 2024

This is an update of my GH-89060 PR, ported to current 'main' branch. The issue and fix are still the same as when the original Roundup bug was created: https://bugs.python.org/issue44897

Integrate the functionality of Py_TRASHCAN_BEGIN and Py_TRASHCAN_END into _Py_Dealloc. There are a few advantages:

  • all container objects have the risk of overflowing the C stack if a long reference chain of them is created and then deallocated. So, to be safe, the tp_dealloc methods for those objects should be protected from overflowing the stack.

  • the Py_TRASHCAN_BEGIN and Py_TRASHCAN_END macros are hard to understand and a bit hard to use correctly. Making the mechanism internal avoids confusion. The code can be slightly simplified as well.

This is a conceptually simple change but there are some subtle issues to take care with. Previous to this change, tp_dealloc was called with the object tracked. After this change, _Py_Dealloc will automatically call untrack for you.

For types using the internal API, they are changed to not call _PyObject_GC_UNTRACK(). It's not safe to call that twice. For the public API, calling PyObject_GC_UnTrack() multiple times is safe and so types that use that API in their tp_dealloc should work without change.

Another subtle issue is that if an object ends up resurrected inside tp_dealloc, the object should end up being tracked again. I've changed PyObject_CallFinalizerFromDealloc() to do that.

This still has the risk of breaking 3rd party extensions. E.g. if they assert that the object is tracked when entering their tp_dealloc, that will now fail. I think that risk is worth the extra robustness Python will gain by having this protection for all GC types.

These calls in tp_dealloc methods now need to be removed since
_Py_Dealloc always does the untrack for GC objects
Since PyObject_CallFinalizerFromDealloc() re-tracks, this is mostly not
needed.
_PyTrash_thread_destroy_chain(tstate); \
} \
} while (0);
assert(1); \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this assert(1)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the resulting code and not the diff, there is a comment right above it:

The assert() is present to handle the case that a 'goto' statement precedes Py_TRASHCAN_END.

I think maybe only some compilers complain about it but assert(1) seemed like an easy fix.

Comment on lines +535 to +537
if (!_PyObject_GC_IS_TRACKED(self)) {
_PyObject_GC_TRACK(self);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to rely on this? The Py_DECREF used in the interpreter does not call _Py_Dealloc:

cpython/Python/ceval.c

Lines 87 to 88 in 35541c4

destructor dealloc = Py_TYPE(op)->tp_dealloc; \
(*dealloc)(op); \

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch. I'd say it's not safe given that. I believe Py_DECREF used to always call _Py_Dealloc but that's no longer the case. Adding my trashcan logic to this code path is going to be a bit of a perf hit (mostly due to the IS_GC check).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've fixed this by having the Py_DECREF macro in ceval.c call _Py_Dealloc. That might be too much of a perf hit though. A bit cheaper would be to do the GC untrack if required but not check the trashcan depth from that macro.

@rhettinger rhettinger removed their request for review September 30, 2024 23:15
Calling tp_dealloc directly will bypass the trashcan protection and will
not preserve the requirement that GC objects are untracked before
tp_dealloc is called.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants