-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. #20803
bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. #20803
Conversation
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 867427e269564d3b16b7c5580343556cbfdcad7c 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Include/cpython/frameobject.h
Outdated
FRAME_RETURNED = 1, | ||
FRAME_RAISED = 2, | ||
FRAME_CLEARED = 3 | ||
} PyFrameState; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this default to the size of an int, rather than a char? I'm not yet sure whether the previous explicit sizing on some variables was actually useful, or just lost to alignment, but it seems like a signed char should still be more than enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to a signed char
Objects/frameobject.c
Outdated
/* Forbid jumps upon a 'return' trace event (except after executing a | ||
* YIELD_VALUE or YIELD_FROM opcode, f_stacktop is not NULL in that case) | ||
* YIELD_VALUE or YIELD_FROM opcode) | ||
* and upon an 'exception' trace event. | ||
* Jumps from 'call' trace events have already been forbidden above for new | ||
* frames, so this check does not change anything for 'call' events. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "above" refer to the code just deleted, and no longer visible to future readers? Can this comment be reworded to something with fewer separate branching conditions, such as "Jumps can only be made via a trace function triggered by a line trace event on a currently executing or suspended frame" I am assuming that this is not actually a new restriction that needs to be documented, but that is an assumption; I won't be shocked if I missed something outside the diff.
"can only jump from a 'line' trace event"); | ||
return -1; | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a "call" trace also a "line" trace? If not, it would make sense to combine the 4 invalid states into a single error message, and it might make sense to include the invalid state as an exception attribute, or even as part of the string. if that is much easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No a "call" trace is not a "line" trace.
case FRAME_EXECUTING: | ||
case FRAME_SUSPENDED: | ||
/* You can only do this from within a trace function, not via | ||
* _getframe or similar hackery. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this restriction? Is it just because "better safe than sorry, and we couldn't add the restriction later"? Is it an intentional limitation to be kind to PyPy and optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It a pre-existing restriction. There is no change to behavior here.
} | ||
f->f_stackdepth = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost want to change this to take advantage of f_stackdepth already being an integer count of the number of times to loop, but maybe that is too clever for debugging, and should be left to the compiler.
while (f->f_stackdepth--) Py_XDRECREF(f->f_valuestack[f->f_stackdepth]) ;
seems not worse, but if I missed something by not testing, that argues for not making similar changes. :{
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should leave optimizing the loop to the compiler, although using idiomatic forms usually helps the compiler.
The two forms are equivalent.
The standard for
loop from 0 to the limit is idiomatic C and easier to read.
err = gen_close_iter(yf); | ||
gen->gi_running = 0; | ||
gen->gi_frame->f_state = state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the previous state was not FRAME_SUSPENDED, what would that mean, and should the generator really return to that state? (applies several places) I had sort of thought that the only backwards transitions were FRAME_EXECUTING to FRAME_SUSPENDED, and FRAME_CLEARED to FRAME_CREATED when the memory was reallocated to a different frame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that the state must be FRAME_SUSPENDED
, but if were some other state were possible this code would still be correct. So I'm inclined to leave it as is.
static PyGetSetDef gen_getsetlist[] = { | ||
{"__name__", (getter)gen_get_name, (setter)gen_set_name, | ||
PyDoc_STR("name of the generator")}, | ||
{"__qualname__", (getter)gen_get_qualname, (setter)gen_set_qualname, | ||
PyDoc_STR("qualified name of the generator")}, | ||
{"gi_yieldfrom", (getter)gen_getyieldfrom, NULL, | ||
PyDoc_STR("object being iterated by yield from, or None")}, | ||
{"gi_running", (getter)gen_getrunning, NULL, NULL}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this reordering break any binary guarantees for any of the types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. This is opaque to C extensions.
Having f_stackdepth <= 0 ensures that invalid | ||
values are not visible to the cycle GC. | ||
We choose -1 rather than 0 to assist debugging. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GC doesn't seem to have been modified; does it really pay attention to stackdepth, as opposed to the actual pointer that was there previously?
Python/ceval.c
Outdated
if (tstate->c_tracefunc != NULL) { | ||
/* Temporarily set frame state to RAISED for benefit of frame.setlineno */ | ||
assert(f->f_state == FRAME_EXECUTING); | ||
f->f_state = FRAME_RAISED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I missed another backwards state transition. Maybe it would be good to explain the state machine where the states names are defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a hack, to keep frame.setlineno
happy.
I've added FRAME_UNWINDING
to mark that an exception is being unwound in the frame, but that the frame has not completed yet.
Include/cpython/frameobject.h
Outdated
@@ -4,6 +4,16 @@ | |||
# error "this header file must not be included directly" | |||
#endif | |||
|
|||
/* These values are chosen so that all tests involve comparing to zero. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"all" tests?
I think this means ">0 implies finished; <=0 implies not-yet-finished", but I'm inferring that from the enum values and not the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, making the specification explicit would be nice.
Include/cpython/frameobject.h
Outdated
PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */ | ||
PyObject *f_localsplus[1]; /* locals+stack, dynamically sized */ | ||
}; | ||
|
||
static inline int _PyFrameIsRunnable(struct _frame *f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyFrame_IsRunnable
?
Or maybe _PyFrame_IS_RUNNABLE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to _PyFrame_IsRunnable
. I think a similar format should be used with the other functions as well, that naming style fits better with existing members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyFrame_IsRunnable
seem more in keeping with the rest of the code base. So I've gone with that.
if (f && f->f_stacktop) { | ||
if (f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to replace the && f->f_stacktop
with && f->f_stackdepth >=0
or at least an assert(f->f_stackdepth >= 0)
on line 329?
Admittedly, I'm a bit out of my depth when it come to the low-level generator details, but at a glance it seems to me like we would not want to start the yield from if the generator's frame has a stack depth of -1 (or lower, somehow?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, the stack depth can never be negative. There's an assertion on line 344 that it is positive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw that assertion on line 344. If it's realistically not going to be negative, we probably only need the assertion prior to modifying the stack depth (particularly for decrements).
Objects/genobject.c
Outdated
/* Pop subiterator from stack */ | ||
ret = *(--gen->gi_frame->f_stacktop); | ||
gen->gi_frame->f_stackdepth--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have an assert(gen->gi_frame->f_stackdepth >= 0)
sanity check here after decrementing the stack depth? I don't know that it's necessary here, but since it gets removed with -O
anyways it won't incur a real cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: assert(gen->gi_frame->f_stackdepth > 0);
was added before decrement in a22412c.
return f->f_state == FRAME_EXECUTING; | ||
} | ||
|
||
static inline int _PyFrameHasCompleted(struct _frame *f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just something to be vary about here...there was recently https://bugs.python.org/issue39542#msg372962 where a static inline
function failed to inline on mac, albeit it was through a more complicated call chain.
Not sure if you have the capability to test this but it'd be nice if someone can benchmark this on mac since this is such a critical code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should trust the big three compilers to do their jobs without constantly verifying it.
Clang will inline this very simple function, probably even if we don't tell it to.
Include/cpython/frameobject.h
Outdated
@@ -4,6 +4,16 @@ | |||
# error "this header file must not be included directly" | |||
#endif | |||
|
|||
/* These values are chosen so that all tests involve comparing to zero. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, making the specification explicit would be nice.
Objects/genobject.c
Outdated
gen_getrunning(PyGenObject *gen, void *Py_UNUSED(ignored)) | ||
{ | ||
if (gen->gi_frame == NULL) { | ||
Py_INCREF(Py_False); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Py_RETURN_FALSE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Objects/genobject.c
Outdated
cr_getrunning(PyCoroObject *coro, void *Py_UNUSED(ignored)) | ||
{ | ||
if (coro->cr_frame == NULL) { | ||
Py_INCREF(Py_False); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Py_RETURN_FALSE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…to read and saves a word of memory.
593d935
to
38792f2
Compare
Thanks for the reviews. I think I've addressed all the comments. |
ret = _gen_throw((PyGenObject *)yf, close_on_genexit, | ||
typ, val, tb); | ||
tstate->frame = f; | ||
gen->gi_running = 0; | ||
gen->gi_frame->f_state = state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: for symmetry, I think the new modified line would be better before the tstate->frame = f;
line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…or objects into a single value. (pythonGH-20803) * Merge gen and frame state variables into one. * Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
…or objects into a single value. (pythonGH-20803) * Merge gen and frame state variables into one. * Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
…or objects into a single value. (pythonGH-20803) * Merge gen and frame state variables into one. * Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
…or objects into a single value. (pythonGH-20803) * Merge gen and frame state variables into one. * Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
Merges the explicit state in
PyFrameObject.f_excuting
andPyGenObject.gi_running
, and the implicit state inPyFrameObject.f_stacktop == NULL
andPyFrameObject.f_last == -1
into a single enum field inPyFrameObject
Since we no longer need to test
PyFrameObject.f_stacktop == NULL
, Thef_stacktop
pointer can be replaced with af_stackdepth
integer, which makes for simpler code when iterating over the stack and avoids the potential hazard of NULL pointers.There are three benefits to these changes:
return
or byraise
is available.https://bugs.python.org/issue40941