Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP bpo-44800: Rename _PyInterpreterFrame to _Py_framedata #27525

Closed

Conversation

ncoghlan
Copy link
Contributor

@ncoghlan ncoghlan commented Aug 1, 2021

Superseded by #31987


bpo-44590 significantly improved execution speed by delaying creation
of full Python objects for frames until they were needed for tracing
or other introspection purposes.

This follow-up commit doesn't include any further functional changes,
it just updates variable, attribute, and function names to make it
less ambiguous as to whether different sections of code are dealing
with frames as full Python objects or working directly with the new
lighter weight runtime frame data storage (C structs with no intrinsic
instance lifecycle management).

https://bugs.python.org/issue44800

bpo-44590 significantly improved execution speed by delaying creation
of full Python objects for frames until they were needed for tracing
or other introspection purposes.

This follow-up commit doesn't include any further functions changes,
it just changes variable, attribute, and function names to make it
less ambiguous as to whether different sections of code are dealing
with the classic introspection frames (full Python objects) or the
new lighter weight execution frames (C structs with no intrinsic
instance lifecycle management).
* executing ordinary functions), by a generator or coroutine object (for
* frames that are able to be suspended), or by their corresponding
* introspection frame (if an instrospection API has been invoked and the
* introspection frame created).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markshannon This new comment needs review to make sure I've accurately captured the way the new frame implementation works.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR seems to be 3 PRs in one:
1 Fixes to gdbinit
2 Changing the name of InterpreterFrame to _PyExecFrame
3 Adorning many fields with "x"

  1. Thanks for the fix. Would you put the fixes to gdbinit in its own PR, please.
    Do you have any ideas on how to add tests for this, to avoid anyone (probably me 🙂) breaking it again?

  2. Naming, the first hard problem in computer science! I think "Exec" doesn't really convey any more information than "Interpreter". Also, there should be probably be no "_Py" prefix, as that weakly implies that _PyExecFrame is a PyObject, which it is not.
    How about the old fashioned term "activation record"? The term "frame" is now so overloaded.

  3. I left the original names mostly unchanged for a reason. If the value of the attribute, e.g. f_lasti is unchanged, then it makes porting code easier if the name is unchanged. Which is why f_back becomes previous, because the value has changed. I'm sure there are some clarifications that could be made here, but could they go in a PR after we have decided on a name?

Finally, I'd avoid the term "introspection frame". From the point of view of Python code, the frame object is the frame, not just a view of it. Maybe phrase any comments in terms of the InterpreterFrame/_PyExecFrame/ActivationRecord being a storage optimization of the frame?

I believe that @pablogsal has been looking into writing up the way that the call stack now works, so he might have some thoughts on this.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@ncoghlan
Copy link
Contributor Author

The main naming alternative that occurred to me is "frame" and "frame data".

As far as porting goes, the extra layer of indirection is likely to be the biggest hassle, and a one time porting burden for code poking around in frame internals doesn't seem like a compelling reason to tolerate ongoing ambiguity across the code base as to which kind of structure is being used.

@ncoghlan
Copy link
Contributor Author

I posted the gist of the next iteration I'm going to try to bpo issue (including checking to see if the code still feels unambiguous if the variable names are changed while leaving the struct field names alone): https://bugs.python.org/issue44800#msg399381

For gdbinit, it's just an example file that has a few other problems with it at the moment, so I think it's OK to settle on a final field naming convention before fixing the affected pieces of it.

@markshannon
Copy link
Member

As far as porting goes, the extra layer of indirection is likely to be the biggest hassle, and a one time porting burden for code poking around in frame internals doesn't seem like a compelling reason to tolerate ongoing ambiguity across the code base as to which kind of structure is being used.

Where is the ambiguity? There are no field names shared between InterpreterFrame and PyFrameObject.

@ncoghlan
Copy link
Contributor Author

Where is the ambiguity? There are no field names shared between InterpreterFrame and PyFrameObject.

That is the ambiguity: situations where the only quick way to tell which kind of object you had was to have all the field names on each type memorised and infer the type from that, rather than from the types using different variable or field naming conventions. (Relying on the type declarations isn't good, as many of the affected functions are so long that most diffs in code reviews won't have the declaration readily accessible)

@ncoghlan ncoghlan changed the title bpo-44800: Clearly distinguish execution & introspection frames bpo-44800: Rename InterpreterFrame to _Py_framedata Aug 14, 2021
int stackdepth; /* Depth of value stack */
int nlocalsplus;
PyFrameState state; /* What state the frame is in */
PyObject *stack[1];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file name change obscures the changes to the field names here, where the following previously had the f_ prefix:

  • globals
  • builtins
  • locals
  • code
  • lasti
  • state

To reduce the diff size, while still ensuring that the field prefix was consistently different between PyFrameObject and _Py_framedata, I went with the option of standardising the data struct on "no field prefix" (it was previously split 50/50 between that and the f_ prefix), rather than introducing a new prefix the way I did in the original PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted this file rename from the PR - the internal header now remains pycore_frame.h, with both the full frame object and frame data struct field definitions in it.

@ncoghlan
Copy link
Contributor Author

@markshannon I have made the requested changes; please review again.

I've updated the PR title and description, since the proposal has changed based on your initial review comments.

I quite like the way the now proposed "frame" (PyFrameObject) vs "frame data" (_Py_framedata) naming convention looks in the code.

For the field prefix selection on the frame data struct, of the 12 fields, 6 didn't have a prefix at all, while 6 shared the f_ prefix with PyFrameObject. Rather than the "maximum churn" option of my original PR (which affected all 12 fields by introducing the xf_ prefix), the latest iteration instead standardises on "no field prefix" for the frame data structs.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

variable names ``frame`` and ``f``) and the new ``_Py_framedata`` internal
frame data storage (C structs with no intrinsic instance lifecycle management)
that is now used for code execution (function prefix ``_Py_framedata``, no
field prefix, typical variable name ``fdata``).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blurb details and date still need to be updated

@@ -157,7 +157,7 @@ number of the current instruction changes. Re-computing the current line for
every instruction is a little slow, though, so each time we compute the line
number we save the bytecode indices where it's valid:

*instr_lb <= frame->f_lasti < *instr_ub
*instr_lb <= frame->lasti < *instr_ub
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the What's New, this snippet still isn't right, but is also wrong on main

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Mar 12, 2022

It's not currently clear why building and testing is current working locally on Fedora 35, but freezing importlib is failing when the CI goes to check that the generated files are all up to date.

Debug vs non-debug build issue (debug builds are unhappy at the moment)

@ncoghlan
Copy link
Contributor Author

Partially tracked down the debug build failure - somehow PyImport_Import is being called without the Python thread state being set up correctly first while freezing modules, and there's a PyEval_GetGlobals() call in there which isn't being correctly checked for errors (which can happen when there's no thread state set). In a release build, that error was being silently ignored, in a debug build, an assertion caught it. I've added the missing error check so that now always fails (that change should really be pulled out into its own issue report and PR...)

Still not clear how any of the name changes could have affected when the frame state gets populated, though.

@ncoghlan
Copy link
Contributor Author

As far as I can tell, it's actually legitimate to have no active Python frame when calling PyImport_Import(), as the root_cframe struct doesn't automatically get a Python frame allocated (that doesn't happen until the Python eval loop starts running).

So while I'm still not clear on why this branch is hitting that issue when the main branch doesn't, explicitly ignoring the error seems like the right thing to do.

@ncoghlan
Copy link
Contributor Author

Closing this one, see #31987 for a scaled back version of the proposed change that leaves local variable and function parameter names alone.

if (current_frame == NULL) {
_Py_framedata *fdata = tstate->cframe->current_frame;
if (fdata == NULL) {
_PyErr_SetString(tstate, PyExc_SystemError, "cannot get globals; frame data does not exist");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this was the origin of the PyImport_ImportModule failure - in main, failing to get the locals sets an error, but failing to get the globals doesn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants