Data structures and code for exits from executors. #644
Comments
Uh, why?
Why not dynamically allocated based on the size of the trace? (Which we know before we create the executor object (in
I take it that the targets are basically moved from the
I'm missing something here. My understanding was that exits are (largely) reached by calls to
Because at least half will be
There will be 256 globally (per-process) as they are
Each
The
op(_COLD_EXIT, (--)) {
    TIER_TWO_ONLY
    _PyExitData *exit = current_executor->exits[oparg];
    Py_DECREF(current_executor);
    exit->hotness++;
    UNLIKELY_EXIT_IF(exit->hotness < 0);
    PyCodeObject *code = _PyFrame_GetCode(frame);
    _Py_CODEUNIT *target = _PyCode_CODE(code) + exit->target;
    _PyOptimizerObject *opt = interp->optimizer;
    _PyExecutorObject *executor = NULL;
    int optimized = opt->optimize(opt, code, target, &executor, (int)(stack_pointer - _PyFrame_Stackbase(frame)));
    ERROR_IF(optimized < 0, error);
    if (optimized) {
        exit->executor = executor;
        Py_INCREF(executor);
        GOTO_TIER_TWO();
    }
    else {
        exit->hotness = -10000; /* Choose a better number */
    }
}
And the code for
    _PyExitData *exit = current_executor->exits[index];
    Py_INCREF(exit->executor);
    Py_DECREF(current_executor);
    current_executor = exit->executor;
    GOTO_TIER_TWO();
Oh, I see what I didn't get. The (dynamically allocated) array of exit structs is part of the executor, and each possibly-deoptimizing uop in the
However, the
When the hotness counter warms up, eventually the
I'm not sure I follow the reasoning about how many cold exit executors we need (initially 1/3rd of the uops are
PS 1: I take it that the
PS 2: Maybe the field referencing an executor in the exit struct should be called
No one need own the executor, we have reference counting. Each
1/2 is an approximation.
Yes.
Makes sense (as does the rest). The destructor for executors needs to know how many exit structs an executor has, and decref the executors in those exits. The static ones can be immortal.
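The destructor behaviour described in this comment can be sketched with a toy reference count. This is only an illustration under stated assumptions: the `executor` and `exit_data` types, the `immortal` flag, the `freed_count` counter and all helper names are hypothetical stand-ins, not CPython's object model or its `Py_INCREF`/`Py_DECREF` machinery.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct executor executor;

typedef struct exit_data {
    struct executor *executor;  /* strong reference to the exit's target executor */
} exit_data;

struct executor {
    uint32_t refcount;
    int immortal;         /* set for statically allocated cold-exit executors */
    uint32_t exit_count;  /* the destructor needs this to walk exits[] */
    exit_data exits[];    /* one exit struct per possible side exit */
};

/* A statically allocated, immortal cold-exit executor (hypothetical). */
static executor cold_exit_executor = { .immortal = 1 };

static int freed_count = 0;  /* instrumentation for this example only */

static void executor_incref(executor *ex)
{
    if (!ex->immortal) {
        ex->refcount++;
    }
}

static void executor_decref(executor *ex);

/* Destructor: release the executors referenced by every exit, then free. */
static void executor_dealloc(executor *ex)
{
    for (uint32_t i = 0; i < ex->exit_count; i++) {
        executor_decref(ex->exits[i].executor);
    }
    free(ex);
    freed_count++;
}

static void executor_decref(executor *ex)
{
    if (ex->immortal) {
        return;  /* static executors are never freed */
    }
    assert(ex->refcount > 0);
    if (--ex->refcount == 0) {
        executor_dealloc(ex);
    }
}

/* New executor whose exits all start out pointing at the cold executor. */
static executor *executor_new(uint32_t exit_count, executor *cold)
{
    executor *ex = calloc(1, sizeof(executor) + exit_count * sizeof(exit_data));
    if (ex == NULL) {
        return NULL;
    }
    ex->refcount = 1;
    ex->exit_count = exit_count;
    for (uint32_t i = 0; i < exit_count; i++) {
        executor_incref(cold);
        ex->exits[i].executor = cold;
    }
    return ex;
}

/* Re-point an exit at a newly created executor, fixing up both refcounts. */
static void exit_set_executor(exit_data *e, executor *target)
{
    executor *old = e->executor;
    executor_incref(target);
    e->executor = target;
    executor_decref(old);
}
```

In this sketch, dropping the last reference to an executor also drops the references held by its exits, so a chain of stitched executors is released together, while the immortal cold executor is left untouched.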
I would still plead for renaming the
Already done 🙂
@markshannon Presumably before we can work on this, we need to implement having separate structs for uops during the trace collection/optimization chain and uops in the executor, right? Because in the executor the 'target' field lives in the exit blocks, but during optimization, 'target' is part of the instruction struct.
Done 🎉
To support trace stitching we need data for exits embedded into trace objects.
First of all we need an exit struct:
https://github.com/faster-cpython/ideas/blob/main/3.13/engine.md#making-cold-exits-small lists the fields we need.
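A minimal sketch of what such an exit struct might look like, using the three fields that appear elsewhere in this thread (`hotness`, `target`, `executor`). The type name `exit_data`, the field widths and the helper `exit_init` are assumptions made for illustration; the linked document remains the reference for the actual field list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct executor;  /* opaque here; stands in for the real executor object */

/* Hypothetical exit struct: field names follow the thread, types are assumed. */
typedef struct exit_data {
    int32_t hotness;            /* negative while the exit is still cold */
    uint32_t target;            /* offset of the tier 1 instruction to resume at */
    struct executor *executor;  /* executor to jump to when this exit is taken */
} exit_data;

/* Point a fresh exit at a cold-exit executor and give it a starting counter. */
static void exit_init(exit_data *e, uint32_t target,
                      struct executor *cold, int32_t initial_hotness)
{
    assert(e != NULL);
    e->hotness = initial_hotness;
    e->target = target;
    e->executor = cold;
}
```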
Since all executors are now micro-ops based, we can get rid of the distinction between `_PyExecutorObject` and `_PyUOpExecutorObject`.
The new combined executor struct would look something like:
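The struct from the original post did not survive on this page. As a stand-in, here is one possible shape for a combined executor that carries both its uops and a dynamically sized array of exits in a single allocation. Every name and field here (`uop`, `exit_data`, `executor`, `executor_new`) is hypothetical, and the layout is an assumption rather than the struct proposed in the issue.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct uop {
    uint16_t opcode;
    uint16_t oparg;
    uint64_t operand;
} uop;

struct executor;

typedef struct exit_data {
    int32_t hotness;
    uint32_t target;
    struct executor *executor;
} exit_data;

typedef struct executor {
    uint32_t code_size;   /* number of uops in trace[] */
    uint32_t exit_count;  /* number of entries in exits[] */
    uop *trace;           /* points just past exits[] in the same allocation */
    exit_data exits[];    /* flexible array member, sized per trace */
} executor;

/* Allocate an executor sized for a trace whose length and exit count are
 * already known, as they are once trace collection has finished. */
static executor *executor_new(uint32_t code_size, uint32_t exit_count)
{
    /* There cannot be more exits than uops. */
    assert(exit_count <= code_size);
    size_t size = sizeof(executor)
                + (size_t)exit_count * sizeof(exit_data)
                + (size_t)code_size * sizeof(uop);
    executor *ex = calloc(1, size);
    if (ex == NULL) {
        return NULL;
    }
    ex->code_size = code_size;
    ex->exit_count = exit_count;
    ex->trace = (uop *)(ex->exits + exit_count);
    /* The uop array shares the allocation, so it must be suitably aligned. */
    assert((uintptr_t)ex->trace % _Alignof(uop) == 0);
    return ex;
}
```

Placing both arrays in one block keeps an executor to a single allocation and a single `free`, at the cost of computing the trace pointer by hand.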
We will pre-allocate an array of executors for cold exits, one for each possible index into the `exits` array. Since there cannot be more exits than the maximum number of uops, the size of the array will be `_Py_UOP_MAX_TRACE_LENGTH`, currently 512.
Cold exits will be implemented as a single micro-op `_COLD_EXIT` with `oparg` set to the index in the array.
`_COLD_EXIT` needs to do the following:
- Find the exit: `exit = current_executor->exits[oparg]`
- Increment `exit->hotness`
- If `exit->hotness == 0` then trigger generation of a new executor to replace `exit->executor`.
- Otherwise, continue in tier 1: `next_instr = _PyCode_CODE(code) + exit->target; ENTER_TIER1();`
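The steps listed for `_COLD_EXIT` can be modelled outside the interpreter as a small state machine. The sketch below is a simplified stand-in, not the interpreter code: `cold_exit`, `next_step`, `optimize_fn`, the two stub optimizers and the `COOL_DOWN` value (borrowed from the `-10000` placeholder in the comments) are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct executor { int id; } executor;

typedef struct exit_data {
    int32_t hotness;            /* negative while cold */
    uint32_t target;            /* tier 1 instruction offset */
    struct executor *executor;  /* executor attached to this exit */
} exit_data;

typedef enum { GO_TIER_ONE, GO_TIER_TWO, GO_ERROR } next_step;

/* Returns > 0 and sets *out on success, 0 if no trace was produced,
 * < 0 on error (hypothetical optimizer interface). */
typedef int (*optimize_fn)(uint32_t target, executor **out);

#define COOL_DOWN (-10000)  /* placeholder; the thread asks for a better number */

static next_step cold_exit(exit_data *exit, optimize_fn optimize,
                           uint32_t *next_instr)
{
    assert(exit != NULL && optimize != NULL && next_instr != NULL);
    exit->hotness++;
    if (exit->hotness >= 0) {
        /* The exit has warmed up: try to build an executor for its target. */
        executor *fresh = NULL;
        int optimized = optimize(exit->target, &fresh);
        if (optimized < 0) {
            return GO_ERROR;
        }
        if (optimized) {
            exit->executor = fresh;  /* replaces the cold-exit executor */
            return GO_TIER_TWO;
        }
        exit->hotness = COOL_DOWN;   /* back off before trying again */
    }
    /* Still cold, or optimization declined: resume in tier 1 at the target. */
    *next_instr = exit->target;
    return GO_TIER_ONE;
}

/* Stub optimizers used to exercise both outcomes. */
static executor stub_executor = { 1 };

static int optimize_ok(uint32_t target, executor **out)
{
    (void)target;
    *out = &stub_executor;
    return 1;
}

static int optimize_fail(uint32_t target, executor **out)
{
    (void)target;
    (void)out;
    return 0;
}
```

Starting the counter at a negative value means the common path is one increment and one comparison; the optimizer is only consulted when the counter reaches zero.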
The JIT will need to compile the code for these cold exits on start up, but we can statically allocate the data structures.
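A rough sketch of the statically allocated table of cold-exit executors, one per possible exit index, each holding a single `_COLD_EXIT` uop whose `oparg` is its own index. The names `MAX_TRACE_LENGTH`, `COLD_EXIT_OPCODE`, `cold_executor`, `init_cold_exits` and `cold_exit_for` are hypothetical; the size 512 mirrors the value quoted for `_Py_UOP_MAX_TRACE_LENGTH` in this issue, and the opcode number is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TRACE_LENGTH 512  /* value quoted in the issue text */
#define COLD_EXIT_OPCODE 1    /* arbitrary stand-in for the real opcode */

typedef struct uop {
    uint16_t opcode;
    uint16_t oparg;
} uop;

/* A cold-exit executor is a one-instruction trace. */
typedef struct cold_executor {
    uop trace[1];
} cold_executor;

/* Static storage: no allocation at run time, one entry per exit index. */
static cold_executor cold_exits[MAX_TRACE_LENGTH];

/* Fill in the table once at start-up. */
static void init_cold_exits(void)
{
    for (int i = 0; i < MAX_TRACE_LENGTH; i++) {
        cold_exits[i].trace[0].opcode = COLD_EXIT_OPCODE;
        cold_exits[i].trace[0].oparg = (uint16_t)i;
    }
}

/* The cold executor that exit number `index` of any executor starts with. */
static const cold_executor *cold_exit_for(int index)
{
    assert(0 <= index && index < MAX_TRACE_LENGTH);
    return &cold_exits[index];
}
```

Because the table is indexed by exit number rather than by executor, every executor can share it: exit `i` of any trace initially points at `cold_exits[i]`.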