bpo-40255: Implement Immortal Instances - Optimization 3 #31490
_Py_SetImmortal(code->co_names);
_Py_SetImmortal(code->co_varnames);
_Py_SetImmortal(code->co_freevars);
_Py_SetImmortal(code->co_cellvars);
Make contents of these tuples immortal too.
Especially, co_consts is important, because LOAD_CONST will INCREF.
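To illustrate the point of this comment, here is a minimal Python sketch of which objects a code object's tuples reference, including objects nested inside tuple constants. The helper names `flatten` and `code_referenced_objects` and the `sample` function are hypothetical and exist only for this illustration; the sketch lists the objects and does not immortalize anything.

```python
def flatten(value):
    """Yield value and, if it is a tuple, everything nested inside it."""
    yield value
    if isinstance(value, tuple):
        for item in value:
            yield from flatten(item)


def code_referenced_objects(code):
    """Yield the objects held by the name and constant tuples of a code object."""
    fields = ("co_consts", "co_names", "co_varnames", "co_freevars", "co_cellvars")
    for field in fields:
        for item in getattr(code, field):
            yield from flatten(item)


def sample(arg):
    # "text" lands in co_consts, "len" in co_names, "arg" and "local" in co_varnames.
    local = "text"
    return len(local) + arg


found = list(code_referenced_objects(sample.__code__))
```

This sketch does not descend into nested code objects or frozenset constants that may also appear in `co_consts`; a complete walk would need to handle those as well.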
false_refcount = sys.getrefcount(False)
smallint_refcount = sys.getrefcount(100)

# Assert that all of these immortal instances have large ref counts
- # Assert that all of these immortal instances have large ref counts
+ # Assert that all of these immortal instances have large ref counts.
self.assertGreater(false_refcount, 1e8)
self.assertGreater(smallint_refcount, 1e8)

# Confirm that the refcount doesn't change even with a new ref to them
- # Confirm that the refcount doesn't change even with a new ref to them
+ # Confirm that the refcount doesn't change even with a new ref to them.
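The two checks in the test fragments above can be sketched as standalone helpers. This is a minimal sketch, not the PR's test: the helper names are hypothetical, the `10**8` threshold mirrors the `1e8` bound in the quoted assertions, and on an interpreter build without immortal instances both helpers simply return `False` for objects such as `False` or `100`.

```python
import sys

# Same bound as the 1e8 used in the quoted assertions.
IMMORTAL_THRESHOLD = 10**8


def looks_immortal(obj, threshold=IMMORTAL_THRESHOLD):
    """Return True if the reported refcount is above the threshold."""
    return sys.getrefcount(obj) > threshold


def refcount_is_stable(obj):
    """Return True if taking ten extra references leaves the refcount unchanged."""
    before = sys.getrefcount(obj)
    extra = [obj] * 10  # ten new references to the same object
    after = sys.getrefcount(obj)
    del extra
    return before == after
```

A freshly created `object()` is a useful negative control: its refcount is small and grows with every new reference, so both helpers return `False` for it.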
}

_Py_SetImmortal(obj);
/* Special case for PyCodeObjects since they don't have a tp_traverse */
- /* Special case for PyCodeObjects since they don't have a tp_traverse */
+ // Special case for PyCodeObjects since they don't have a tp_traverse.
@@ -1994,7 +1995,9 @@ _Py_NewReference(PyObject *op)
#ifdef Py_REF_DEBUG
_Py_RefTotal++;
#endif
Py_SET_REFCNT(op, 1);
/* Do not use Py_SET_REFCNT to skip the Immortal Instance check. This
* API guarantees that an instance will always be set to a refcnt of 1 */
- * API guarantees that an instance will always be set to a refcnt of 1 */
+ * API guarantees that an instance will always be set to a refcnt of 1. */
@@ -1829,6 +1829,10 @@ PyImport_ImportModuleLevelObject(PyObject *name, PyObject *globals,
if (mod == NULL) {
goto error;
}
// Immortalize top level modules
- // Immortalize top level modules
+ // Immortalize top level modules.
Immortalizing Modules After Import
This is an optimization on top of PR19474.
The improvement here relies on the assumption that a top-level module and its globals will (most likely) stay alive for the entire lifetime of the runtime. Thus, every time a top-level import happens, the module and its contents are now immortalized.
Benchmark Results
Overall: 1% slower compared to the master branch
pyperformance results
2to3: Mean +- std dev: [cpython_master] 432 ms +- 15 ms -> [immortal_instances_opt3] 447 ms +- 14 ms: 1.04x slower
chaos: Mean +- std dev: [cpython_master] 126 ms +- 4 ms -> [immortal_instances_opt3] 119 ms +- 4 ms: 1.06x faster
django_template: Mean +- std dev: [cpython_master] 62.2 ms +- 2.0 ms -> [immortal_instances_opt3] 63.3 ms +- 2.4 ms: 1.02x slower
float: Mean +- std dev: [cpython_master] 128 ms +- 4 ms -> [immortal_instances_opt3] 135 ms +- 5 ms: 1.05x slower
go: Mean +- std dev: [cpython_master] 244 ms +- 10 ms -> [immortal_instances_opt3] 227 ms +- 6 ms: 1.07x faster
hexiom: Mean +- std dev: [cpython_master] 11.5 ms +- 0.6 ms -> [immortal_instances_opt3] 11.3 ms +- 0.3 ms: 1.02x faster
html5lib: Mean +- std dev: [cpython_master] 97.9 ms +- 4.2 ms -> [immortal_instances_opt3] 103 ms +- 6 ms: 1.06x slower
json_dumps: Mean +- std dev: [cpython_master] 19.2 ms +- 0.7 ms -> [immortal_instances_opt3] 19.8 ms +- 0.5 ms: 1.03x slower
logging_format: Mean +- std dev: [cpython_master] 10.4 us +- 0.3 us -> [immortal_instances_opt3] 11.1 us +- 0.5 us: 1.06x slower
nqueens: Mean +- std dev: [cpython_master] 159 ms +- 5 ms -> [immortal_instances_opt3] 154 ms +- 3 ms: 1.04x faster
pathlib: Mean +- std dev: [cpython_master] 28.5 ms +- 0.7 ms -> [immortal_instances_opt3] 27.9 ms +- 0.9 ms: 1.02x faster
pickle: Mean +- std dev: [cpython_master] 16.0 us +- 0.5 us -> [immortal_instances_opt3] 15.6 us +- 0.5 us: 1.03x faster
pickle_dict: Mean +- std dev: [cpython_master] 37.3 us +- 0.6 us -> [immortal_instances_opt3] 36.2 us +- 0.8 us: 1.03x faster
pickle_pure_python: Mean +- std dev: [cpython_master] 572 us +- 14 us -> [immortal_instances_opt3] 581 us +- 15 us: 1.02x slower
pidigits: Mean +- std dev: [cpython_master] 284 ms +- 15 ms -> [immortal_instances_opt3] 276 ms +- 6 ms: 1.03x faster
pyflate: Mean +- std dev: [cpython_master] 770 ms +- 28 ms -> [immortal_instances_opt3] 760 ms +- 21 ms: 1.01x faster
python_startup: Mean +- std dev: [cpython_master] 12.6 ms +- 0.4 ms -> [immortal_instances_opt3] 12.1 ms +- 0.5 ms: 1.04x faster
python_startup_no_site: Mean +- std dev: [cpython_master] 8.89 ms +- 0.39 ms -> [immortal_instances_opt3] 8.55 ms +- 0.45 ms: 1.04x faster
raytrace: Mean +- std dev: [cpython_master] 529 ms +- 16 ms -> [immortal_instances_opt3] 540 ms +- 19 ms: 1.02x slower
regex_compile: Mean +- std dev: [cpython_master] 233 ms +- 6 ms -> [immortal_instances_opt3] 241 ms +- 5 ms: 1.03x slower
regex_dna: Mean +- std dev: [cpython_master] 239 ms +- 6 ms -> [immortal_instances_opt3] 252 ms +- 4 ms: 1.06x slower
regex_effbot: Mean +- std dev: [cpython_master] 4.53 ms +- 0.12 ms -> [immortal_instances_opt3] 4.74 ms +- 0.11 ms: 1.05x slower
regex_v8: Mean +- std dev: [cpython_master] 33.2 ms +- 0.8 ms -> [immortal_instances_opt3] 34.1 ms +- 1.0 ms: 1.03x slower
richards: Mean +- std dev: [cpython_master] 82.8 ms +- 3.7 ms -> [immortal_instances_opt3] 85.9 ms +- 3.8 ms: 1.04x slower
scimark_fft: Mean +- std dev: [cpython_master] 571 ms +- 12 ms -> [immortal_instances_opt3] 623 ms +- 23 ms: 1.09x slower
scimark_lu: Mean +- std dev: [cpython_master] 195 ms +- 6 ms -> [immortal_instances_opt3] 207 ms +- 7 ms: 1.06x slower
scimark_monte_carlo: Mean +- std dev: [cpython_master] 116 ms +- 5 ms -> [immortal_instances_opt3] 120 ms +- 3 ms: 1.03x slower
scimark_sor: Mean +- std dev: [cpython_master] 211 ms +- 6 ms -> [immortal_instances_opt3] 216 ms +- 7 ms: 1.02x slower
scimark_sparse_mat_mult: Mean +- std dev: [cpython_master] 8.28 ms +- 0.40 ms -> [immortal_instances_opt3] 8.56 ms +- 0.22 ms: 1.03x slower
sympy_expand: Mean +- std dev: [cpython_master] 878 ms +- 34 ms -> [immortal_instances_opt3] 831 ms +- 24 ms: 1.06x faster
sympy_integrate: Mean +- std dev: [cpython_master] 35.2 ms +- 1.0 ms -> [immortal_instances_opt3] 36.2 ms +- 1.9 ms: 1.03x slower
sympy_str: Mean +- std dev: [cpython_master] 514 ms +- 11 ms -> [immortal_instances_opt3] 521 ms +- 18 ms: 1.01x slower
unpickle: Mean +- std dev: [cpython_master] 21.5 us +- 0.7 us -> [immortal_instances_opt3] 21.8 us +- 0.5 us: 1.02x slower
unpickle_pure_python: Mean +- std dev: [cpython_master] 463 us +- 17 us -> [immortal_instances_opt3] 446 us +- 14 us: 1.04x faster
xml_etree_parse: Mean +- std dev: [cpython_master] 245 ms +- 6 ms -> [immortal_instances_opt3] 239 ms +- 8 ms: 1.02x faster
xml_etree_generate: Mean +- std dev: [cpython_master] 146 ms +- 4 ms -> [immortal_instances_opt3] 149 ms +- 4 ms: 1.02x slower

Benchmark hidden because not significant (15): deltablue, fannkuch, json_loads, logging_silent, logging_simple, meteor_contest, nbody, pickle_list, spectral_norm, sympy_sum, telco, unpack_sequence, unpickle_list, xml_etree_iterparse, xml_etree_process
Geometric mean: 1.01x slower
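For readers unfamiliar with the summary figure, a geometric mean of per-benchmark time ratios can be computed as in the sketch below. It uses only three of the means reported above (2to3, chaos, go), so its result will not match the overall figure, and it is not necessarily how pyperformance derives the number. The dictionary and variable names are illustrative.

```python
from statistics import geometric_mean

# (master mean, branch mean) in milliseconds, copied from three rows above.
timings_ms = {
    "2to3": (432.0, 447.0),
    "chaos": (126.0, 119.0),
    "go": (244.0, 227.0),
}

# A ratio above 1.0 means the branch is slower than master on that benchmark.
ratios = {name: new / old for name, (old, new) in timings_ms.items()}

# The geometric mean treats a 2x slowdown and a 2x speedup as cancelling out.
subset_geomean = geometric_mean(ratios.values())
```

The geometric mean is preferred over the arithmetic mean for ratios because it is symmetric between speedups and slowdowns.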
Implementation Details
Any time a new module import happens in PyImport_ImportModuleLevelObject, we now check the current depth of the Python stack. If we are at the top level, the module is immortalized along with all of its transitive dependencies. In addition, all containers found along the way are pushed into the permanent GC generation, since none of these objects should ever be collected again.
Module Finalization
This PR skips the correct finalization of the immortalized modules, since that is already handled in PR31489.
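The traversal described under Implementation Details can be approximated from Python with the `gc` module. This is a rough sketch under stated assumptions, not the PR's C implementation: `transitive_referents`, `tracked_containers`, and the `demo_module` object are hypothetical, the walk follows `gc.get_referents()` rather than calling `tp_traverse` directly, and nothing is actually immortalized.

```python
import gc
import types


def transitive_referents(root):
    """Return every object reachable from root via gc.get_referents()."""
    seen = {id(root): root}  # id -> object; also keeps the objects alive
    stack = [root]
    while stack:
        current = stack.pop()
        for child in gc.get_referents(current):
            if id(child) not in seen:
                seen[id(child)] = child
                stack.append(child)
    return list(seen.values())


def tracked_containers(objects):
    """Keep only the objects that the cyclic garbage collector tracks."""
    return [obj for obj in objects if gc.is_tracked(obj)]


# Hypothetical module holding plain data only, so the walk stays small.
demo = types.ModuleType("demo_module")
demo.settings = {"retries": [1, 2, 3], "hosts": ("a", "b")}

reachable = transitive_referents(demo)
containers = tracked_containers(reachable)
```

Running the same walk on a real imported module reaches far more objects, because functions reference their globals and builtins. The closest Python-level analog of the permanent generation is `gc.freeze()`, which moves all currently tracked objects there rather than a selected subset.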
https://bugs.python.org/issue40255