bpo-33234 Improve list() pre-sizing for inputs with known lengths #6493
Lib/test/test_list.py
Outdated
@@ -157,5 +157,12 @@ class L(list): pass
        with self.assertRaises(TypeError):
            (3,) + L([1,2])

    def test_preallocation(self):
This should be marked as a CPython only test.
@rhettinger Amended in 7602879
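One way to restrict a test to CPython, sketched with the standard library only. CPython's own test suite provides the `test.support.cpython_only` decorator for this purpose; the `skipUnless` stand-in, the class name, the test body, and `run_suite()` below are illustrative assumptions, not the code from this PR.

```python
import platform
import sys
import unittest

# Stand-in for test.support.cpython_only, built from stdlib pieces only.
cpython_only = unittest.skipUnless(
    platform.python_implementation() == "CPython",
    "allocation sizes are a CPython implementation detail",
)


class PreallocationSketch(unittest.TestCase):  # hypothetical test class
    @cpython_only
    def test_list_storage_is_measurable(self):
        # sys.getsizeof() includes the list's currently allocated item slots,
        # so a ten-element list must report more memory than an empty one.
        self.assertGreater(sys.getsizeof(list(range(10))), sys.getsizeof([]))


def run_suite():
    """Run the sketch and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PreallocationSketch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

On implementations other than CPython the test is reported as skipped rather than failed, which is the point of the marker.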
The list() constructor isn't taking full advantage of known input lengths or length hints. This commit makes the constructor pre-size and not over-allocate when the input size is known or can be reasonably estimated. A new test is added to test_list.py and a test needed to be modified in test_descr.py as it assumed that there is only one call to __length_hint__() in the list constructor.
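The dependence on the number of `__length_hint__()` calls can be observed from Python with a small counting helper. This is a rough sketch: `CountingIterable` is a hypothetical class, not the one used in test_descr.py, and the exact call count depends on the interpreter version, so none is asserted here.

```python
class CountingIterable:
    """Iterable with a length hint but no __len__ (hypothetical helper)."""

    def __init__(self, n):
        self.n = n
        self.hint_calls = 0

    def __iter__(self):
        return iter(range(self.n))

    def __length_hint__(self):
        # Record every time the list constructor asks for an estimate.
        self.hint_calls += 1
        return self.n


source = CountingIterable(5)
result = list(source)
print(result, "hint calls:", source.hint_calls)
```

A test that hard-codes one expected call would need adjusting if the constructor consults the hint a second time before delegating to the extend logic.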
@@ -2649,6 +2675,13 @@ list___init___impl(PyListObject *self, PyObject *iterable)
        (void)_list_clear(self);
    }
    if (iterable != NULL) {
        Py_ssize_t iter_len = PyObject_LengthHint(iterable, 0);
list_extend() uses PyObject_LengthHint(iterable, 8): we may also use 8 here.
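The effect of the second argument can be seen from Python through `operator.length_hint()`, the Python-level counterpart of `PyObject_LengthHint()`. A minimal sketch; the `Hinted` class and `no_hint()` generator are hypothetical helpers introduced only for illustration.

```python
import operator


class Hinted:
    """Hypothetical iterator that can estimate its remaining length."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

    def __length_hint__(self):
        return self.n


def no_hint():
    # Generators define neither __len__ nor __length_hint__.
    yield from (1, 2, 3)


sized = operator.length_hint([1, 2, 3], 8)          # real length is used
hinted = operator.length_hint(Hinted(5), 8)         # __length_hint__ is used
fallback = operator.length_hint(no_hint(), 8)       # default is returned
fallback_zero = operator.length_hint(no_hint(), 0)  # default is returned
```

The default only matters for inputs that offer no size information at all, which is the case the reviewer's suggestion of 8 versus 0 is about.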
I have collected more metrics. These are the L1 and L2 cache misses and related information:
THIS PR
MASTER
As you can see, this shows a great improvement in cache efficiency and branch predictions. As there is some concern regarding what happens with the overhead of calling
And these are the results:
As you can see, the only case when this patch is slow is for very small iterables without
CC: @rhettinger
I'm not sure about this change because of the possible slowdown for short datasets. Raymond and I proposed adding a slot for the length hint. I would suggest working in two steps:
Agreed. Will open a new PR with the version that only uses length if it is available.
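The narrower approach, using an exact length only when the input actually has one, might look like this when expressed in Python. It is a sketch of the idea only: `known_length()` is a hypothetical helper, and the real check would live in C inside the list constructor.

```python
def known_length(iterable):
    """Return len(iterable) if its type defines __len__, else None."""
    # Look on the type, as special methods are resolved there.
    if hasattr(type(iterable), "__len__"):
        return len(iterable)
    return None


from_list = known_length([1, 2, 3])          # sized container
from_range = known_length(range(4))          # sized, lazily evaluated
from_iterator = known_length(iter([1, 2]))   # iterators have no __len__
from_generator = known_length(x for x in ()) # generators have no __len__
```

Inputs without `__len__` skip the pre-sizing step entirely, so short iterators and generators avoid the extra hint lookup that raised the slowdown concern.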
The list() constructor isn't taking full advantage of known input
lengths or length hints. This commit makes the constructor
pre-size and not over-allocate when the input size is known or
can be reasonably estimated.
A new test is added to test_list.py and a test needed to be modified in
test_descr.py as it assumed that there is only one call to
__length_hint__() in the list constructor.

https://bugs.python.org/issue33234