gh-99761: add invalid_index macro #99762

Closed
wants to merge 52 commits into from

eendebakpt
Contributor

@eendebakpt eendebakpt commented Nov 24, 2022

In listobject.c there is an optimization that checks whether an index is valid (i.e. 0 <= index < N) using a single comparison. The cast-to-unsigned optimization is used in current main in 3 locations: tupleobject.c, _collectionsmodule.c and bytesobject.c.

It is a micro-optimization, but the generated code differs from that of the plain i < 0 || i >= Py_SIZE(a). This PR tries to:

i) apply the optimization to more locations
ii) make the code more consistent

Replacing the index checks with a single function, _Py_is_valid_index, that includes the optimization makes the code consistent and gives every index check the optimized form.
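For readers unfamiliar with the trick, here is a minimal sketch of the two forms of the check. It is not the code from this PR: the function names are hypothetical, and ptrdiff_t stands in for Py_ssize_t so the snippet compiles without Python.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain form: two comparisons. */
static inline bool
valid_index_plain(ptrdiff_t i, ptrdiff_t limit)
{
    return i >= 0 && i < limit;
}

/* Cast form: a negative i converts to a very large size_t value,
   so one unsigned comparison rejects both i < 0 and i >= limit.
   This only holds when limit itself is non-negative. */
static inline bool
valid_index_cast(ptrdiff_t i, ptrdiff_t limit)
{
    assert(limit >= 0);
    return (size_t)i < (size_t)limit;
}
```

Both functions should return the same result for any index as long as limit is non-negative, which is always the case for an object size.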

Notes:

DEOPT_IF(((size_t)signed_magnitude) > 1, BINARY_SUBSCR);

With the _Py_is_valid_index method it looks like

DEOPT_IF(!_Py_is_valid_index(signed_magnitude, 2), BINARY_SUBSCR);

which is in my opinion not very readable. For these cases a separate PR was created: #100064
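As an illustration of why the two DEOPT_IF spellings above test the same condition, the sketch below compares the cast form with the plain range check for a limit of 2. The helper names are hypothetical, and ptrdiff_t is assumed as a stand-in for Py_ssize_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors ((size_t)signed_magnitude) > 1 from the explicit form. */
static inline bool
out_of_range_cast(ptrdiff_t signed_magnitude)
{
    return (size_t)signed_magnitude > 1;
}

/* Mirrors the negated valid-index check with a limit of 2. */
static inline bool
out_of_range_plain(ptrdiff_t signed_magnitude)
{
    return signed_magnitude < 0 || signed_magnitude >= 2;
}

/* Checks that both forms agree on a small range around zero. */
static inline void
check_forms_agree(void)
{
    for (ptrdiff_t m = -3; m <= 3; m++) {
        assert(out_of_range_cast(m) == out_of_range_plain(m));
    }
}
```

Only the values 0 and 1 pass either test; every negative value and every value of 2 or more is rejected.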

@sweeneyde
Member

I think this is generally a good idea, but I'm not sure this should go in the public API. If it does, its name should probably begin with 'Py...'. See https://devguide.python.org/developer-workflow/c-api/ and https://peps.python.org/pep-0007/#naming-conventions .

cc @vstinner

@eendebakpt
Contributor Author

I think this is generally a good idea, but I'm not sure this should go in the public API. If it does, its name should probably begin with 'Py...'. See https://devguide.python.org/developer-workflow/c-api/ and https://peps.python.org/pep-0007/#naming-conventions .

cc @vstinner

Agreed. What would be a good location for the private version? One of the Include/internal/pycore_xxxx files?

@rhettinger
Contributor

  • Keep the code as valid_index rather than invalid_index so that we don't end up with double negatives.

  • Please leave the collections.deque code alone. The author (me) aspires to keep this code not tightly coupled to the rest of the C API. The code is currently clean and doesn't need to be changed.

  • Are you certain that inline functions always get inlined? We know that macros do always get inlined. The valid_index macro currently gets used in performance critical code.

  • As Dennis says, this should not be part of the public API. It is a performance trick that happens to currently be useful. As compilers get better, this would be done automatically and we would prefer the simple inline code rather than the extra layer of abstraction.

@eendebakpt
Contributor Author

  • Keep the code as valid_index rather than invalid_index so that we don't end up with double negatives.

Everywhere in the code the usage is !valid_index, which is why I preferred invalid_index. I renamed to valid_index again.

  • Please leave the collections.deque code alone. The author (me) aspires to keep this code not tightly coupled to the rest of the C API. The code is currently clean and doesn't need to be changed.

Done.

  • Are you certain that inline functions always get inlined? We know that macros do always get inlined. The valid_index macro currently gets used in performance critical code.

With inline functions I think we can never be certain the code gets inlined for all compilers and settings. I changed the static inline to a macro.

  • As Dennis says, this should not be part of the public API. It is a performance trick that happens to currently be useful. As compilers get better, this would be done automatically and we would prefer the simple inline code rather than the extra layer of abstraction.

valid_index is now private (part of Include/internal/pycore_abstract.h). If there is a more suitable location, we can move it.

@rhettinger

@eendebakpt eendebakpt marked this pull request as ready for review November 26, 2022 22:12
@vstinner
Member

You should use a static inline function rather than a macro; see:

@rhettinger:

Are you certain that inline functions always get inlined? We know that macros do always get inlined. The valid_index macro currently gets used in performance critical code.

See PEP 670, which answers that question and was approved by the Steering Council. A static inline function must be used.
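One of the macro pitfalls that PEP 670 describes is duplicated side effects. The sketch below illustrates it with a hypothetical macro spelling of the plain check, in which the index argument appears twice, next to a static inline function that evaluates its argument once. All names are made up for this example, and ptrdiff_t is assumed as a stand-in for Py_ssize_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical macro form of the plain check: `i` is expanded twice. */
#define VALID_INDEX_PLAIN_MACRO(i, limit) ((i) >= 0 && (i) < (limit))

/* Function form: arguments are evaluated exactly once and type-checked. */
static inline bool
valid_index_func(ptrdiff_t i, ptrdiff_t limit)
{
    assert(limit >= 0);
    return (size_t)i < (size_t)limit;
}

/* An index source with a visible side effect: it counts its calls. */
static int next_calls = 0;

static inline ptrdiff_t
next_index(void)
{
    next_calls++;
    return 1;
}

/* Returns how many times next_index() ran when passed to the macro. */
static inline int
calls_via_macro(void)
{
    next_calls = 0;
    bool ok = VALID_INDEX_PLAIN_MACRO(next_index(), 3);
    (void)ok;
    return next_calls;
}

/* Returns how many times next_index() ran when passed to the function. */
static inline int
calls_via_function(void)
{
    next_calls = 0;
    bool ok = valid_index_func(next_index(), 3);
    (void)ok;
    return next_calls;
}
```

With these definitions, the macro path calls next_index() twice while the function path calls it once. A macro written with the single cast comparison would not duplicate its argument, so this sketch shows the general hazard rather than a bug in the macro proposed here.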

Member

@vstinner vstinner left a comment

I'm not very excited by a function that is only used for a micro-optimization; I'm fine with the (i < 0 || i >= Py_SIZE(self)) test. But I'm not against this change either.

@eendebakpt
Contributor Author

The overall change LGTM, but please address my two remaining change requests.

@vstinner Could you have a look at the PR again?

@arhadthedev arhadthedev added the interpreter-core (Objects, Python, Grammar, and Parser dirs), extension-modules (C modules in the Modules dir), and skip news labels Apr 5, 2023
@eendebakpt eendebakpt requested a review from arhadthedev April 28, 2023 19:07
Member

@arhadthedev arhadthedev left a comment

All earlier reviews seem to be addressed.

@serhiy-storchaka
Member

This idea was already discussed and rejected. See #72583.

Labels
awaiting core review, extension-modules (C modules in the Modules dir), interpreter-core (Objects, Python, Grammar, and Parser dirs), skip news