gh-112287: Speed up Tier 2 (uop) interpreter a little #112286
Merged
The code generator now generates code to access 'operand' as needed. (Alas, this needs to be changed for the JIT.) This speeds the spectral_norm benchmark up by 3-4% on my Intel Mac (but not using --enable-optimizations (PGO/LTO)).
gvanrossum changed the title from "SPeed up Tier 2 (uop) interpreter abut 3%" to "Speed up Tier 2 (uop) interpreter about 3%" on Nov 20, 2023
This prepares us for the JIT template, hopefully.
gvanrossum changed the title from "Speed up Tier 2 (uop) interpreter about 3%" to "gh-112287: Speed up Tier 2 (uop) interpreter about 3%" on Nov 20, 2023
gvanrossum changed the title from "gh-112287: Speed up Tier 2 (uop) interpreter about 3%" to "gh-112287: Speed up Tier 2 (uop) interpreter a little" on Nov 20, 2023
Based on offline discussion I assume I can just merge this once the tests pass.
aisk pushed a commit to aisk/cpython that referenced this pull request on Feb 11, 2024
…12286)

This makes the Tier 2 interpreter a little faster. By my calculation the gain is about 3%, though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512), making it more likely that loops fit in a trace.

The rest of the approach is to only load `oparg` and `operand` in cases that use them. The code generator knows when these are used.

For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block. (The `oparg` variable may be referenced multiple times by the instruction's code block, so it must be in a variable.)

For `operand`, it will use `CURRENT_OPERAND()` directly instead of referencing the `operand` variable, which no longer exists. (There is only one place where this will be used.)
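To make the idea in the commit message concrete, here is a minimal sketch of a uop-style dispatch loop in which `oparg` is loaded only by the cases that use it, and the wide operand is read through a macro at its single point of use. This is not CPython's generated code: the uop names (`UOP_PUSH_SMALL`, `UOP_PUSH_BIG`, `UOP_ADD`, `UOP_EXIT`), the `uop_instruction` layout, the `run_trace` and `demo` helpers, and the macro definitions are all assumptions made for illustration. Only the macro names `CURRENT_OPARG()` and `CURRENT_OPERAND()` come from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical uop set for illustration; these are not CPython's real uops. */
enum { UOP_PUSH_SMALL, UOP_PUSH_BIG, UOP_ADD, UOP_EXIT };

/* Assumed instruction layout: a small oparg plus a wide operand. */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint64_t operand;
} uop_instruction;

/* The macro names follow the commit message; these definitions are an
 * assumption. They read from the instruction that was just fetched. */
#define CURRENT_OPARG()   (next_uop[-1].oparg)
#define CURRENT_OPERAND() (next_uop[-1].operand)

#define STACK_SIZE 16

static int64_t run_trace(const uop_instruction *trace)
{
    int64_t stack[STACK_SIZE];
    int sp = 0;
    const uop_instruction *next_uop = trace;
    int oparg;  /* assigned only inside cases that actually use it */

    for (;;) {
        uint16_t opcode = next_uop->opcode;
        next_uop++;
        switch (opcode) {
        case UOP_PUSH_SMALL:
            /* This case uses oparg, so the load is emitted at the top. */
            oparg = CURRENT_OPARG();
            assert(sp < STACK_SIZE);
            stack[sp++] = oparg;
            break;
        case UOP_PUSH_BIG:
            /* The operand is read directly; there is no `operand` variable. */
            assert(sp < STACK_SIZE);
            stack[sp++] = (int64_t)CURRENT_OPERAND();
            break;
        case UOP_ADD:
            /* Uses neither oparg nor operand, so nothing is loaded. */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case UOP_EXIT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;  /* unknown uop in this toy interpreter */
        }
    }
}

/* Small usage example: push 7 (via oparg), push 1000000 (via operand), add. */
static int64_t demo(void)
{
    static const uop_instruction trace[] = {
        {UOP_PUSH_SMALL, 7, 0},
        {UOP_PUSH_BIG, 0, 1000000},
        {UOP_ADD, 0, 0},
        {UOP_EXIT, 0, 0},
    };
    return run_trace(trace);
}
```

In this sketch `UOP_ADD` and `UOP_EXIT` never touch the instruction's argument fields, which is the saving the commit message describes: a case pays for a load only when its body needs the value.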
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request on Sep 2, 2024