gh-112287: Speed up Tier 2 (uop) interpreter a little #112286

gvanrossum · 2023-11-20T16:23:35Z

This makes the Tier 2 interpreter a little faster.
I measured a speedup of about 3%,
though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.

The rest of the approach is to only load
oparg and operand in cases that use them.
The code generator knows when these are used.

For oparg, it will conditionally emit

oparg = CURRENT_OPARG();

at the top of the case block.
(The oparg variable may be referenced multiple times
by the instruction's code block, so it must be in a variable.)
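
To illustrate the idea, here is a minimal toy sketch, not the generated code from this PR. The `ToyUop` struct, the `TOY_*` opcodes, `toy_run_oparg()`, and the definition given for `CURRENT_OPARG()` are all assumptions made for the example; only the macro name and the "load at the top of the case block" pattern come from the description above. Cases that use oparg start with the load, and cases that don't use it skip the load entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified uop layout (the real CPython struct differs). */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint64_t operand;
} ToyUop;

/* Hypothetical opcodes, for illustration only. */
enum { TOY_PUSH_SMALL, TOY_ADD, TOY_SCALE, TOY_EXIT };

/* Assumed definition: read oparg from the uop that was just fetched. */
#define CURRENT_OPARG() (next_uop[-1].oparg)

/* Runs a toy trace; returns the top of the stack at TOY_EXIT, or -1. */
long toy_run_oparg(const ToyUop *trace)
{
    long stack[64];
    size_t sp = 0;
    const ToyUop *next_uop = trace;
    int oparg;  /* only assigned in cases that actually use it */

    for (;;) {
        uint16_t opcode = next_uop->opcode;
        next_uop++;
        switch (opcode) {
        case TOY_PUSH_SMALL:
            oparg = CURRENT_OPARG();  /* emitted: this case uses oparg */
            assert(sp < 64);
            stack[sp++] = oparg;
            break;
        case TOY_ADD:
            /* no oparg load emitted: this case never looks at it */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case TOY_SCALE:
            oparg = CURRENT_OPARG();  /* referenced twice below, so it is
                                         kept in a local variable */
            assert(sp >= 1);
            stack[sp - 1] = stack[sp - 1] * oparg + oparg;
            break;
        case TOY_EXIT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;  /* unknown uop */
        }
    }
}
```

Running the trace PUSH_SMALL 3, PUSH_SMALL 4, ADD, SCALE 2, EXIT through this sketch leaves (3 + 4) * 2 + 2 on the stack. In this toy version, TOY_ADD and TOY_EXIT never touch the oparg field.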

For operand, it will use CURRENT_OPERAND() directly
instead of referencing the operand variable,
which no longer exists.
(There is only one place where this will be used.)
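
A similar toy sketch for operand, again under stated assumptions: `ToyUop`, the `TOY_*` opcodes, `toy_run_operand()`, and the macro body shown for `CURRENT_OPERAND()` are hypothetical and only meant to show the shape of the change. There is no `operand` local at all; the one case that needs the value reads it through the macro at the point of use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified uop layout (the real CPython struct differs). */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint64_t operand;
} ToyUop;

/* Hypothetical opcodes, for illustration only. */
enum { TOY_LOAD_CONST64, TOY_INCREMENT, TOY_EXIT };

/* Assumed definition: read operand from the uop that was just fetched. */
#define CURRENT_OPERAND() (next_uop[-1].operand)

/* Runs a toy trace; returns the accumulator at TOY_EXIT. */
uint64_t toy_run_operand(const ToyUop *trace)
{
    assert(trace != NULL);
    uint64_t acc = 0;
    const ToyUop *next_uop = trace;

    for (;;) {
        uint16_t opcode = next_uop->opcode;
        next_uop++;
        switch (opcode) {
        case TOY_LOAD_CONST64:
            /* Single use, so the macro is expanded in place
               instead of going through an 'operand' variable. */
            acc = CURRENT_OPERAND();
            break;
        case TOY_INCREMENT:
            /* does not use operand, so nothing is loaded */
            acc += 1;
            break;
        case TOY_EXIT:
            return acc;
        default:
            return UINT64_MAX;  /* unknown uop */
        }
    }
}
```

In this sketch only TOY_LOAD_CONST64 reads the 64-bit field; the other cases skip that read.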

Issue: Speed up the Tier 2 interpreter #112287

gvanrossum · 2023-11-20T18:38:57Z

Based on off-line discussion I assume I can just merge this once the tests pass.

gvanrossum added 20 commits November 15, 2023 15:54

Add executor_cases.c.h dependency for ceval.o

a08909d

Clean up flags.py

4c2914b

Clean up parsing.py

053a0a2

Add back printing optimized uops

b838435

Hacky way to make FOR_ITER a viable uop

b28effa

_SPECIALIZE_UNPACK_SEQUENCE is TIER_ONE_ONLY

de8f199

NEWS

5c5d8bd

Double max trace length to 256

36e9ada

Move stuff around to suit the JIT branch

def1830

Merge remote-tracking branch 'origin/main' into for-iter-uop

ce19637

Clean up _FOR_ITER_TIER_TWO using DEOPT_IF(true)

7096818

Add test

5852105

Revert debug change to is_viable_uop()

4ac68b3

Avoid debug-only local variable 'word'

95b1a01

Revert changes to _EXIT_TRACE logic

4c72028

Merge remote-tracking branch 'origin/main' into for-iter-uop

14aea56

Merge branch 'main' into for-iter-uop

88c1701

Double max trace length to 512

3f0df1a

Faster uops: do away with 'operand' variable

b11b8ea

The code generator now generates code to access 'operand' as needed. (Alas, this needs to be changed for the JIT.) This speeds the spectral_norm benchmark up by 3-4% on my Intel Mac (but not using --enable-optimizations (PGO/LTO)).

Do the same for 'oparg' -- another 4% speedup

a2c4f00

gvanrossum changed the title ~~SPeed up Tier 2 (uop) interpreter abut 3%~~ Speed up Tier 2 (uop) interpreter about 3% Nov 20, 2023

Use CURRENT_OPARG() and CURRENT_OPERAND() macros

ddba5ed

This prepares us for the JIT template, hopefully.

gvanrossum changed the title ~~Speed up Tier 2 (uop) interpreter about 3%~~ gh-112287: Speed up Tier 2 (uop) interpreter about 3% Nov 20, 2023

bedevere-app bot mentioned this pull request Nov 20, 2023

Speed up the Tier 2 interpreter #112287

Closed

gvanrossum changed the title ~~gh-112287: Speed up Tier 2 (uop) interpreter about 3%~~ gh-112287: Speed up Tier 2 (uop) interpreter a little Nov 20, 2023

gvanrossum added 2 commits November 20, 2023 10:36

(Theoretically) improve 'Unknown uop' message

0fae4e6

Merge branch 'main' into faster-uops

9ea707b

gvanrossum marked this pull request as ready for review November 20, 2023 18:38

gvanrossum requested a review from markshannon as a code owner November 20, 2023 18:38

bedevere-app bot added the awaiting core review label Nov 20, 2023

gvanrossum added 2 commits November 20, 2023 10:42

Add news

7c1f6d4

Fix backticks (as always)

1d1f146

gvanrossum merged commit 8deb8bc into python:main Nov 20, 2023
29 checks passed

bedevere-app bot removed the awaiting core review label Nov 20, 2023

gvanrossum deleted the faster-uops branch November 20, 2023 19:25
