Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yul: Constant-complexity switch statement jump table targeting enums #12964

Closed
jaa2 opened this issue Apr 27, 2022 · 6 comments
Closed

Yul: Constant-complexity switch statement jump table targeting enums #12964

jaa2 opened this issue Apr 27, 2022 · 6 comments
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. stale The issue/PR was marked as stale because it has been open for too long.

Comments

@jaa2
Copy link
Contributor

jaa2 commented Apr 27, 2022

Abstract

The current switch statement control graph builder, for Yul to EVM specifically, is fine when switching between only a few values. To decide which path to take, a switch statement is turned into a (DUP + PUSH + EQ + PUSH + JUMPI) pattern for each case, and the default case is handled at the end. This is fine when handling up to three values, but the execution gas cost increases linearly with the number of cases. With four or more cases that are handled uniformly, a jump table performs better on average, and we can create a simple jump table that works specifically with switch statements containing all cases from 0...n.

This is what it looks like today for a switch statement with cases 0...4:

DUP1 PUSH1 0x0 EQ PUSH2 0x29C JUMPI
DUP1 PUSH1 0x1 EQ PUSH2 0x121 JUMPI
DUP1 PUSH1 0x2 EQ PUSH2 0x177 JUMPI
DUP1 PUSH1 0x3 EQ PUSH2 0x34D JUMPI
PUSH1 0x4 EQ PUSH2 0x134 JUMPI ...

Motivation

For enums in particular - that is, switch statements that contain cases ranging from 0 to some number, there is a rather straightforward jump table implementation that not only makes the execution gas cost constant, but with more than five cases will also decrease the deployment cost.

Rather than checking each value individually and then branching on every check, in the enum case, we should calculate the jump table offset using the enum value itself and jump there.
It would work something like this:

  • Check if the value to compare is greater than the maximum case value; if so, jump to the default case if it exists, or to the end of the switch statement otherwise
  • Otherwise, calculate the jump table offset: jumptable + 3 * value_to_compare
  • At the jump table offset, jump to the target destination, which is where the code for handling the case is

As it is currently, after the case is handled, jump to the end of the switch statement.

Ultimately, any work done for this could pave the way for more robust dispatchers in the future.

Specification

In the case of a switch statement handling all cases 0...n with n >= 4, turn the current dispatch method for switch statements into a jump table, which should look something like this:

DUP1
PUSH1 max       // Maximum case value for the enum
GT              // Check if the value is within the range of the enum
PUSH2 default_case
JUMPI           // Jump to default case if the value is out of bounds
DUP1
PUSH1 0x3
MUL
PUSH2 jumptable
ADD
JUMP            // Jump to a position in the jump table

jumptable:
JUMPDEST
PUSH2 0x29C
JUMP
JUMPDEST
PUSH2 0x121
JUMP
...

default_case:
...

My pen-and-paper execution gas cost comparison looks like this:

Current state:
Val     Gas
0       DUP + PUSH + EQ + PUSH + JUMPI = 3+3+3+3+10 = 22
1       DUP + PUSH + EQ + PUSH + JUMPI + (DUP + PUSH + EQ + PUSH + JUMPI) = 44
2       66
3       88
4       110

Enum jump table:
Val     Gas
0       DUP + PUSH + GT + PUSH + JUMPI + DUP + PUSH + MUL + PUSH + ADD + JUMP + JUMPDEST + PUSH + JUMP = 59
1       DUP + PUSH + GT + PUSH + JUMPI + DUP + PUSH + MUL + PUSH + ADD + JUMP + JUMPDEST + PUSH + JUMP = 59
2       59
3       59
4       59

Backwards Compatibility

This is a Yul to EVM optimization, so I don't anticipate any backward compatibility issues. That said, this proposal only targets switch statements with 0...n values, with n >= 4.

This could be an optional optimization to be added onto the Yul compiler.

Related: #12650

@chriseth
Copy link
Contributor

There are multiple ways to do this. You can store the jump targets in a "code table" like you did, but we can also store them in code and use codecopy to retrieve it. Furthermore, we could use a push32 to store 16 jump targets and use a shift/mask combination to retrieve the right one. All of them require a change to Assembly in order to store the jump targets at the right page after the assembly phase (the one where the acutal byte offsets of tags are determined).
Would be nice to experiment which way is the best, so your help is really appreciated!

@jaa2
Copy link
Contributor Author

jaa2 commented Apr 28, 2022

I like the idea of storing the jump targets in a single word. I was able to get the execution cost down to 48 gas using the push technique, taking advantage of the fact that a right shift of more than the range of the pushed value will result in zero. If the result of the shift is 0, it jumps to the default case instead.

Push technique (48 gas):

PUSH2 0xffff    // Byte mask
PUSH10 0x0034002e00280022001c   // Every two bytes is a destination
DUP3            // Get enum value
PUSH1 0x04
SHL             // Calculate shift amount: 16 * enum value
SHR
AND             // Apply two-byte mask
DUP1
ISZERO
PUSH default_case
JUMPI           // Jump to default case if the value is out of bounds
JUMP            // Jump to destination offset

The two assumptions it has to make are that (a) all offsets can be represented with two bytes and (b) the offset 0 is not a valid jump destination.

I have an idea for an even stronger optimization: if we can guarantee that the jump destinations of the individual cases are always greater than the offset of the default case, we can skip the out-of-bounds check entirely and instead just store the offsets from the default case location in the "jump table". I assume this might be difficult to guarantee in practice, but if we could do it, the execution gas cost would come all the way down to 35. As an example, if the default case is at offset 0x45, and case 0 is at 0x4d, we would store 0x08 in the jump table and just add 0x45 to it.

Offsets from the default case (35 gas):

PUSH2 0xffff    // Byte mask
PUSH10 0x0020001a0014000e0008   // Every two bytes is a destination
DUP3            // Get enum value
PUSH1 0x04
SHL             // Calculate shift amount: 16 * enum value
SHR
AND             // Apply two-byte mask
PUSH default_case
ADD
JUMP            // Jump to destination offset

@github-actions
Copy link

This issue has been marked as stale due to inactivity for the last 90 days.
It will be automatically closed in 7 days.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Mar 27, 2023
@jaa2
Copy link
Contributor Author

jaa2 commented Mar 29, 2023

As an update to this, I was told that because Ethereum's trajectory was eventually going to do away with dynamic jumps, so the team is hesitant to pursue this optimization. The implementation of the jump table in the compiler and optimizer can be found in #12978.

@github-actions github-actions bot removed the stale The issue/PR was marked as stale because it has been open for too long. label Mar 30, 2023
@github-actions
Copy link

This issue has been marked as stale due to inactivity for the last 90 days.
It will be automatically closed in 7 days.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jun 29, 2023
@github-actions
Copy link

github-actions bot commented Jul 6, 2023

Hi everyone! This issue has been automatically closed due to inactivity.
If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen.
However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.

@github-actions github-actions bot added the closed due inactivity The issue/PR was automatically closed due to inactivity. label Jul 6, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. stale The issue/PR was marked as stale because it has been open for too long.
Projects
None yet
Development

No branches or pull requests

4 participants