
stylus_benchmark #2827

Merged 58 commits into master on Dec 27, 2024
Conversation

@diegoximenes (Contributor) commented Dec 11, 2024

Resolves NIT-2757

This PR adds the stylus_benchmark binary.
It will mainly be used to fine-tune the ink prices that are charged today.

It deterministically creates .wat programs with lots of instructions to be benchmarked.

If this PR is approved, then generation of more .wat programs, benchmarking other instructions, will be added in a future PR.
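As a rough illustration of what deterministic .wat generation can look like (the function name and module shape here are hypothetical, not the PR's actual generate_wats.rs code), a generator can emit a module whose entry function repeats one instruction a fixed number of times, bracketed by the new benchmark markers:

```rust
// Hypothetical sketch: deterministically emit a .wat module that repeats
// i32.add `number_of_ops` times between start_benchmark and end_benchmark.
// Names and module layout are illustrative only.
fn generate_add_i32_wat(number_of_ops: usize) -> String {
    let mut wat = String::new();
    wat.push_str("(module\n");
    wat.push_str("  (import \"env\" \"start_benchmark\" (func $start_benchmark))\n");
    wat.push_str("  (import \"env\" \"end_benchmark\" (func $end_benchmark))\n");
    wat.push_str("  (func (export \"entry\") (result i32)\n");
    wat.push_str("    call $start_benchmark\n");
    wat.push_str("    i32.const 0\n");
    for _ in 0..number_of_ops {
        // each repetition contributes one operand push and one i32.add
        wat.push_str("    i32.const 1\n");
        wat.push_str("    i32.add\n");
    }
    wat.push_str("    call $end_benchmark))\n");
    wat
}
```

Because the generator takes no input other than the instruction count, the same call always produces byte-identical output, which keeps runs comparable.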

This PR introduces two new WASM instructions to jit: start_benchmark and end_benchmark.
Code blocks between the start_benchmark and end_benchmark instructions will be benchmarked.

stylus_benchmark uses jit as a library.
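The host-side half of the marker mechanism can be pictured like this (a hedged sketch of the idea, not the PR's actual arbutil/src/benchmark.rs): jit records a timestamp when the guest executes start_benchmark and accumulates the elapsed time on end_benchmark.

```rust
use std::time::{Duration, Instant};

// Sketch of benchmark-marker state the host could keep per execution.
// Field names are illustrative assumptions.
#[derive(Default)]
struct Benchmark {
    started_at: Option<Instant>, // set by start_benchmark
    elapsed: Duration,           // accumulated by end_benchmark
}

impl Benchmark {
    // Called when the guest executes start_benchmark.
    fn start(&mut self) {
        self.started_at = Some(Instant::now());
    }

    // Called when the guest executes end_benchmark; an end without a
    // matching start is treated as a no-op here.
    fn end(&mut self) {
        if let Some(start) = self.started_at.take() {
            self.elapsed += start.elapsed();
        }
    }
}
```

Keeping the state in a plain struct is what lets stylus_benchmark drive it by using jit as a library rather than shelling out to a separate process.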

@cla-bot added the s label (automatically added by the CLA bot if the creator of a PR is registered as having signed the CLA) Dec 11, 2024
@diegoximenes diegoximenes requested a review from tsahee December 11, 2024 19:27
Review threads (resolved) on:
- arbitrator/tools/stylus_benchmark/src/generate_wats.rs
- arbitrator/arbutil/src/benchmark.rs
- arbitrator/tools/stylus_benchmark/src/benchmark.rs

let exec = &mut WasmEnv::default();

let module = exec_program(
Contributor:

Is this code benchmarking only the JIT execution? I'm worried about the differences between the JIT and the interpreter, so it would be nice to benchmark non-JIT execution as well.

Contributor Author:

That is a good point.
By interpreter you mean execution through the prover's binary?

@tsahee, any takes on that?

Collaborator:

In the prover the bottleneck is not how long it takes the prover to execute but how many steps it takes - because that affects how long it will take to hash for BOLD.
And in stylus, that's a very different constraint.

@diegoximenes (Contributor Author) commented Dec 23, 2024

> general approach seems great. would like to get more reviews but I think it's already interesting to add a few benchmarks and start looking at results.

I intend to add more scenarios to be benchmarked in a future PR, together with a more detailed look at results.

But here is the output of the benchmarks implemented in this PR, run on an Apple M3 Pro:

No scenario specified, benchmarking all scenarios

Benchmarking add_i32
Run 0, duration: 120.735917ms, ink_spent: Ink(29298411152)
Run 1, duration: 108.255875ms, ink_spent: Ink(29298411152)
Run 2, duration: 110.482833ms, ink_spent: Ink(29298411152)
Run 3, duration: 95.918417ms, ink_spent: Ink(29298411152)
Run 4, duration: 95.604125ms, ink_spent: Ink(29298411152)
Run 5, duration: 95.353792ms, ink_spent: Ink(29298411152)
Run 6, duration: 95.531375ms, ink_spent: Ink(29298411152)
After discarding top and bottom runs:
avg_duration: 99.926139ms, avg_ink_spent_per_micro_second: 293201
Benchmarking xor_i32
Run 0, duration: 96.054583ms, ink_spent: Ink(29298411152)
Run 1, duration: 96.208625ms, ink_spent: Ink(29298411152)
Run 2, duration: 97.552791ms, ink_spent: Ink(29298411152)
Run 3, duration: 96.658ms, ink_spent: Ink(29298411152)
Run 4, duration: 96.164417ms, ink_spent: Ink(29298411152)
Run 5, duration: 96.1905ms, ink_spent: Ink(29298411152)
Run 6, duration: 96.149083ms, ink_spent: Ink(29298411152)
After discarding top and bottom runs:
avg_duration: 96.187847ms, avg_ink_spent_per_micro_second: 304598
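The printed averages are consistent with sorting the seven run durations and discarding the two highest and two lowest before averaging the middle three (the exact discard count is an inference from the numbers above, not confirmed by the tool's source):

```rust
// Reconstruction of the averaging step suggested by the output above:
// drop `discard_per_side` runs from each end of the sorted durations,
// then average what remains.
fn avg_after_discarding(mut durations_ms: Vec<f64>, discard_per_side: usize) -> f64 {
    durations_ms.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let kept = &durations_ms[discard_per_side..durations_ms.len() - discard_per_side];
    kept.iter().sum::<f64>() / kept.len() as f64
}
```

With the add_i32 durations and discard_per_side = 2, this yields 99.926139 ms, matching the reported avg_duration; the reported avg_ink_spent_per_micro_second is then approximately the constant ink_spent divided by that average duration in microseconds.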

@gligneul (Contributor) left a comment:

LGTM. I think measuring the interpreter (prover) performance is important as well, but you can leave this for a future PR.

exec_program(exec, module, calldata, config, evm_data, gas)
}

pub fn exec_program(
Collaborator:

exec_program vs start_program is very confusing.
Maybe launch_program_thread?

let args = Args::parse();

match args.scenario {
Some(scenario) => handle_scenario(scenario, args.output_wat_dir_path),
Collaborator:
should have some option (either the general --help or --scenario help) that lists all scenarios
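One way to satisfy this suggestion (a sketch, not the PR's code; the scenario names are just the two benchmarked above) is to keep a single canonical list of scenario names that the CLI can print when asked:

```rust
// Illustrative: a canonical scenario list the CLI can print on request.
// Only the two scenarios exercised in this PR are shown.
const SCENARIOS: &[&str] = &["add_i32", "xor_i32"];

fn scenario_help() -> String {
    format!("available scenarios: {}", SCENARIOS.join(", "))
}
```

With clap, deriving ValueEnum on the scenario enum would likely achieve the same effect, since clap lists an argument's possible values in --help automatically.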


#[derive(StructOpt)]
#[structopt(name = "jit-prover")]
pub struct Opts {
Collaborator:
does that really need to be in lib.rs? I don't see it used elsewhere?

@tsahee tsahee merged commit 2409196 into master Dec 27, 2024
17 checks passed
@tsahee tsahee deleted the stylus_bench branch December 27, 2024 00:51
Labels: design-approved, s

3 participants