
Dynamic number of benchmark repetitions #379

Open
ggwpez opened this issue Feb 16, 2023 · 3 comments

Comments

@ggwpez
Member

ggwpez commented Feb 16, 2023

Problem

The benchmark pallet command is invoked with --steps and --repeat arguments, which decide how often a benchmark is repeated.
This can cause issues for very short benchmarks, since the measurement window on the machine is too short to get consistent performance.
It especially applies to VMs, which can suffer CPU steal and other kinds of short-lived disturbances that do not impair average performance but ruin the bench results.

Proposed Solution

Instead of running each bench with a fixed number of repetitions, we could look to criterion for more statistically driven results.
For example, running the benchmark until the average results stabilize, and erroring otherwise with "inconsistent hardware detected".

Ideally we could also get nicer prints to directly compare with past results 😄

Benchmarking DAG 1k/1k: Collecting 100 samples in estimated 5.0000 s (133M iter)
DAG 1k/1k               time:   [37.627 ns 37.723 ns 37.834 ns]
                        change: [-29.326% -28.527% -27.771%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  7 (7.00%) high severe
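A minimal sketch of what such an adaptive repetition loop could look like, assuming a stopping rule based on the relative standard error of the mean (the function name `run_until_stable` and the `tolerance`/`min_repeats`/`max_repeats` parameters are hypothetical illustrations, not part of the benchmarking pallet or criterion):

```rust
use std::time::{Duration, Instant};

/// Error returned when the timings never stabilize, i.e. the
/// "inconsistent hardware detected" case from the proposal.
#[derive(Debug)]
pub struct InconsistentHardware;

/// Repeat `bench` until the relative standard error of the mean falls
/// below `tolerance`, or give up after `max_repeats` repetitions.
/// Returns the mean execution time on success.
pub fn run_until_stable<F: FnMut()>(
    mut bench: F,
    min_repeats: usize,
    max_repeats: usize,
    tolerance: f64,
) -> Result<Duration, InconsistentHardware> {
    let mut samples: Vec<f64> = Vec::new();

    for _ in 0..max_repeats {
        let start = Instant::now();
        bench();
        samples.push(start.elapsed().as_secs_f64());

        // Require a minimum sample count before judging stability.
        if samples.len() < min_repeats {
            continue;
        }
        let n = samples.len() as f64;
        let mean = samples.iter().sum::<f64>() / n;
        // Unbiased sample variance.
        let var = samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / (n - 1.0);
        // Relative standard error of the mean: shrinks as more
        // repetitions are collected, so transient noise averages out.
        let rse = (var / n).sqrt() / mean;
        if rse < tolerance {
            return Ok(Duration::from_secs_f64(mean));
        }
    }
    Err(InconsistentHardware)
}
```

Short-lived disturbances such as CPU steal inflate the variance, which keeps the loop running longer; if the noise never settles within `max_repeats`, the caller gets an error instead of a silently skewed weight.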

@bkchr
Member

bkchr commented Feb 16, 2023

For example running the benchmark until we get consistent average results and error otherwise with "inconsistent hardware detected".

When we have long running benchmarks, we will have problems :P

What I just thought about for this issue, could we maybe not use criterion? Or would that be too crazy?

@koute
Contributor

koute commented Feb 20, 2023

What I just thought about for this issue, could we maybe not use criterion? Or would that be too crazy?

Unfortunately criterion doesn't seem to expose APIs that are low-level enough for us to use as-is. However, it is relatively small (~10k lines of code, most of which we shouldn't need anyway) and liberally licensed, so we could just copy the relevant code verbatim into a separate helper crate and use that; it should be relatively straightforward from what I can see.

@ggwpez
Member Author

ggwpez commented Feb 22, 2023

What I just thought about for this issue, could we maybe not use criterion? Or would that be too crazy?

Criterion requires nightly, but we could maybe work around that. Anyway, for a first version I would just hack it together instead of doing a larger refactor to integrate Criterion.
