
Provide finer-grained statistics #22

Closed
jonhoo opened this issue Jan 22, 2018 · 12 comments
Labels
feature-request question Further information is requested

Comments

@jonhoo

jonhoo commented Jan 22, 2018

It'd be nice to be able to show other statistics such as the 95th percentile or median runtime. HdrHistogram can record this with relatively little overhead, and there's a pretty good official Rust implementation here (I'm one of the maintainers).
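To illustrate what such finer-grained statistics would look like, here is a minimal sketch using only the Python standard library (the `times` list is hypothetical; `statistics.quantiles(n=20)` returns 19 cut points, the last of which is the 95th percentile):

```python
import statistics

# Hypothetical per-run wall-clock times in seconds
times = [0.48, 0.50, 0.51, 0.49, 0.53, 0.47, 0.52, 0.50, 0.55, 0.46]

median = statistics.median(times)
# quantiles(n=20) yields 19 cut points; the last one is the 95th percentile
p95 = statistics.quantiles(times, n=20, method="inclusive")[-1]

print(f"median: {median:.3f} s")
print(f"p95:    {p95:.3f} s")
```

An HdrHistogram provides the same percentile queries without keeping every sample, which matters mainly once the sample count gets large.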

@josephglanville

p95/p99/p99.9 values of course would be great, but an option to output the entire histogram, both in a machine-consumable format (JSON/CSV) and/or graphically, would be amazing.

@jonhoo
Author

jonhoo commented Jan 22, 2018

@josephglanville You should be able to do all of that with an HdrHistogram. It does trade some accuracy for compactness, but the difference is unlikely to be noticeable, and the exact trade-offs are easy to control.

@sharkdp
Owner

sharkdp commented Jan 22, 2018

This sounds really interesting.

Is it possible to get reasonable estimates from HdrHistogram even if we only have very few samples (like 10 time measurements)?

@sharkdp sharkdp added feature-request question Further information is requested labels Jan 22, 2018
@jonhoo
Author

jonhoo commented Jan 22, 2018

HdrHistogram doesn't actually do any statistical analysis; it just gathers recorded values efficiently. Think of it as keeping a count per bin (e.g., "there have been 3 samples in the range 1-3ms, 2 in 3-5, etc."), but in a "smart" way such that you generally have good resolution, and the histogram doesn't grow without bound when the recorded values differ by orders of magnitude.
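The "smart" binning idea can be sketched as follows. This is a toy decimal version, not HdrHistogram's actual scheme (which works in base 2 with a configurable number of significant digits): bucket width grows with the magnitude of the value, so a fixed relative precision is kept with a bounded number of buckets.

```python
import math
from collections import Counter

def bucket(value_us: int, precision: int = 2) -> int:
    # Map a value to a bucket whose width grows with its magnitude,
    # keeping roughly `precision` significant decimal digits.
    # (A toy sketch of the idea; HdrHistogram's real scheme differs.)
    if value_us < 10 ** precision:
        return value_us  # small values get exact buckets
    scale = 10 ** (int(math.log10(value_us)) - precision + 1)
    return (value_us // scale) * scale

counts = Counter(bucket(v) for v in [7, 42, 1_234, 1_249, 987_654])
print(counts)  # 1_234 and 1_249 land in the same 1_200 bucket
```

Percentile queries then just walk the bucket counts in order, which is why the accuracy loss is bounded by the bucket width.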

@jonhoo
Author

jonhoo commented Jan 25, 2018

That in turn can give you whatever percentile you want, though obviously not with higher fidelity than the number of samples provides. There will be a small inaccuracy due to the binning, but it should be marginal.

@XANi

XANi commented Mar 26, 2018

@jonhoo The thing is that (I think) in most cases CLI testing will be used for fewer than 100,000 iterations, so you would be saving barely any memory while introducing inaccuracies. It might make sense if it were something that activates above a certain threshold, but even then, 1 million floats take only 8 MB with the "dumb" algorithm, so I doubt it is worth it.

@jonhoo
Author

jonhoo commented Mar 26, 2018

@XANi you're totally right that with few samples the advantages of using HdrHistogram aren't as compelling. That said, the inaccuracies will also likely be very small, and HdrHistogram does present a nice interface for getting percentile values. It would also be (marginally) faster than scanning all the recorded samples after the fact to compute the percentiles.

@psteinb
Contributor

psteinb commented Dec 3, 2018

I would personally love it if the --output-* options stored the timing results of each individual run instead of the summary statistics. That way, creating histograms and the like can be handled downstream by other tools/languages.

@sharkdp
Owner

sharkdp commented Dec 3, 2018

I would personally love it if the --output-* options stored the timing results of each individual run instead of the summary statistics. That way, creating histograms and the like can be handled downstream by other tools/languages.

This is exactly what the JSON-output option is for. Downstream tools will get all of the benchmarking information from that output, right?
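For reference, hyperfine's JSON export carries a top-level "results" list with a per-run "times" array, so a downstream tool only needs a few lines of standard-library Python to get at the raw samples. The JSON literal below is a hypothetical stand-in for such an export:

```python
import json

# A minimal stand-in for hyperfine's --export-json output: a top-level
# "results" list where each entry carries the per-run "times" in seconds.
raw = '''{"results": [{"command": "sleep 0.5",
                       "mean": 0.504, "stddev": 0.003,
                       "times": [0.501, 0.507, 0.503]}]}'''

data = json.loads(raw)
for result in data["results"]:
    times = result["times"]
    print(f'{result["command"]}: {len(times)} runs, '
          f'{min(times):.3f} s .. {max(times):.3f} s')
```

In practice one would `json.load()` the exported file instead of an inline string.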

@psteinb
Contributor

psteinb commented Dec 4, 2018 via email

@sharkdp
Owner

sharkdp commented Dec 5, 2018

Ok, good point. Let's clarify things in the --help text. I'd like to keep the current names for the command-line options.

@sharkdp
Owner

sharkdp commented Dec 12, 2018

In the spirit of the last few comments, I have created a folder with example Python scripts that can be used to further analyze benchmarks that have been performed with hyperfine:

https://github.com/sharkdp/hyperfine/tree/master/scripts

The advanced_statistics.py script, for example, shows the median, percentiles, and interquartile range:

> ./advanced_statistics.py result-gaussian-distribution-mean-0.5-stddev-0.05.json
Command './gaussian.py'
  mean:      0.507 s
  stddev:    0.050 s
  median:    0.509 s

  percentiles:
     P_05 .. P_95:    0.424 s .. 0.583 s
     P_25 .. P_75:    0.472 s .. 0.543 s  (IQR = 0.071 s)

With this, I'd like to close this ticket, but I'd be really glad to get some feedback on this. If someone has ideas on how to improve or extend these scripts, I'd also be happy to take pull requests.
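Statistics of the kind the script prints can also be reproduced directly from a list of per-run times with the standard library alone. A sketch, using hypothetical sample data (`statistics.quantiles(n=100)` returns 99 cut points, so index k-1 is the k-th percentile):

```python
import statistics

# Hypothetical per-run times; in practice these would come from the
# "times" array in hyperfine's JSON export.
times = [0.42, 0.45, 0.47, 0.49, 0.50, 0.51, 0.52, 0.54, 0.56, 0.58]

# 99 cut points; index k-1 is the k-th percentile
p = statistics.quantiles(times, n=100, method="inclusive")
p05, p25, p75, p95 = p[4], p[24], p[74], p[94]
iqr = p75 - p25

print(f"P_05 .. P_95: {p05:.3f} s .. {p95:.3f} s")
print(f"P_25 .. P_75: {p25:.3f} s .. {p75:.3f} s  (IQR = {iqr:.3f} s)")
```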

@sharkdp sharkdp closed this as completed Dec 12, 2018