Provide finer-grained statistics #22
Comments
p95/p99/p99.9 values would of course be great, but an option to output the entire histogram in a machine-consumable format (JSON/CSV) and/or graphically would be amazing. |
@josephglanville You should be able to do all of that with a HdrHistogram. Of course, it does pay some cost of accuracy in order to remain compact, but it's unlikely that users will notice. The exact trade-offs are also easy to control. |
This sounds really interesting. Is it possible to get reasonable estimates from HdrHistogram even if we only have very few samples (like 10 time measurements)? |
HdrHistogram doesn't actually do any statistical analysis, it just gathers up recorded values in an efficient way. Think of it as keeping a count per bin (e.g., "there have been 3 samples in the range 1-3ms, 2 in 3-5, etc."), but in a "smart" way such that you generally always have good resolution, and the histogram doesn't grow unbounded if you have large discrepancies in the values. |
Which then in turn can give you whatever percentile you want, though obviously not with higher fidelity than what the number of samples provides. There will be a small inaccuracy due to the binning, but it should be marginal. |
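To make the "count per bin" idea above concrete, here is a minimal sketch in Python. It is not HdrHistogram's actual algorithm or API; the class name `BinnedHistogram`, the `bins_per_doubling` parameter, and the logarithmic bin layout are assumptions chosen only for illustration. It shows how memory stays bounded by the number of bins rather than the number of samples, and how a percentile lookup is off by at most one bin width.

```python
import math
from collections import Counter


class BinnedHistogram:
    """Toy histogram with logarithmically spaced bins (not HdrHistogram itself)."""

    def __init__(self, bins_per_doubling=16):
        # More bins per doubling means finer resolution and more memory.
        self.bins_per_doubling = bins_per_doubling
        self.counts = Counter()
        self.total = 0

    def _index(self, value):
        # Values whose ratio is below 2**(1/bins_per_doubling) may share a bin.
        return math.floor(math.log2(value) * self.bins_per_doubling)

    def record(self, value):
        if value <= 0:
            raise ValueError("this sketch only supports positive values")
        self.counts[self._index(value)] += 1
        self.total += 1

    def value_at_percentile(self, percentile):
        """Return the upper edge of the bin that contains the given percentile."""
        if self.total == 0:
            raise ValueError("no samples recorded")
        target = max(1, math.ceil(percentile / 100 * self.total))
        seen = 0
        for index in sorted(self.counts):
            seen += self.counts[index]
            if seen >= target:
                return 2 ** ((index + 1) / self.bins_per_doubling)
        raise AssertionError("unreachable: counts always sum to total")


hist = BinnedHistogram()
for millis in range(1, 1001):  # hypothetical samples: 1 ms .. 1000 ms
    hist.record(millis)

p50 = hist.value_at_percentile(50)
p99 = hist.value_at_percentile(99)
```

With 16 bins per doubling, the reported value overshoots the true percentile by less than about 4.5%, and the number of occupied bins grows with the range of the values, not with the sample count.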
@jonhoo The thing is that (I think) in most cases CLI testing will be used for fewer than 100,000 iterations, so you will be saving barely any memory while introducing inaccuracies. It might make sense if it were something that activates above a certain threshold, but then 1 million floats take only 8 MB if you just use the "dumb" algorithm, so I doubt it is worth it. |
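For comparison, the "dumb" approach of keeping every sample and sorting can be sketched as below. The helper name `percentile`, the nearest-rank definition, and the sample values are assumptions for illustration; other percentile definitions interpolate between neighbours and give slightly different results.

```python
import math
from array import array


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if len(samples) == 0:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Ten hypothetical run times in seconds, stored as 8-byte doubles.
times = array("d", [0.104, 0.101, 0.109, 0.102, 0.131, 0.103, 0.100, 0.105, 0.102, 0.107])

median = percentile(times, 50)
p95 = percentile(times, 95)

# Memory needed to keep one million raw samples, in bytes.
bytes_for_million = times.itemsize * 1_000_000
```

With only ten samples, the 95th percentile is simply the slowest run, which illustrates that no data structure can report a percentile with more fidelity than the sample count provides.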
@XANi you're totally right that with few samples the advantages of using HdrHistogram aren't as compelling. That said, the inaccuracies will also likely be very small, and HdrHistogram does present a nice interface for getting percentile values. It would also be (marginally) faster than scanning all the recorded samples after the fact to compute the percentiles. |
I personally would love, if the |
This is exactly what the JSON-output option is for. Downstream tools will get all of the benchmarking information from that output, right? |
Yes, but #110 then I suggest renaming the flag to something like `--summ-as-csv`. From the interface I cannot infer that the JSON contains ALL the measurements and the CSV doesn't. The help text doesn't disclose this either:
```
--export-csv <FILE>
Export the timing results as CSV to the given FILE.
…--export-json <FILE>
Export the timing results as JSON to the given FILE.
```
|
Ok, good point. Let's clarify things in the |
In the spirit of the last few comments, I have created a folder with exemplary Python scripts that can be used to further analyze benchmarks that have been performed with https://github.com/sharkdp/hyperfine/tree/master/scripts The
With this, I'd like to close this ticket, but I'd be really glad to get some feedback on this. If someone has ideas on how to improve or extend these scripts, I'd also be happy to take pull requests. |
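As a rough idea of what such a downstream script could look like, the sketch below bins the raw run times from a JSON export into a fixed-width histogram. It assumes the export has a top-level `results` list whose entries carry a `command` string and a `times` list of seconds; that layout, the helper name `histogram`, and the inline sample data are assumptions made for illustration, not taken from the scripts in the linked folder.

```python
import json


def histogram(times, bins=10):
    """Return (edges, counts) for equal-width bins spanning min(times)..max(times)."""
    lo, hi = min(times), max(times)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all times are equal
    counts = [0] * bins
    for t in times:
        # Clamp so that the maximum value lands in the last bin.
        counts[min(int((t - lo) / width), bins - 1)] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Inline stand-in for the contents of a file written with --export-json.
export = json.loads(
    '{"results": [{"command": "sleep 0.1", "times": [0.11, 0.10, 0.12, 0.10, 0.13]}]}'
)

histograms = {
    result["command"]: histogram(result["times"], bins=3)
    for result in export["results"]
}

for command, (edges, counts) in histograms.items():
    print(command)
    for left, right, count in zip(edges, edges[1:], counts):
        print(f"  {left:.3f}-{right:.3f} s: {'#' * count}")
```

To run this against a real benchmark, replace the inline string with `json.load(open("results.json"))`, where `results.json` is a hypothetical file name passed to `--export-json`.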
It'd be nice to be able to show other statistics such as the 95th percentile or median runtime. HdrHistogram can record this with relatively little overhead, and there's a pretty good official Rust implementation here (I'm one of the maintainers).