Save results to file and compare #2

Closed
ionelmc opened this issue Oct 11, 2014 · 1 comment
ionelmc commented Oct 11, 2014

16:36 <hpk> ionelmc: what i'd need would be a way to compare against prior benchmarks
16:37 <ionelmc> hpk: that means you'd need a way to measure relative times against a "control benchmark"
16:37 <hpk> ionelmc: i.e. "py.test --bench-compare path-to-old-benchmarkresults"
16:37 <ionelmc> because machines don't have same perf
16:38 <hpk> yes, writing out results as well as comparing against them and getting errors when they slow down too much
16:38 <hpk> so probably a "py.test --bench-as-control-sample" and "py.test --bench"
16:38 <hpk> (conceptually)
16:38 <ionelmc> hpk: in other words, you'd be comparing percentages, not actual timings
16:39 <hpk> i'd be looking how much a benchmark deviates
16:39 <hpk> would report all deviations and say that >10% slowdown is an error or so
16:39 <ionelmc> nooo, i think you missed my point
16:40 <ionelmc> so, you have a "control test"
16:40 <ionelmc> that does something, whatever, something simple
16:40 <hpk> what i said was not directly related to what you said before -- more what i think would be useful for myself
16:40 <ionelmc> and the other tests compare to that
16:40 <ionelmc> eg: 50% slower than the "control bench"
16:41 <hpk> might be useful for some people, not for me, i guess
16:41 <ionelmc> and in the file you only save percentages (the values relative to the control test)
16:41 <ionelmc> otherwise saving to a file is not practical
16:41 <ionelmc> i'm thinking travis
16:41 <hpk> ah, now i get it
16:41 <ionelmc> i run it locally but travis is gonna be very unpredictable
16:41 <ionelmc> even between runs
16:41 <hpk> i don't know if this could work
16:42 <hpk> but it's an interesting idea
16:42 <ionelmc> so the only reliable thing to compare against is a "control test" that is run in the same session
16:42 <hpk> question is if you can do a control test that makes sense
16:42 <ionelmc> eg, i wanna benchmark instance creation of some objects
16:42 <hpk> and where the relation "realtest versus controltest" is stable
16:42 <hpk> across machines and interpreters
16:43 <hpk> i somehow doubt it
16:43 <ionelmc> and the control is "object()"
16:43 <ionelmc> of course some things will be slower on some interpreters
16:43 <hpk> you need to try and run such things on multiple machines, including travis, to find out if it's viable i guess
16:44 <ionelmc> i think it's best to just have a nice way to look at historical data
16:44 <ionelmc> eg, a separate service that records timings
16:44 <hpk> what i proposed should work without having to figure out control tests but you need a somewhat repeatable environment
16:44 <ionelmc> like coveralls but for benchmarks
16:45 <ionelmc> https://coveralls.io/
16:45 <ionelmc> coveralls integrates well into travis
16:46 <ionelmc> hpk: well, repeatable environments are a luxury
16:47 <ionelmc> with all the crazy virtualization and even crazier cpu scaling (intel turboboost) it's fairly hard
16:47 <hpk> ionelmc: yes -- the other question is if it's possible to count CPU ticks used for a computation rather than time
16:48 <hpk> ionelmc: but it's even hard within the lifetime of one process
16:48 <ionelmc> hmmmm
16:48 <ionelmc> that should work
16:48 <ionelmc> you only need to have the same cpu then
16:49 <hpk> on travis i guess during 60 seconds of a test run you might experience different speeds
16:49 <hpk> so doing a control run first, then benchmarks might or might not be accurate enough
16:49 <ionelmc> wildly different i'd add :)
17:05 <hpk> for me it all becomes only useful with the comparison feature, but then it would be very useful
17:05 <hpk> (i am currently doing some benchmarking but manually, and i'd love to have a more systematic approach)
17:06 <ionelmc> hpk: so you're basically assuming consistent environments, like, you're not going to use it on travis
17:06 <hpk> yes
17:06 <ionelmc> only locally, to get feedback on perf regression
17:07 <hpk> yes, so if pytest had that, prior to pytest-2.7 we would check it didn't regress
17:07 <hpk> or even just for a PR
17:07 <ionelmc> yeah, sounds very useful
17:08 <hpk> and then integrate the web code behind http://speed.pypy.org/ :)
17:12 <hpk> i'd be fine with just terminal reporting, already, though :)
17:16 <ionelmc> ok, what would be an error situation?
17:17 <ionelmc> compare against minimums
17:17 <ionelmc> what's a good default for the error threshold?
17:30 <hpk> ionelmc: no clue, 10% maybe?
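
A rough sketch of the relative-comparison idea from the log above (all helper names here are made up for illustration, and the 10% threshold is just the value floated at the end of the discussion): every benchmark is normalized against a control callable timed in the same session, only the resulting ratios get saved, and a later run fails if any ratio regressed by more than the threshold.

    import json
    import time
    from pathlib import Path

    SLOWDOWN_THRESHOLD = 0.10  # >10% slowdown relative to the baseline counts as an error


    def time_call(func, rounds=1000):
        # time.perf_counter() measures wall time; time.process_time() would count
        # CPU time only, which is closer to the "CPU ticks" idea raised in the log.
        start = time.perf_counter()
        for _ in range(rounds):
            func()
        return (time.perf_counter() - start) / rounds


    def relative_timings(benchmarks, control=object):
        # Normalize every benchmark against a control callable timed in the same
        # session, so only ratios are saved and compared, never absolute timings.
        control_time = time_call(control)
        return {name: time_call(func) / control_time
                for name, func in benchmarks.items()}


    def compare_to_baseline(current, baseline_path=Path(".benchmarks/baseline.json")):
        # Report every benchmark whose ratio regressed past the threshold.
        baseline = json.loads(baseline_path.read_text())
        return [(name, baseline[name], ratio)
                for name, ratio in current.items()
                if name in baseline
                and (ratio - baseline[name]) / baseline[name] > SLOWDOWN_THRESHOLD]

Whether this is viable hinges on exactly the open question from the log: the "real test versus control test" ratio has to stay stable across machines and interpreters, which is not a given.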
ionelmc commented Aug 7, 2015

Planned options:

  --benchmark-save=[NAME]
                        Save the current run into 'STORAGE-PATH/counter-
                        NAME.json'. Default: 'a79f3922132a2aa6db96862909b0e5d9
                        836e355d_20150807_032757'
  --benchmark-autosave  Autosave the current run into 'STORAGE-PATH/counter-
                        commit_id.json'.
  --benchmark-compare=[NUM]
                        Compare the current run against run NUM or the latest
                        saved run if unspecified.
  --benchmark-storage=STORAGE-PATH
                        Specify a different path to store the runs (when
                        --benchmark-save or --benchmark-autosave are used).
                        Default: './.benchmarks/'
  --benchmark-histogram=[FILENAME-PREFIX]
                        Plot graphs of min/max/avg/stddev over time in
                        FILENAME-PREFIX-test_name.svg. Default:
                        'benchmark_20150807_032757'
  --benchmark-json=PATH
                        Dump a JSON report into PATH.
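
As a usage sketch (the saved-run number and paths below are made up; behavior is as described by the planned options above), a local regression-check workflow could look like:

    # record a baseline run under ./.benchmarks/
    py.test tests/ --benchmark-save=baseline

    # later, compare the current run against the latest saved one
    py.test tests/ --benchmark-compare

    # or compare against a specific saved run and also dump a JSON report
    py.test tests/ --benchmark-compare=0001 --benchmark-json=report.json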

ionelmc self-assigned this Aug 7, 2015