PERF: Add benchmarking? #1257

Closed
max-sixty opened this issue Feb 9, 2017 · 9 comments

Comments

@max-sixty
Collaborator

Because xarray is all Python and generally doesn't do much compute itself (it marshals other libraries to do that), benchmarking hasn't been that important so far.

IIRC most of the performance issues have arisen where xarray builds on (arguably) shaky foundations, like PeriodIndex.

Though as we mature, is it worth adding some benchmarks?

If so, what's a good way to do this? Pandas uses asv successfully. I don't have experience with https://github.com/ionelmc/pytest-benchmark but that could be a lower cost way of getting started. Any others?
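
For a sense of what the lower-cost route could look like, here's roughly what a minimal pytest-benchmark test might be (the dataset and test name are made up for illustration, not taken from our test suite):

```python
import numpy as np
import xarray as xr


def test_mean_over_time(benchmark):
    # `benchmark` is the fixture pytest-benchmark provides; it runs the
    # callable repeatedly and collects timing statistics for the report.
    ds = xr.Dataset({"temp": (("time", "x"), np.random.rand(1000, 100))})
    result = benchmark(lambda: ds["temp"].mean(dim="time"))
    assert result.shape == (100,)
```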

@shoyer
Member

shoyer commented Feb 9, 2017

Yes, some sort of automated benchmarking could be valuable, especially for noticing and fixing regressions. I've done occasional benchmarks before to optimize bottlenecks (e.g., class constructors) but it's all been ad-hoc stuff with %timeit in IPython.

ASV seems like a pretty sane way to do this. pytest-benchmark can trigger test failures if performance falls below some set level, but I suspect timings are too stochastic across runs and machines for fixed thresholds to be reliable.

@max-sixty
Collaborator Author

Yes, ASV is good. I'm surprised there isn't something you can point at an existing test suite and ask to just "robustly time these tests", so it could bolt on without writing new code.
Although maybe the overlap between test code and benchmark code isn't as great as I imagine.

@shoyer
Member

shoyer commented Feb 10, 2017

One issue is that unit tests are often not good benchmarks. Ideal unit tests are as fast as possible, whereas ideal benchmarks should be run on more typical inputs, which may be much slower.

@rabernat
Contributor

Another 👍 for benchmarking. Especially as we start to get deep into integrating dask.distributed, having robust performance benchmarks will be very useful. One challenge is where to deploy the benchmarks. TravisCI might not be ideal, since performance can vary depending on competition from other virtual machines on the same system.

@pwolfram
Contributor

We would also benefit from this specifically for #1198 👍

@jhamman
Member

jhamman commented Jun 12, 2017

Is anyone interested in working on this with me over the next few months? Given the number of issues we've been seeing, I'd like to see this come together this summer. I think ASV is the natural starting point.

@rabernat
Contributor

I am very interested. I have been doing a lot of benchmarking already wrt dask.distributed on my local cluster, focusing on performance with multi-terabyte datasets. At this scale, certain operations emerge as performance bottlenecks (e.g. index alignment of multi-file netcdf datasets, #1385).

I think this should probably be done in AWS or Google Cloud. That way we can establish a consistent test environment for benchmarking. I might be able to pay for that (especially if our proposal gets funded)!

@jhamman
Member

jhamman commented Jun 13, 2017

@rabernat - great. I've set up an ASV project and am in the process of teaching myself how it all works. I'm just playing with some simple arithmetic benchmarks for now (sketched below), but of course most of our interest will be in the I/O and dask arenas.
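
For the curious, a sketch of what one of those arithmetic benchmarks might look like under asv (class and method names are just illustrative, not what's in my branch):

```python
import numpy as np
import xarray as xr


class Arithmetic:
    # asv calls setup() before timing; only the time_* methods are measured.
    def setup(self):
        self.ds = xr.Dataset(
            {"var": (("time", "x"), np.random.rand(10000, 100))}
        )

    def time_scalar_add(self):
        self.ds + 1

    def time_reduce_mean(self):
        self.ds["var"].mean(dim="time")
```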

I'm wondering if @mrocklin has seen ASV used with any dask projects. We'll just need to make sure we choose the appropriate timer when profiling dask functions.
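
Something like this is what I have in mind for the dask case. It assumes asv honors a per-benchmark timer attribute (worth double-checking in the docs); the idea is to force a wall-clock timer so time spent in dask worker threads isn't dropped, and to call .compute() so the graph actually executes inside the timed region:

```python
import timeit

import numpy as np
import xarray as xr


class DaskReduce:
    # Assumption: asv respects a per-benchmark `timer` attribute. Wall-clock
    # timing counts work done in dask worker threads, which a CPU-time timer
    # for the main process would miss.
    timer = timeit.default_timer

    def setup(self):
        self.ds = xr.Dataset(
            {"var": (("time", "x"), np.random.rand(10000, 1000))}
        ).chunk({"time": 1000})

    def time_mean(self):
        # .compute() forces evaluation so the dask graph runs in the timed region.
        self.ds["var"].mean(dim="time").compute()
```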

@mrocklin
Contributor

@TomAugspurger has done some ASV work with Dask itself.
