-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connect ASV benchmarking to CI #496
Conversation
By this, do you mean squashing any excessive commits? Using myself as a bad example, I made the mistake of not squashing a series of small commits before merging a PR (see here). Would ASV run a benchmark on each of these commits? |
With the way we're currently set up in this PR, yes. The (base) ➜ uxarray git:(main) git rev-list v2023.10.1..main
97fc1a2f443d9ffc997460530f6ff3ed7ae296c3
505209bd783920a6fcc6b4ab0fefd1a4f3df8873
e7c85ac19e9b256b1b63713ead2bff1db362e220
fddb407037f953fc1b8d3d6921eb0f40b3fb6b29
2bd81d47ef6827ec11f7f6d4104678a783b6c7d3
01a34ffa2ff4e9c027871fee72aa65d608ae63ff
9e1c9605845449ecc09aaa3e0a164ac564ed1502
feb0985ba01498c39411ac92fe08408ac7b15c9d
5bf042b3f4fef84f7641a47194cc6fe6057135ea
c72b6e9d4958a9ba159955941abfa5fa039979ff
487ce1f957521d90e3de0636f24fe42a8b7c8056
84bc776b8892ddc3b2d58b84ea3e9dc9d1cd325a
28fcf1dfdf93c861f5fd185d4c9748d9e1973607
6cf5cde795936b2ebd6ff876998fd1593f533218
802749dcebe69b06084a54ee7bf487895cce1ea4
8394dd5e1a2240f8867eea82a52c78de04d9faa6
01e13ca4d921140c1df7aafcb8d3fe710ce7360b
f4c57fe0928fa2927fd2a12b04b12927f53a2cb1
b43157dc39969530058a9e8dfd0d9fa1418f7bda
f046cea7b9bb48e5d8fe645999058cfdd89ec076
d28f3b30e7606063dfb076adea86b66d701e0329
69308cdfe539b967b7602664de86f597c0304abe
da03e5a440671b33f2764acc1b4772eda266f836
5133a62e637b6f5feada702c5ef825f58b93ca30
9b50bdf30eed249b9af2bd931863f4330f78f515
943c9d8d3238ad1e27ff51f13cd04f1e8303eed1
7d444f7f509cd8cb2caa09ed51cf14b682876743
535fbb7e6a054d994cdd88985ce3029dbfd91fde
59d6fd1627439f6d0fcd7d5bb7a77771f0e61c47
f38e4c9fe1acd943c87bd14849c9922097cd8fb0
8baf53dd53360daadd21669ef27d185906605f76
147e2314f131aa53ab426f0cbe4bf6a63658935a
3ed9a94df5873228b966539e71326b484ebcfe45
a00aa1062adb9dbf667299c26cb58bdba8c3d522
b23d33120cff9aedb5992bcbda5fc5fe46060682
63e7d5f06d406d69b55f5c6bda222064d7c3a64b
828f29654927feeefdef2d809fabe1ce60b4cc22
dbbf2362f5c78dbaaab29bbf7bcdbce927d0a003
6bb7416728b4c5d6fcb5d18a8e33a4dff729c0f3
e6d2a33be122553a9fb7026f6522ea6e7fefc608
f00e182f0aaffbe9631d4ec070e5a5ed19d1bea1
7a1bccfd33bc0bf2a1da95e19b4cac3f90a3fb1c |
Got it. What about only running benchmarks on commits that originated after merging a pull request? Could that possibly be one solution? |
(base) ➜ uxarray git:(main) git rev-list main --first-parent v2023.10.1..main
9158ae47c46071d8c1a5e5e5d3f07c66279dfe22
97fc1a2f443d9ffc997460530f6ff3ed7ae296c3
505209bd783920a6fcc6b4ab0fefd1a4f3df8873
e7c85ac19e9b256b1b63713ead2bff1db362e220
fddb407037f953fc1b8d3d6921eb0f40b3fb6b29
8baf53dd53360daadd21669ef27d185906605f76
147e2314f131aa53ab426f0cbe4bf6a63658935a
3ed9a94df5873228b966539e71326b484ebcfe45
63e7d5f06d406d69b55f5c6bda222064d7c3a64b
EDIT: So, I did manage to get it to run with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm also a bit worried about the "failure" state being confusing, but ok with merging for now and seeing how that plays out in practice
Not so much about this specific PR, but the set-up here in general:
- This will likely run on a lot of commits. We'll need to take that into account when writing benchmarks and perhaps create another class of benchmarks to run elsewhere.
- Given that we may be running on
differentshared hardware (and not the most controlled environment), it'll be interesting to see how much variability we have in run time for various benchmarks.
I think these are just things we figure out as we go though.
I guess one more question before I approve this... If we did end up refactoring how this runs and wanted to rerun (say from a different release) and start over on the archive repo, how much of a problem would that be?
Thats good to hear then! |
Adding to what @kafitzgerald mentioned, I'm worried on what type of results the default actions instance would produce, since I don't belive they have substantial hardware. For an "automatic" action that we run, would it not be worth considering a fixed hardware (i.e. known CPU, Memory) instance on something like AWS (or even casper, if that's possible) that runs the bencharmk, instead of the default action instance? However, ASV does support keeping track of the machine that ran the bencharmk (see numpy), so we should be able to distinguish between these automatic ones and manually comitted benchmarks. EDIT: This could also cause issues if, when running on Github actions, we get an instance that has different hardware than the previous run, which would lead to adding another "machine" to our list. |
Fair enough, I think this won't actually happen that often, if ever, in the regular use of this action.
Yeah, totally. Any weighty benchmarks should probably be filtered out of this regular CI
We can reconfigure this action without needing to delete all our archives (though we could also do that too). The results are only grouped by machine, which we're manually setting. If we wanted to switch to start from an earlier release, we could add to the existing archive by altering the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the same page now, this looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
PR Summary
Closes #472
To see this in action, see my fork.
In this PR, I made a couple changes from the proof of concept runs above:
v2022.10.0
tag forward. This will take a while to run the first time, but should only produce long CI runs when a new benchmark is added in the future.Here are a couple of design decisions that might warrant further discussion:
v2022.10.0
too long or too short?PR Checklist
General
closes #XXX
to the PR description where XXX is the number of the issue.docs/release-notes.rst
in a relevant section for the next unreleased release. Possible sections include: Documentation, New Features, Bug Fixes, Internal Changes, Breaking Changes/Deprecatedprecommit
. To set up on your local, runpre-commit install
from the top level of the repository. To manually run pre-commits, usepre-commit run --all-files
and re-add any changed files before committing again and pushing.