CI: add benchmark test suite #8538
Conversation
rouault commented on Oct 10, 2023 (edited):
- Uses pytest-benchmark
- Adds an autotest/benchmark module (for now with only a few tests for the GTiff and GPKG drivers)
- Adds a new CI benchmark configuration, which builds the current tree and a reference tree, and runs the benchmark test suite in both (a sketch of how such a comparison can be driven with pytest-benchmark follows below)
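For context, pytest-benchmark can save the results of one run and compare a later run against them, failing when a chosen statistic regresses beyond a threshold. The following is a minimal sketch of how such a reference-vs-current comparison could be driven; the flag names are pytest-benchmark's, but the paths, tolerance, and the way each build is selected are illustrative and may differ from the actual CI configuration and benchmark/test.sh.

    # Sketch only: assumes the environment (PATH, PYTHONPATH, ...) has been
    # pointed at the desired GDAL build before each invocation.

    # 1. Run the suite against the reference build and save the results
    #    under .benchmarks/ (saved as <counter>_ref.json).
    pytest autotest/benchmark --benchmark-save=ref

    # 2. Run the suite against the current tree and fail if a test regresses
    #    by more than 5% (illustrative tolerance) relative to saved run 0001.
    pytest autotest/benchmark --benchmark-compare=0001 --benchmark-compare-fail=mean:5%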
Force-pushed from 86c21dd to ddb00d5
CC @dbaston
Force-pushed from 1b8bd41 to f6466d9
    return filename


def test_ogr_gpkg_spatial_index(source_file):
I think bringing in the benchmark fixture explicitly would let you more clearly exclude the setup cost. It's not clear to me if the runtime of source_file is part of the timing or not.
"It's not clear to me if the runtime of source_file is part of the timing or not."

It is not. Comparing timings with the somewhat more verbose version below, source_file runs once, as expected:
def test_ogr_gpkg_spatial_index(source_file, benchmark):
    @benchmark
    def do():
        ds = ogr.Open(source_file)
        lyr = ds.GetLayer(0)
        lyr.SetSpatialFilterRect(1000, 1000, 10000, 10000)
        count = 0
        for f in lyr:
            count += 1
        assert count == 10000 - 1000 + 1
@@ -269,6 +276,13 @@ jobs:
    TEST_CMD="ctest -V -j $(nproc)"
  fi

  if test "${{ matrix.id }}" = "benchmarks"; then
Move into test.sh?
I don't think this can be done under Docker. Anyway, it doesn't look like this branch is even hit (perhaps because we already run in a container/VM that doesn't expose the turbo boost setting).
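For context, the step being discussed is about making CPU clock behavior more deterministic before timing. A hedged sketch of what disabling turbo boost can look like on a bare-metal Intel runner; the actual workflow step may use a different mechanism, and as noted above the knob is typically not exposed inside a container/VM.

    # Sketch: disable Intel turbo boost for more repeatable timings.
    # Requires root and the intel_pstate driver; skipped when the sysfs
    # file is not exposed (e.g. inside a container or VM).
    if [ -e /sys/devices/system/cpu/intel_pstate/no_turbo ]; then
        echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
    fi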
d35198a fails as expected (https://github.com/OSGeo/gdal/actions/runs/6473107275/job/17575133870?pr=8538).
Changing to use min statistics, as suggested by @DFEvans on the mailing list.
@dbaston I've written more tests, and it seems we now hit unreliable timings. Initial failed run in https://github.com/OSGeo/gdal/actions/runs/6482985166/job/17603597567 using the naive procedure (running the reference test suite, and then the one of the PR).

I've attempted to modify the test procedure (cf. the updated benchmark/test.sh) to do more runs to try to mitigate this, but that does not seem to be sufficient. See https://github.com/OSGeo/gdal/actions/runs/6485533623/job/17611807035?pr=8538

Any ideas (other than significantly increasing the tolerance threshold to, say, 30%, which would hopefully be robust but would not catch small perf regressions), or are we hitting a dead end?
More stable timings with --benchmark-timer=time.process_time in https://github.com/OSGeo/gdal/actions/runs/6486353872/job/17614377148?pr=8538.
A retry failed: https://github.com/OSGeo/gdal/actions/runs/6486353872/job/17617138599?pr=8538. Up to 16% slowdown on one test. Bumping tolerance to 20%...
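Putting the pieces from this thread together (process-time timer, min statistic, 20% tolerance), the comparison step converges on an invocation along these lines. This is only a sketch under those assumptions; the run id and paths are illustrative, and the authoritative version lives in benchmark/test.sh.

    # Sketch: compare the current run against a previously saved reference run,
    # timing CPU time rather than wall-clock time, and failing only when the
    # per-test minimum regresses by more than 20%.
    pytest autotest/benchmark \
        --benchmark-timer=time.process_time \
        --benchmark-compare=0001 \
        --benchmark-compare-fail=min:20%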
Merging this.