Performance Benchmarks / Integration tests #1954
I ran a Google BigQuery query for GitHub repos that have a pylintrc file. I'll probably use these as a starting point for integration tests. Here's the top 25:
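The query itself isn't reproduced in the comment, but a rough sketch of how such a list could be regenerated from BigQuery's public GitHub dataset might look like the following; the table names come from the `bigquery-public-data.github_repos` dataset, and ordering by `watch_count` from `sample_repos` as a stand-in for "top 25" is an assumption:

```python
# Sketch only: the original query isn't shown in the issue. This assumes the
# public `bigquery-public-data.github_repos` dataset and uses watch_count
# from its sample_repos table as a rough popularity ranking.
from google.cloud import bigquery

QUERY = """
SELECT f.repo_name, ANY_VALUE(s.watch_count) AS watch_count
FROM `bigquery-public-data.github_repos.files` AS f
JOIN `bigquery-public-data.github_repos.sample_repos` AS s
  ON f.repo_name = s.repo_name
WHERE f.path IN ('.pylintrc', 'pylintrc')
GROUP BY f.repo_name
ORDER BY watch_count DESC
LIMIT 25
"""

client = bigquery.Client()  # requires Google Cloud credentials
for row in client.query(QUERY).result():
    print(row.repo_name, row.watch_count)
```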
Thanks for tackling this, Bryce! Would it be possible to have a script that reproduces the benchmark dataset? That way we and contributors would have a de facto way of retrieving the dataset, running the integration tests, and getting the performance results out of it.
Yeah, I'll be adding some docs to https://github.com/brycepg/pylint-corpus. I've decided to just download the source of each Python repository into the corpus repository, so I don't have to worry about performance changing across new versions of the same repository.
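For illustration, a minimal fetch script along those lines could look like this; the repository/tag pins below are placeholders, not the actual contents of pylint-corpus:

```python
"""Sketch of a corpus-fetching script; the pinned repos/tags are examples only."""
import tarfile
import urllib.request
from pathlib import Path

# Placeholder (repo, tag) pins -- a real corpus would list its own projects.
PINS = [
    ("pallets/flask", "2.3.3"),
    ("pallets/click", "8.1.7"),
]

CORPUS = Path("corpus")


def fetch(repo: str, tag: str) -> None:
    """Download the tagged source tarball from GitHub and unpack it."""
    url = f"https://github.com/{repo}/archive/refs/tags/{tag}.tar.gz"
    archive = CORPUS / f"{repo.replace('/', '__')}-{tag}.tar.gz"
    archive.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(CORPUS)


if __name__ == "__main__":
    for repo, tag in PINS:
        fetch(repo, tag)
```

Pinning a tag (rather than tracking a branch) is what keeps timings comparable across corpus refreshes.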
Some other notes: I tried to make pylint work on some really big repositories like pandas and numpy, but pylint never completed, even overnight.
Wow, that is crazy! I wonder what graphs we could extract from pandas and numpy; maybe there are other pathological cases that we might need to look into.
The following is edited content taken from PR #3473. There are three key parts to achieving this properly:

**Comparable benchmarks**

In PR #3473 we use pytest-benchmark to benchmark the code and we suggest a use case for it. This creates .json files that can be compared to .json files created from other runs, which is very useful. However, as those familiar with performance regression testing will know, without ensuring that the two runs being compared were made under highly similar conditions, we are very likely to get false-positive/negative results. So some thought needs to be given to how to make use of the .json files generated by pytest-benchmark. Ideas on a postcard (one rough sketch follows below).

**Side-by-side comparisons**

For example, to ensure a fair comparison, we might want to benchmark the target merge branch and the WIP branch within the same tox run, so we know we got comparable profiling information from two runs on the exact same machine/instance/load. There may be a better way to do this.

**Prev runs via cloud storage**

Whilst side-by-side comparisons would likely be the most reliable solution, it would also be useful to record a history of performance so we can spot trends, share performance heatmaps, logs and so on. One very quick solution here is S3, because Travis supports AWS S3 storage out of the box. I already have code to upload benchmark data and heatmaps to an S3 instance; PyCQA would have to do the same to support this PR, but there would be considerations around costs, monitoring and so on.

**Baselines (validation of thinking needed)**

To make this useful for any performance work we need to establish robust benchmark baselines. There's one attached to #3473, but we will need more for each of the expensive parts of the system. To explain why we need to think about this carefully: whilst creating PR #3458 (which really enables full support of multiprocessing) and its dependent work, discovering some hidden problems along the way, and reading the various related issues around it (e.g. #1954), it became clear that improving performance in pylint is not obvious to someone not intimately familiar with pylint, checkers, pylint configuration and pylint's dependencies (astroid).
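As one rough idea for the "how to make use of the .json files" question above, the sketch below loads two pytest-benchmark result files and flags large shifts in mean time; the file paths and the 10% threshold are placeholders:

```python
"""Compare mean timings between two pytest-benchmark JSON files (paths are hypothetical)."""
import json


def mean_times(path):
    """Map benchmark name -> mean runtime (seconds) from a pytest-benchmark JSON file."""
    with open(path) as fh:
        data = json.load(fh)
    return {b["name"]: b["stats"]["mean"] for b in data["benchmarks"]}


baseline = mean_times(".benchmarks/0001_main.json")
candidate = mean_times(".benchmarks/0002_wip.json")

for name in sorted(baseline.keys() & candidate.keys()):
    ratio = candidate[name] / baseline[name]
    flag = "  <-- possible regression" if ratio > 1.10 else ""
    print(f"{name}: {ratio:.2f}x baseline{flag}")
```

pytest-benchmark can also do this itself via `--benchmark-save`, `--benchmark-compare` and `--benchmark-compare-fail=mean:10%`, but as noted above, the comparison is only trustworthy when both runs happened under comparable conditions (for example within the same tox run on the same machine).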
✨ This is an old work account. Please reference @brandonchinn178 for all future communication ✨
This also affects repos importing pandas + numpy. #5835 is one minimal example of this, but more generally, running pylint over code that imports these libraries is slow. With prospector it's currently ~40s; after commenting out all pandas + numpy + tqdm imports it's down to 30s, and commenting out scipy + matplotlib + seaborn as well brings it down to 15s.
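A minimal way to observe the effect, assuming pylint, pandas and numpy are installed (file contents and timings here are purely illustrative):

```python
# Rough timing harness: lint the same trivial program with and without
# heavy third-party imports. Numbers will vary by machine and versions.
import subprocess
import sys
import tempfile
import time
from pathlib import Path

SOURCES = {
    "with_imports.py": "import pandas\nimport numpy\n\nprint('hello')\n",
    "without_imports.py": "print('hello')\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, code in SOURCES.items():
        path = Path(tmp) / name
        path.write_text(code)
        start = time.perf_counter()
        subprocess.run([sys.executable, "-m", "pylint", str(path)], check=False)
        print(f"{name}: {time.perf_counter() - start:.1f}s")
```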
Pylint needs a collection of performance integration tests (see pylint-dev/astroid#519) to help further optimization efforts. They don't need to be run as part of the standard CI or with a plain `pytest` invocation, since they will be slow, but we should have them to make sure optimization efforts aren't over-optimizing for specific projects, and so we know which pull requests will slow down astroid/pylint.

I'm thinking that we should pick a specific tag of a specific project, and then set up `pytest-benchmark` so that its setup downloads/extracts the correct zip and runs pylint against those files on a developer's machine, timing how long pylint takes to execute and recording whether pylint raises any internal errors. These tests might additionally be used to look for consistency in output and changes in pylint output messages.
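As a sketch of what one such test could look like with pytest-benchmark: the pinned project/tag, the package path inside the archive, and the `exit=False` keyword (older pylint releases spell it `do_exit`) are assumptions rather than a settled design.

```python
"""Hypothetical corpus integration benchmark; the pinned repo/tag are placeholders."""
import io
import urllib.request
import zipfile

import pytest
from pylint.lint import Run

# Placeholder pin; a real corpus would define one of these per project.
REPO_ZIP = "https://github.com/pallets/flask/archive/refs/tags/2.3.3.zip"
PACKAGE_SUBDIR = "flask-2.3.3/src/flask"  # depends on the project's layout


@pytest.fixture(scope="session")
def corpus_package(tmp_path_factory):
    """Download and extract the pinned tag once per test session."""
    target = tmp_path_factory.mktemp("corpus")
    with urllib.request.urlopen(REPO_ZIP) as resp:
        zipfile.ZipFile(io.BytesIO(resp.read())).extractall(target)
    return target / PACKAGE_SUBDIR


def test_lint_pinned_project(benchmark, corpus_package):
    # pytest-benchmark times repeated pylint runs over the extracted package.
    result = benchmark(Run, [str(corpus_package)], exit=False)
    # Bit 1 of msg_status is set when pylint emitted a fatal/internal error.
    assert result.linter.msg_status & 1 == 0
```

The same fixture could feed a second, non-benchmark test that diffs pylint's output against a stored baseline, which would cover the output-consistency idea mentioned above.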