Test outcome analysis: check that all available test cases have been executed #3458
Conversation
Call all.sh for sanity checks, rather than maintain an explicit list. This was done in .travis.yml in 3c7ffd7. Travis has diverged from basic-in-docker. This commit updates the description of basic-in-docker to no longer refer to Travis. Alignment with Travis may be desirable but that is beyond the scope of this commit. Signed-off-by: Gilles Peskine <[email protected]>
You can't import a Python script whose name includes '-'. Signed-off-by: Gilles Peskine <[email protected]>
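Why the rename is needed can be shown in a couple of lines: a module name must be a valid Python identifier, so `import check-test-cases` does not even parse, while the underscore name does. A minimal sketch (the helper `statement_parses` is just for illustration):

```python
def statement_parses(source):
    """Return True if the given statement is syntactically valid Python."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

# A dash is the subtraction operator, so the module name is rejected
# at parse time, before any import machinery runs.
print(statement_parses("import check-test-cases"))  # False
print(statement_parses("import check_test_cases"))  # True
```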
Parametrize the code that iterates over test case descriptions by the function to apply on each description. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>
Thanks for writing this! Looks pretty good to me; I just have a couple of questions and minor points.
Make the structure more Pythonic: use classes for abstraction and refinement, rather than higher-order functions. Convert walk(function, state, data) into instance.walk(data) where instance has a method that implements function and state is a field of instance. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>
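The refactoring described above can be sketched in a few lines. This is not the actual code from the PR, just a hypothetical before/after showing how `walk(function, state, data)` becomes `instance.walk(data)` with the function as a method and the state as a field:

```python
# Before: a higher-order walker; the caller threads state explicitly.
def walk(function, state, data):
    for item in data:
        function(state, item)

# After: the state lives on the instance and the per-item hook is a
# method that subclasses can override.
class Walker:
    def __init__(self):
        self.seen = []          # 'state' is now a field

    def process(self, item):    # 'function' is now a method
        self.seen.append(item)

    def walk(self, data):
        for item in data:
            self.process(item)

w = Walker()
w.walk(["a", "b"])
print(w.seen)  # ['a', 'b']
```

The class form is more Pythonic here because refinements become subclasses rather than closures over shared mutable state.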
With previous refactorings, some functions are now solely meant to be called from other functions in a particular class. Move them into this class. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>
This is a new script designed to analyze test outcomes collected during a whole CI run. This commit introduces the script, the code to read the outcome file, and a very simple framework to report errors. It does not perform any actual analysis yet. Signed-off-by: Gilles Peskine <[email protected]>
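A "very simple framework to report errors" of the kind the commit message describes might look like the following. All names here are assumptions for illustration, not the script's actual API: a results object counts errors and warnings so the caller can decide the exit status later.

```python
class Results:
    """Collect error and warning messages and keep counts of each."""

    def __init__(self):
        self.error_count = 0
        self.warning_count = 0

    def error(self, fmt, *args):
        self.error_count += 1
        print("Error:", fmt.format(*args))

    def warning(self, fmt, *args):
        self.warning_count += 1
        print("Warning:", fmt.format(*args))

results = Results()
results.warning("Test case not executed: {}", "X.509 parse CRL")
print(results.warning_count)  # 1
```

Keeping warnings and errors as separate counters makes it easy to later "crank up" a check from warning to error without restructuring the reporting.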
Check that every available test case in the test suites and ssl-opt.sh has been executed at least once. For the time being, only report a warning, because our coverage is incomplete. Once we've updated all.sh to have full coverage, this warning should become an error. Signed-off-by: Gilles Peskine <[email protected]>
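The check itself boils down to a set difference between available and executed test cases. A minimal sketch, assuming a semicolon-separated outcome file whose fourth field is the test case description (the field layout and sample data here are assumptions, not the real format guarantee):

```python
def read_executed_cases(lines):
    """Collect the descriptions of test cases that ran at least once."""
    executed = set()
    for line in lines:
        fields = line.rstrip("\n").split(";")
        # Assumed layout: platform;configuration;suite;case;result;cause
        platform, config, suite, case = fields[:4]
        executed.add(case)
    return executed

available = {"AES-128 ECB encrypt", "RSA sign", "X.509 parse CRL"}
outcome_lines = [
    "Linux-x86_64;full;test_suite_aes;AES-128 ECB encrypt;PASS;",
    "Linux-x86_64;full;test_suite_rsa;RSA sign;PASS;",
]
never_run = sorted(available - read_executed_cases(outcome_lines))
print(never_run)  # ['X.509 parse CRL'] — the cases to warn about
```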
Force-pushed from b5b43b1 to 8d3c70a (compare)
I had accidentally erased the signoff line on two commits. Fixed in a force-push that doesn't change anything else.
Signed-off-by: Gilles Peskine <[email protected]>
I've analyzed the test cases reported as not executed. Some of these are false positives:
I propose to treat all of these as known defects for now, to be corrected in follow-up PRs.
LGTM.
I think we can either merge this as is and create a follow-up PR to handle known issues (and switch from warning to errors for the rest), or expand this PR, as you prefer.
This looks very good to me. I've been able to play with analyze_outcomes.py and the outcome of ./tests/scripts/all.sh test_full_cmake_gcc_asan: only 322 tests were not run. This test component alone already covers a lot. By targeting specific tests with other configurations, we may be able to reduce the duration of the CI checks on PRs; maybe something to look at here. I have just two suggestions for comment improvements that you may want to address.
tests/scripts/check_test_cases.py (outdated)

    file_name, line_number, description):
        """Process a test case.

        per_file_state: a new object returned by per_file_state() for each file.
per_file_state is both an object and a method, which was a bit confusing to me at first. Would it be possible to improve the documentation or the naming?
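To make the source of the confusion concrete, here is a hypothetical illustration of the pattern under discussion (the class and method bodies are invented for this sketch; only the name per_file_state comes from the review): the class has a factory method per_file_state() whose return value is then passed back into process_test_case under the same name, so one identifier denotes both a method and the object it returns.

```python
class TestDescriptionExplorer:
    def per_file_state(self):
        """Return a fresh state object at the start of each file."""
        return {}

    def process_test_case(self, per_file_state,
                          file_name, line_number, description):
        # The parameter shadows the method name: same word, two meanings.
        per_file_state.setdefault(description, line_number)

    def walk_file(self, file_name, lines):
        state = self.per_file_state()
        for number, description in enumerate(lines, 1):
            self.process_test_case(state, file_name, number, description)
        return state

explorer = TestDescriptionExplorer()
state = explorer.walk_file("demo.data", ["case A", "case B", "case A"])
print(state)  # {'case A': 1, 'case B': 2}
```

Renaming either the method (e.g. to a "new state" factory name) or the parameter would remove the ambiguity.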
@@ -76,59 +76,98 @@ def check_description(results, seen, file_name, line_number, description):
                         len(description))
        seen[description] = line_number

    def walk_test_suite(function, results, descriptions, data_file_name):
        """Iterate over the test cases in the given unit test data file.

    class TestDescriptionExplorer:
Not a comment on this line but on the short file purpose description.
With the addition of this abstract class to iterate over test cases, maybe the short file description should be updated. For the time being it is still just: "Sanity checks for test data.".
I doubt that there's much to gain, if anything. For most of the configurations, we aren't just interested in test cases that are specific to that particular configuration, but in all test cases, some of which trigger different behavior under the hood. There are even configurations where we aren't really interested in running the tests, just in ensuring that the compilation succeeds, because the main risk is a missing …
@gilles-peskine-arm According to GitHub, you requested a review from Ronald and me, but we had both approved the PR and I can't see any change since our approval. Was it intentional? If so, could you clarify?
Signed-off-by: Gilles Peskine <[email protected]>
bbb3664
@mpg Oops. I had made the improvements suggested by Ronald, but I only just now pushed them to the correct branch.

Ok, makes more sense now :)
LGTM
LGTM, thanks for the last changes.
Introduce a new script `tests/scripts/analyze_outcomes.py` which is meant to analyze the collected test outcome files from a whole CI run. In this pull request, the analysis does one thing: check the list of test cases that have run at least once against the list of available test cases (from unit tests and from `ssl-opt.sh`), and report the available test cases that haven't run at all. This is reported as a warning for now; in follow-ups, we should analyze the reports, work out which configurations we should add to exercise all the test cases, and finally crank up the warning to an error. This addresses #2691, but does not fix it, since a failure only produces a warning, not a failed CI.

This pull request also required some preliminaries:

- Rename Python scripts from `*-*.py` to `*_*.py`. You can't `import` a script with a dash in its name.
- Refactor `check_test_cases.py` (formerly `check-test-cases.py`) so that the new script can reuse its test case gathering code.
- `basic-in-docker.sh`: this script was aligned with Travis, but I accidentally broke that in Rationalize Travis builds #3218. We should probably update it, but that's out of scope of this PR; here I only do a minor documentation update.

Backports:

- `basic-in-docker.sh`: no, not in any LTS.
- `check-files.py`: yes, to keep the branches aligned. Backport 2.16: Rename Python scripts to use '_' and not '-' #3475, Backport 2.7: Rename Python scripts to use '_' and not '-' #3476.
- `check_test_cases.py`: no, not in any LTS.