Test outcome analysis: check that all available test cases have been executed #3458

gilles-peskine-arm · 2020-06-25T18:25:24Z

Introduce a new script tests/scripts/analyze_outcomes.py which is meant to analyze the collected test outcome files from a whole CI run.

In this pull request, the analysis does one thing: check the list of test cases that have run at least once against the list of available test cases (from unit tests and from ssl-opt.sh), and report the available test cases that haven't run at all. This is reported as a warning for now; in follow-ups, we should analyze the reports, work out which configurations we should add to exercise all the test cases, and finally crank up the warning to an error.

This addresses #2691, but does not fix it, since a failure only produces a warning, not failed CI.
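The check described above amounts to a set difference between the available test cases and those that ran. A minimal sketch, assuming both lists have already been gathered; the function name `report_not_executed` and the sample identifiers are hypothetical and not taken from the script:

```python
def report_not_executed(available, executed):
    """Return one warning message per test case that never ran.

    available: iterable of all known test case identifiers.
    executed: iterable of identifiers seen at least once in the outcomes.
    """
    missing = sorted(set(available) - set(executed))
    return ['Warning: test case not executed: ' + name for name in missing]


# Hypothetical identifiers, for illustration only.
available = ['suite_a;case 1', 'suite_a;case 2', 'ssl-opt;Default']
executed = ['suite_a;case 1', 'ssl-opt;Default', 'suite_a;case 1']
warnings = report_not_executed(available, executed)
for message in warnings:
    print(message)
```

Because the messages are only warnings, a caller following this PR's approach would not fail the run on them.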

This pull request also required some preliminaries:

Rename *-*.py to *_*.py. You can't import a script with a dash in its name.
Refactor check_test_cases.py (formerly check-test-cases.py) so that the new script can reuse its test case gathering code.
A tweak to basic-in-docker.sh. This script was aligned with Travis, but I accidentally broke that alignment in "Rationalize Travis builds" (#3218). We should probably update it, but that is out of scope for this PR; here I only make a minor documentation update.
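To illustrate the renaming point: an `import` statement needs a valid identifier, and a dash is parsed as a minus sign. The sketch below uses only the built-in `compile()`, so nothing is actually imported; `is_importable_name` is a hypothetical helper, not part of the PR:

```python
def is_importable_name(module_name):
    """Return True if 'import <module_name>' is syntactically valid."""
    try:
        # Compile only; the import statement is never executed.
        compile('import ' + module_name, '<example>', 'exec')
    except SyntaxError:
        return False
    return True


dashed = is_importable_name('check-test-cases')
underscored = is_importable_name('check_test_cases')
print(dashed, underscored)
```

Loading a file by path through `importlib` would be an alternative, but renaming keeps an ordinary `import` statement working.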

Backports:

basic-in-docker.sh: no, not in any LTS.
Rename check-files.py: yes, to keep the branches aligned. Backport 2.16: Rename Python scripts to use '_' and not '-' #3475, Backport 2.7: Rename Python scripts to use '_' and not '-' #3476.
Rename and refactor check_test_cases.py: no, not in any LTS.
The new script: no, there are no outcome files in the LTS branches.

Call all.sh for sanity checks, rather than maintain an explicit list. This was done in .travis.yml in 3c7ffd7. Travis has diverged from basic-in-docker. This commit updates the description of basic-in-docker to no longer refer to Travis. Alignment with Travis may be desirable but that is beyond the scope of this commit. Signed-off-by: Gilles Peskine <[email protected]>

You can't import a Python script whose name includes '-'. Signed-off-by: Gilles Peskine <[email protected]>

Parametrize the code that iterates over test case descriptions by the function to apply on each description. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>

mpg

Thanks for writing this! Looks pretty good to me, just have a couple of questions and minor points.

tests/scripts/analyze_outcomes.py

Make the structure more Pythonic: use classes for abstraction and refinement, rather than higher-order functions. Convert walk(function, state, data) into instance.walk(data) where instance has a method that implements function and state is a field of instance. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>
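The refactoring described in this commit message can be sketched as follows. This is an illustration of the general pattern only; the names `walk`, `record`, `DescriptionWalker`, and `DescriptionCollector` are hypothetical and do not reproduce the code in check_test_cases.py:

```python
# Before: higher-order style, walk(function, state, data).
def walk(function, state, data):
    """Apply function(state, item) to each item of data."""
    for item in data:
        function(state, item)


def record(seen, description):
    """Action passed to walk(): remember each description."""
    seen.append(description)


# After: class style, instance.walk(data).
class DescriptionWalker:
    """Abstract iteration; subclasses refine process()."""

    def process(self, description):
        raise NotImplementedError

    def walk(self, data):
        for description in data:
            self.process(description)


class DescriptionCollector(DescriptionWalker):
    """The state is a field and the action is a method."""

    def __init__(self):
        self.seen = []

    def process(self, description):
        self.seen.append(description)


data = ['case 1', 'case 2']

seen = []
walk(record, seen, data)

collector = DescriptionCollector()
collector.walk(data)
```

Both styles visit the same items in the same order, which is consistent with the "no behavior change" claim.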

With previous refactorings, some functions are now solely meant to be called from other functions in a particular class. Move them into this class. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>

This is a new script designed to analyze test outcomes collected during a whole CI run. This commit introduces the script, the code to read the outcome file, and a very simple framework to report errors. It does not perform any actual analysis yet. Signed-off-by: Gilles Peskine <[email protected]>
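A rough sketch of what reading an outcome file and a simple reporting framework could look like. The field layout (`platform;configuration;suite;case;result;cause`) is an assumption here, since the thread does not spell it out, and the names `Results` and `read_outcomes` are hypothetical:

```python
import io


class Results:
    """Collect errors and warnings and count them."""

    def __init__(self):
        self.error_count = 0
        self.warning_count = 0
        self.messages = []

    def error(self, message):
        self.messages.append('Error: ' + message)
        self.error_count += 1

    def warning(self, message):
        self.messages.append('Warning: ' + message)
        self.warning_count += 1


def read_outcomes(lines, results):
    """Map 'suite;case' to the list of results recorded for that case.

    ASSUMPTION: each line has six semicolon-separated fields:
    platform;configuration;suite;case;result;cause
    """
    outcomes = {}
    for line_number, line in enumerate(lines, 1):
        fields = line.rstrip('\n').split(';')
        if len(fields) != 6:
            results.error('line {}: expected 6 fields, got {}'
                          .format(line_number, len(fields)))
            continue
        _platform, _config, suite, case, result, _cause = fields
        outcomes.setdefault(suite + ';' + case, []).append(result)
    return outcomes


# Made-up sample data, for illustration only.
sample = io.StringIO(
    'Linux;full;test_suite_aes;AES-128 encrypt;PASS;\n'
    'Linux;default;test_suite_aes;AES-128 encrypt;SKIP;dependency\n'
    'malformed line\n'
)
results = Results()
outcomes = read_outcomes(sample, results)
```

A caller would typically turn a nonzero `error_count` into a failing exit status and leave the status unchanged for warnings.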

Check that every available test case in the test suites and ssl-opt.sh has been executed at least once. For the time being, only report a warning, because our coverage is incomplete. Once we've updated all.sh to have full coverage, this warning should become an error. Signed-off-by: Gilles Peskine <[email protected]>

gilles-peskine-arm · 2020-06-26T16:31:14Z

I had accidentally erased the signoff line on two commits. Fixed in a force-push that doesn't change anything else.

gilles-peskine-arm · 2020-06-26T20:17:55Z

I've analyzed the test cases reported as not executed. Some of these are false positives:

SSL tests whose description involves shell expansion (\", $1). I'm thinking of fixing this by adding an option ssl-opt.sh --list-available-test-cases and using its output instead of hand-parsing ssl-opt.sh.
Deliberately skipped SSL tests. This could also be resolved with ssl-opt.sh --list-available-test-cases.
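The first kind of false positive can be illustrated with a small sketch of hand-parsing. It assumes that test cases are declared on lines of the form `run_test "description" ...`; the regular expression, the helper name `scan_descriptions`, and the sample lines are illustrative assumptions, not the script's actual code:

```python
import re

# ASSUMPTION: a test case is declared as: run_test "description" ...
RUN_TEST_RE = re.compile(r'\s*run_test\s+"((?:[^\\"]|\\.)*)"')


def scan_descriptions(lines):
    """Yield (description, needs_shell) for each run_test line.

    needs_shell is True when the literal text contains a '$' or a
    backslash, i.e. when the shell would produce a different string
    from the one a text-level parser sees.
    """
    for line in lines:
        match = RUN_TEST_RE.match(line)
        if match:
            description = match.group(1)
            needs_shell = '$' in description or '\\' in description
            yield description, needs_shell


# Made-up sample lines, for illustration only.
sample = [
    'run_test    "Default" \\',
    '            "$P_SRV debug_level=3" \\',
    'run_test    "Version check: $1" \\',
    'run_test    "Name with \\"quotes\\"" \\',
]
found = list(scan_descriptions(sample))
for description, needs_shell in found:
    print(repr(description), needs_shell)
```

Descriptions flagged this way will not match the names recorded at run time, which is why letting ssl-opt.sh list its own test cases would be more reliable than parsing its source.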

I propose to treat all of these as known defects for now, to be corrected in follow-up PRs.

mpg

LGTM.

I think we can either merge this as is and create a follow-up PR to handle known issues (and switch from warning to errors for the rest), or expand this PR, as you prefer.

ronald-cron-arm

This looks very good to me. I've been able to try analyze_outcomes.py on the outcome of ./tests/scripts/all.sh test_full_cmake_gcc_asan: only 322 tests were not run. This single run already covers a lot. By targeting specific tests with other configurations, we may be able to reduce the duration of the CI that checks the PRs. Maybe something to look at here. I have just two suggestions for comment improvements that you may want to address.

ronald-cron-arm · 2020-07-02T13:51:59Z

tests/scripts/check_test_cases.py

+                          file_name, line_number, description):
+        """Process a test case.
+
+per_file_state: a new object returned by per_file_state() for each file.


per_file_state is both an object and a method, which was a bit confusing to me at first. Would it be possible to improve the documentation or the naming?

ronald-cron-arm · 2020-07-02T14:03:32Z

tests/scripts/check_test_cases.py

@@ -76,59 +76,98 @@ def check_description(results, seen, file_name, line_number, description):
                        len(description))
    seen[description] = line_number

-def walk_test_suite(function, results, descriptions, data_file_name):
-    """Iterate over the test cases in the given unit test data file.
+class TestDescriptionExplorer:


Not a comment on this line but on the short file purpose description.

With the addition of this abstract class to iterate over test cases, maybe the short file description should be updated. For the time being it is still just "Sanity checks for test data."

gilles-peskine-arm · 2020-07-02T22:36:25Z

By targeting specific tests with other configurations, we may be able to reduce the duration of the CI to check the PRs.

I doubt that there's much to gain, if anything. For most of the configurations, we aren't just interested in test cases that are specific to that particular configuration, but in all test cases, some of which trigger different behavior under the hood. There are even configurations where we aren't really interested in running the tests, just in ensuring that the compilation succeeds, because the main risk is a missing #ifdef somewhere.

mpg · 2020-07-03T07:29:02Z

@gilles-peskine-arm According to GitHub, you requested a review from Ronald and me, but we had both approved the PR and I can't see any change since our approval. Was it intentional? If so, could you clarify?

gilles-peskine-arm · 2020-07-03T07:34:12Z

@mpg Oops. I had made the improvements suggested by Ronald, but I only just now pushed them to the correct branch.

mpg · 2020-07-03T07:35:28Z

Ok, makes more sense now :)

mpg

LGTM

ronald-cron-arm

LGTM, thanks for the last changes.

gilles-peskine-arm added 3 commits June 25, 2020 14:22

Rename Python scripts to use '_' and not '-'

fb4f933

You can't import a Python script whose name includes '-'. Signed-off-by: Gilles Peskine <[email protected]>

check_test_cases: parametrize iteration functions by the action

d34e9e4

Parametrize the code that iterates over test case descriptions by the function to apply on each description. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>

gilles-peskine-arm added the enhancement, needs-review, needs-backports, and component-platform labels Jun 25, 2020

gilles-peskine-arm self-assigned this Jun 25, 2020

mpg reviewed Jun 26, 2020

gilles-peskine-arm added 4 commits June 26, 2020 18:29

check_test_cases: move some functions into the logical class

6f6ff33

With previous refactorings, some functions are now solely meant to be called from other functions in a particular class. Move them into this class. No behavior change. Signed-off-by: Gilles Peskine <[email protected]>

gilles-peskine-arm force-pushed the analyze_outcomes-count_test_cases-1 branch from b5b43b1 to 8d3c70a Compare June 26, 2020 16:30

Document the fields of TestCasesOutcomes

3d863f2

Signed-off-by: Gilles Peskine <[email protected]>

gilles-peskine-arm requested a review from mpg June 26, 2020 16:32

gilles-peskine-arm mentioned this pull request Jun 26, 2020

Check that all test cases are executed at least once #2691

Closed

gilles-peskine-arm mentioned this pull request Jun 26, 2020

Fix some test cases that weren't getting executed #3463

Merged

mpg previously approved these changes Jun 29, 2020

gilles-peskine-arm mentioned this pull request Jul 2, 2020

Backport 2.16: Rename Python scripts to use '_' and not '-' #3475

Merged

ronald-cron-arm previously approved these changes Jul 2, 2020

gilles-peskine-arm mentioned this pull request Jul 2, 2020

Backport 2.7: Rename Python scripts to use '_' and not '-' #3476

Merged

gilles-peskine-arm requested review from mpg and ronald-cron-arm July 2, 2020 22:32

Documentation improvements

bbb3664

Signed-off-by: Gilles Peskine <[email protected]>

gilles-peskine-arm dismissed stale reviews from ronald-cron-arm and mpg via bbb3664 July 3, 2020 07:33

mpg approved these changes Jul 3, 2020

mpg removed the needs-backports label Jul 3, 2020

ronald-cron-arm approved these changes Jul 3, 2020

mpg added the needs-ci label and removed the needs-review label Jul 3, 2020

gilles-peskine-arm added the approved label and removed the needs-ci label Jul 3, 2020

gilles-peskine-arm merged commit 2426506 into Mbed-TLS:development Jul 3, 2020
