
Tests: automatically run all test files #2810

Merged: 11 commits merged into master from bl/evenmoretests, Oct 20, 2023
Conversation

@benlorenz (Member) commented Sep 18, 2023

Add more test files that were not included:

  • OrthogonalDiscriminants: data.jl
  • Groups: pcgroup.jl
  • Rings: special_ideals.jl

Add comments regarding GradedModules.jl and FreeModules-graded.jl; some of this might be obsolete given the ModulesGraded.jl tests?

Remove the unused Modules/ReesAlgebra.jl test file; the contents seem to be in Rings/ReesAlgebra.jl now.

Add a check that fails when test files are added but not executed, with an ignore list for the graded modules.

@fingolfin (Member):

Actually Modules/ReesAlgebra.jl and Rings/ReesAlgebra.jl don't have the same content; the former seems to be a subset of the latter. Since you remove the former, that's fine anyway, though :-)

Comment on lines 8 to 11
Main.include("setup.jl")
Main.include("straightline.jl")
Main.include("gap.jl")
Main.include("atlas.jl")
Member:

Does this really work? Whenever you create a new module, Julia installs a custom include method for it, of the form include(p) = Base.include(Mod, p), where Mod is the module (here: SLPTest). Source: the docstring for baremodule.

So just calling Main.include doesn't seem likely to support this...?
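
For reference, a minimal sketch of the mechanism being described (the comments describe standard Julia behavior; the module body here is made up):

    module SLPTest
    # `module` automatically defines a module-local include, equivalent to
    #     include(p) = Base.include(SLPTest, p)
    # so a plain include("setup.jl") evaluates setup.jl inside SLPTest,
    # while Main.include("setup.jl") evaluates it inside Main instead.
    end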

Member Author:

I think I did run this locally. It probably defeats the purpose of the submodule right now, but it didn't seem to cause any errors. We could probably pass the module as an extra argument.

Member:

OK, fair enough. I guess as long as this is the last test (and thus cannot affect other tests), and if it really works, then it is fine :-)

Member Author:

I changed it to call include with the correct module now.

I think this was needed mostly for running the doctests after the normal testsuite, which we don't do anymore.
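
Presumably the fix looks something like this (a hedged sketch of including relative to the submodule instead of Main; the actual diff may differ):

    Base.include(SLPTest, "setup.jl")
    Base.include(SLPTest, "straightline.jl")
    Base.include(SLPTest, "gap.jl")
    Base.include(SLPTest, "atlas.jl")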

codecov bot commented Sep 18, 2023

Codecov Report

Merging #2810 (6b94fe0) into master (f9d8294) will increase coverage by 36.67%.
The diff coverage is 86.69%.

@@             Coverage Diff             @@
##           master    #2810       +/-   ##
===========================================
+ Coverage   43.74%   80.42%   +36.67%     
===========================================
  Files         458      466        +8     
  Lines       65036    66231     +1195     
===========================================
+ Hits        28451    53265    +24814     
+ Misses      36585    12966    -23619     
Files Coverage Δ
experimental/FTheoryTools/test/tate.jl 100.00% <ø> (+100.00%) ⬆️
experimental/LieAlgebras/test/LieAlgebra-test.jl 100.00% <ø> (+100.00%) ⬆️
...rimental/LieAlgebras/test/LieAlgebraModule-test.jl 90.90% <ø> (+90.90%) ⬆️
experimental/QuadFormAndIsom/test/runtests.jl 97.29% <ø> (+97.29%) ⬆️
experimental/QuadFormAndIsom/test/setup_tests.jl 100.00% <ø> (ø)
experimental/Experimental.jl 81.81% <66.66%> (+1.81%) ⬆️
experimental/LieAlgebras/test/setup_tests.jl 95.62% <95.62%> (ø)
src/utils/tests.jl 69.76% <71.79%> (+69.76%) ⬆️

... and 322 files with indirect coverage changes

Comment on lines 14 to 19
# FIXME: broken, probably obsolete?
# include("GradedModules.jl")


# TODO: the corresponding source file is also commented
# include("FreeModules-graded.jl")
Member:

These tests were disabled in 955b60d.

I think they are for the "old" graded modules as implemented by @fieker... Dunno if we should keep them or just remove them... @fieker?

@fingolfin (Member):

Rebased to resolve a merge conflict.

@fingolfin (Member):

This is marked as a draft, so what's the plan here? Do you plan to make more changes? If so, perhaps the fixes could already be applied via a separate PR.

@benlorenz (Member Author):

I was hoping to get some more feedback on the graded modules; I would prefer to remove them to avoid confusion.

I could also change the code to automatically include all .jl files instead of checking the list at the end, if that is preferred.

@lgoettgens (Member):

I really like having a list of things where I can easily exclude some test parts by commenting them out (e.g. for local testing, gradually comment out everything that has already passed, so that after a failure and a fix one does not have to restart the whole testsuite over and over again).

@fingolfin (Member):

@benlorenz wrote:

I was hoping to get some more feedback on the graded modules; I would prefer to remove them to avoid confusion.

I'll try to prod @fieker to comment on this.

I could also change the code to automatically include all .jl files instead of checking the list at the end, if that is preferred.

I am in favor. It would also help with parallelization: we could replace the testlist defined in test/runtests.jl by this (and either remove all other runtests.jl files, or just omit them from this "automatic" running of all *.jl files; they would then just be there to make it easier to run all tests in a given directory). But really this would probably be better solved by teaching our test_module (in src/utils/test.jl) to support this, i.e. allow passing the path to a directory (perhaps relative to the test dir) to that function, which then runs all tests in it.

@lgoettgens wrote:

I really like having a list of things where I can easily exclude some test parts by commenting them out (e.g. for local testing, gradually comment out everything that has already passed, so that after a failure and a fix one does not have to restart the whole testsuite over and over again).

I hear you, but I think we can achieve this in other ways. E.g. we could have a skiplist of test files (and even test directories) to be skipped. This might be useful anyway, if we deliberately want to disable a test file temporarily, e.g. until some bug in Julia nightly is fixed.

@lgoettgens (Member):

And we need to think about files like /test/Serialization/test_save_load_roundtrip.jl that only contain setup code which needs to be run from different test files. Such a file should not be part of this automatic grepping for all *.jl files.

@benlorenz changed the title from "Tests: enable more test files that were (accidentally?) not included" to "Tests: automatically run all test files" on Oct 6, 2023
@benlorenz (Member Author) commented Oct 9, 2023

The current state here is that the test list is now autogenerated and shuffled. I improved test_module and test_experimental_module in the process to do the following:

  • They accept a path instead of a filename:
    • if path or path.jl is a file, just run that file
    • if path is a folder, run all test files in that folder and any subfolders
    • if a folder contains a runtests.jl, only that file is run
  • If a folder contains a file called setup_tests.jl, it is included before each file directly in that folder.
  • The option timed can be used to have them return a dict mapping filenames to some stats: runtime and allocations (compilation times only for Julia >= 1.9).
  • The ignore option can be used to pass suffix strings or Regex to filter out test files or folders. (There is a global ignore list as well.)

Both ignore and timed only work when new=false is passed, as it is somewhat impractical to pass these to / from the subprocess.
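
To make those semantics concrete, here is a rough sketch of the folder handling in plain Julia (not the actual implementation in src/utils/tests.jl, which differs in details; e.g. the real ignore option also accepts Regex, not just suffix strings):

    function run_testdir(dir::AbstractString; ignore=String[])
        # a runtests.jl in the folder takes precedence: run only that file
        runtests = joinpath(dir, "runtests.jl")
        isfile(runtests) && return include(runtests)
        setup = joinpath(dir, "setup_tests.jl")
        for entry in readdir(dir; join=true)
            any(suffix -> endswith(entry, suffix), ignore) && continue
            if isdir(entry)
                run_testdir(entry; ignore=ignore)
            elseif endswith(entry, ".jl") && entry != setup
                # re-include the setup file (if any) before each test file
                isfile(setup) && include(setup)
                include(entry)
            end
        end
    end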

Currently the ignore list contains:

I removed a bunch of runtests.jl files which are now unnecessary (this also helps with the parallelization), mostly from the main test folder; I kept most of those in experimental for now. Edit: removed several include-only runtests.jl files from experimental as well.

The test order is now shuffled (but this should be stable, since the global RNG is seeded).
Edit: that seeding does not seem to work properly right now due to the way Distributed works; still looking into how we can avoid that.
Edit 2: added more seeding.
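
For illustration, the "more seeding" presumably amounts to something like the following (a sketch; the actual seed source and setup in the PR may differ):

    using Distributed
    @everywhere using Random
    seed = 42  # placeholder; any fixed value makes the shuffled order reproducible
    # each Distributed worker has its own global RNG, so seed all of them
    # (including the main process)
    @everywhere Random.seed!($seed)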

@benlorenz (Member Author):

I have the slight impression that the new ordering might cause more timeouts, even on 1.6 and 1.9.
But we now get similar timeouts on other PRs as well, e.g. here on the Chow ring PR https://github.com/oscar-system/Oscar.jl/actions/runs/6472250065?pr=2905 (I don't think that PR caused the failures).

@benlorenz marked this pull request as ready for review on October 11, 2023 09:25
@lgoettgens (Member):

For the Lie algebra test conflict: I think it would be safest to rebase this onto the current master and then copy-paste the conformance test function again. And I am scared that #2572 will touch these again and create more conflicts. If this gets merged in the near future, I can adapt #2572 accordingly.

@benlorenz force-pushed the bl/evenmoretests branch 7 times, most recently from 9c0ab38 to 69ce98a, on October 13, 2023 12:41
@lgoettgens (Member):

I think when testing with multiple groups you did something like [group_number:number_of_groups:end] on the sorted list of test files. This splits similarly named files (which live e.g. in the same folder) across different workers, so the same stuff gets compiled on each of them.
Would it be an idea to try to assign consecutive tests to the same group? So basically aaaaabbbbbccccc instead of abcabcabcabcabc.
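
To illustrate the two splitting schemes on hypothetical file names (n = 3 groups):

    files = ["a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"]
    n = 3
    # interleaved, as described above: group i gets files[i:n:end]
    interleaved = [files[i:n:end] for i in 1:n]
    # -> ["a1", "b1", "c1"], ["a2", "b2", "c2"], ["a3", "b3", "c3"]

    # consecutive chunks keep files from the same folder on the same worker
    chunked = collect(Iterators.partition(files, cld(length(files), n)))
    # -> ["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]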

@benlorenz (Member Author):

@lgoettgens wrote:

I think when testing with multiple groups you did something like [group_number:number_of_groups:end] on the sorted list of test files. This splits similarly named files (which live e.g. in the same folder) across different workers, so the same stuff gets compiled on each of them. Would it be an idea to try to assign consecutive tests to the same group? So basically aaaaabbbbbccccc instead of abcabcabcabcabc.

I have a shuffle! a few lines above, so the way the list is split should not make a difference.
We could try sorted files with consecutive files per group as well, but I think it is preferable to randomize the order of all files, to reduce the chance that there are any hidden dependencies between the tests.

For now I want to see how well the small / large separation works. Depending on that, I will give other split methods a try as well.

@benlorenz (Member Author):

I think this is ready now; the long+short split seems to work quite well: both groups take around 40-60 minutes for the runtests step (on GitHub Actions). So far I have done this split only for Linux, where we have plenty of parallel workers (and not for macOS, where there are only 5).
Locally the short group took about 2000 seconds, so we might even have a chance that this works during PkgEval (the code automatically chooses the short test group if JULIA_PKGEVAL is set).
And running many tests in parallel also works: the whole testsuite finished in about 460 seconds (with 22 jobs).
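
The PkgEval special case presumably boils down to a check like the following (the variable names and group contents here are hypothetical):

    short_tests = ["Rings/foo.jl"]    # placeholder contents
    long_tests  = ["Modules/bar.jl"]
    # under PkgEval only the quick group is run
    subset = haskey(ENV, "JULIA_PKGEVAL") ? short_tests : vcat(short_tests, long_tests)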

@benlorenz requested a review from fingolfin on October 16, 2023 14:22
@lgoettgens (Member):

#2572 will create another conflict with this PR.
Regardless of the order of the merges, I can fix the arising conflicts given some time.

@lgoettgens (Member):

As this PR increases the number of jobs that report coverage data to codecov, the variable after_n_builds in .codecov.yml should be updated accordingly. There are now, I think, 4 jobs uploading coverage data (ubuntu-1.9-short, ubuntu-1.9-long, ubuntu-1.9-doctest, macOS-1.9), so 4 should be right.
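
That is, the change should look something like this in .codecov.yml (after_n_builds is a standard Codecov setting; the surrounding keys depend on the existing file):

    codecov:
      notify:
        after_n_builds: 4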

@benlorenz (Member Author):

@lgoettgens wrote:

As this PR increases the number of jobs that report coverage data to codecov, the variable after_n_builds in .codecov.yml should be updated accordingly. There are now, I think, 4 jobs uploading coverage data (ubuntu-1.9-short, ubuntu-1.9-long, ubuntu-1.9-doctest, macOS-1.9), so 4 should be right.

Thanks for noticing this; four uploads would be correct. I just tried to check the report, and there are only three uploads here: https://app.codecov.io/gh/oscar-system/Oscar.jl/commit/6b94fe0f4d5478429a39f284bd51d57605d62723

Some random error seems to have occurred in one of the upload steps:

[2023-10-19T08:32:27.704Z] ['info'] Pinging Codecov: https://codecov.io/upload/v4?package=github-action-3.1.4-uploader-0.7.0&token=*******&branch=bl%2Fevenmoretests&build=6571453901&build_url=https%3A%2F%2Fgithub.com%2Foscar-system%2FOscar.jl%2Factions%2Fruns%2F6571453901&commit=6b94fe0f4d5478429a39f284bd51d57605d62723&job=Run+tests&pr=2810&service=github-actions&slug=oscar-system%2FOscar.jl&name=&tag=&flags=&parent=
[2023-10-19T08:32:27.939Z] ['error'] There was an error running the uploader: Error uploading to [https://codecov.io:](https://codecov.io/) Error: There was an error fetching the storage URL during POST: 404 - {'detail': ErrorDetail(string='Unable to locate build via Github Actions API. Please upload with the Codecov repository upload token to resolve issue.', code='not_found')}

I set it to four; if this error happens again, one can still check codecov manually or restart the failing job.

Otherwise, from my point of view the code is ready to be merged, and I don't really care whether this is done before or after the Lie algebras PR. We just need to adapt the branch protection rules.

Maybe I can do a little bit of rebasing to avoid the squash merge, to keep the changes in the test driver separate from the removal of all the runtests.jl files (and some other slight refactorings).

@lgoettgens (Member):

The Lie algebra PR will hopefully merge in about 20 minutes (once CI is green), so doing this one afterwards would be easier, I think.

Some partial squashing sounds like a good idea.

@benlorenz mentioned this pull request Oct 20, 2023
Commits:

  • move logic from main runtests.jl to src/utils/tests.jl
  • generate testlist from existing files + folders
  • improve test_module to allow folders and be more flexible regarding file endings
  • add timed support for test_module
  • use global ignore list for test files to skip
  • run setup_tests.jl before each file in a folder if it exists
  • shuffle testlist
  • remove printf …n runtests.jl
  • test/Serialization: special case IPC.jl test to make it run on main process only
  • remove runtests.jl files that are no longer needed: this is not necessary anymore since the runtests.jl files are mostly gone (we might lose very few per-file stats); removing them allows the other .jl files to be parallelized and shuffled
  • remove obsolete Rings/runtests; fix running Groups/constructors separately; slightly reorganize test_save_load
@lgoettgens (Member):

I think for the LieAlgebra stuff you can do just the same as before (just copy the new version of the conformance tests).

@benlorenz (Member Author) commented Oct 20, 2023

Rebased to split it into logical chunks. I think these should be fine for a non-squash merge now. (Edit: Clicked the wrong button and squashed it anyway...)

I also removed the changes to experimental/LieAlgebras for now, so we can merge this and the other Lie algebras stuff without conflicts.
These can be done later as well: remove runtests.jl and move code that is needed by multiple files into a common setup_tests.jl, e.g. like in f775375 (modulo whitespace).
(The only disadvantage right now is that the files in that folder can't run in parallel / can't be shuffled.)

@benlorenz merged commit a770b86 into master Oct 20, 2023 (12 of 16 checks passed)
@benlorenz deleted the bl/evenmoretests branch October 20, 2023 13:32