From 51bca17615db368bcd04c6f810f1934c2af80145 Mon Sep 17 00:00:00 2001
From: casparvl casparvl
Date: Fri, 1 Mar 2024 16:34:06 +0000
Subject: [PATCH] Add some documentation on how to debug the test step implemented in https://github.com/EESSI/software-layer/pull/467

---
 .../debugging_failed_builds.md | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/docs/adding_software/debugging_failed_builds.md b/docs/adding_software/debugging_failed_builds.md
index af612a511..336784726 100644
--- a/docs/adding_software/debugging_failed_builds.md
+++ b/docs/adding_software/debugging_failed_builds.md
@@ -190,6 +190,36 @@ After some time, this build fails while trying to build `Plumed`, and we can acc
 !!! Note
     While this might be faster than the EasyStack-based approach, this is _not_ how the bot builds. So why it _may_ reproduce the failure the bot encounters, it may not reproduce the bug _at all_ (no failure) or run into _different_ bugs. If you want to be sure, use the EasyStack-based approach.
 
+## Running the test step
+If you are still in the prefix layer (i.e. after previously building something), exit it first:
+```
+$ exit
+logout
+Leaving Gentoo Prefix with exit status 0
+```
+Then, source the EESSI init script (again):
+```
+Apptainer> source ${EESSI_CVMFS_REPO}/versions/${EESSI_VERSION}/init/bash
+Environment set up to use EESSI (2023.06), have fun!
+{EESSI 2023.06} Apptainer>
+```
+
+!!! Note
+    If you are in a SLURM environment, make sure to run `for i in $(env | grep SLURM); do unset "${i%=*}"; done` to unset any SLURM environment variables. Failing to do so will cause `mpirun` to pick up on these and, for example, infer from them how many slots are available. If you run into errors of the form "There are not enough slots available in the system to satisfy the X slots that were requested by the application:", you probably forgot this step.
+
+Then, execute the `run_tests.sh` script. We are assuming you are still in the root of the `software-layer` repository that you cloned earlier:
+```
+./run_tests.sh
+```
+If all goes well, you should see (part of) the EESSI test suite being run by ReFrame, finishing with something like:
+
+```
+[ PASSED ] Ran X/Y test case(s) from Z check(s) (0 failure(s), 0 skipped, 0 aborted)
+```
+
+!!! Note
+    If you are running on a system with hyperthreading enabled, you may still run into the "There are not enough slots available in the system to satisfy the X slots that were requested by the application:" error from `mpirun`, because OpenMPI's `mpirun` does not count hardware threads as slots by default. In this case, run with `OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 ./run_tests.sh` (for OpenMPI 4.X) or `PRTE_MCA_rmaps_default_mapping_policy=:hwtcpus ./run_tests.sh` (for OpenMPI 5.X).
+
 ## Known causes of issues in EESSI
 
 ### The custom system prefix of the compatibility layer