Added information in README.md on how to create a ReFrame configuration file for the software-layer test step #293

moved_job_dirs_comment = PR merged! Moved `{job_dirs}` to `{trash_bin_dir}`
Template that is used by the bot to add a comment to a PR noting down which directories have been
moved and where.

# Step 6: Creating a ReFrame configuration file for the test step (only needed when building for the [EESSI software layer](https://github.com/EESSI/software-layer))
Part of the test step of the EESSI software layer is running the EESSI test suite. This requires putting a ReFrame configuration file in place that describes the partitions in the `arch_target_map` of the bot config.

You can find general documentation on how to write a ReFrame config file in the [EESSI documentation](https://www.eessi.io/docs/test-suite/ReFrame-configuration-file/). However, some specifics apply when setting things up for the test step:

- The configuration file has to be located at `{shared_fs_path}/reframe_config.py` (recommended). Alternatively, set `RFM_CONFIG_FILES` to point to the configuration file, and make sure that location is available (mounted) in the build container.
- The system name _has_ to be `BotBuildTests`.
- Partition names should be `${EESSI_SOFTWARE_SUBDIR//\//_}` for non-accelerator partitions and `${EESSI_SOFTWARE_SUBDIR//\//_}_${EESSI_ACCELERATOR_TARGET//\//_}` for accelerator partitions. In words: the partition name should be the software subdirectory with slashes replaced by underscores, and for accelerators the accelerator target is appended (again with slashes replaced by underscores). E.g. `x86_64_intel_skylake_avx512_nvidia_cc80` would be a valid partition name for a partition with Intel Skylake CPUs and NVIDIA A100 GPUs (see the sketch after this list).
- The `scheduler` should be `local`, as the bot already schedules the job (ReFrame should just locally spawn the tests in the allocation created by the bot).
- The `access` field is not used by ReFrame when the local scheduler is defined; you can simply omit this keyword.

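To illustrate the naming scheme, the following shell sketch derives the partition name for the accelerator example above (the `EESSI_SOFTWARE_SUBDIR` and `EESSI_ACCELERATOR_TARGET` values are just example inputs):
```
# Example values (hypothetical; use whatever applies to your partition)
EESSI_SOFTWARE_SUBDIR=x86_64/intel/skylake_avx512
EESSI_ACCELERATOR_TARGET=nvidia/cc80

# Replace slashes with underscores and append the accelerator target
echo "${EESSI_SOFTWARE_SUBDIR//\//_}_${EESSI_ACCELERATOR_TARGET//\//_}"
# prints: x86_64_intel_skylake_avx512_nvidia_cc80
```
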
To configure the number of GPUs and CPUs, we have two options:
1. We describe the physical node in the ReFrame configuration file and set the `REFRAME_SCALE_TAG` environment variable to match the size of the allocation that you specify in your bot config. E.g. if your bot config allocates 1/4th of a node, you would set `REFRAME_SCALE_TAG=1_4_node` in the environment of the job submitted by the bot.
2. We describe a virtual node configuration that matches the size of the allocation created by the bot (and we use the default `REFRAME_SCALE_TAG=1_node`, which you don't have to set explicitly).

The first approach is the easiest, and thus recommended, since you can use CPU autodetection by ReFrame. The second approach allows for more flexibility.

## Approach 1 (recommended): describing the physical node and setting the `REFRAME_SCALE_TAG` to match the bot config's allocation size
In this approach, we describe the physical node configuration, i.e. the number of physical CPUs and GPUs present in the node.

For the CPU part, we can rely on ReFrame's CPU autodetection: if the local scheduler is configured and no CPU topology information is provided in the ReFrame configuration file, ReFrame will automatically detect the [CPU topology](https://reframe-hpc.readthedocs.io/en/stable/config_reference.html#config.systems.partitions.processor).

For the GPU part, we need to configure the vendor and the number of GPUs. E.g. for a partition with 4 NVIDIA GPUs per node:
```
'partition': {
    ...
    'extras': {
        GPU_VENDOR: GPU_VENDORS[NVIDIA],
    },
    'devices': [
        {
            'type': DEVICE_TYPES[GPU],
            'num_devices': 4,
        }
    ]
}
```

Now, we need to make sure ReFrame only starts tests that have scales that fit within the allocation created by the bot. E.g. on a GPU node, it would be quite common to only allocate a single GPU for building GPU software. In the above example, that means only a quarter node. We can make sure the EESSI test suite only runs tests that fit within 25% of the physical node described above by setting the `REFRAME_SCALE_TAG` environment variable to `1_4_node`. You can find a list of all valid values for the `REFRAME_SCALE_TAG` by checking the `SCALES` constant in the [EESSI test suite](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py).
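
As a quick sketch, you can also print the valid scale names directly, assuming the `eessi.testsuite` package is importable in your environment:
```
# Print the valid REFRAME_SCALE_TAG values (the keys of the SCALES constant);
# this assumes the eessi.testsuite package is installed/importable.
python3 -c 'from eessi.testsuite.constants import SCALES; print("\n".join(sorted(SCALES)))'
```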

Note that if you had e.g. a node with 6 GPUs, and you were building on 1 GPU, you would probably want to go for Approach 2, since `1_6_node` is not a known scale in the EESSI test suite. Although you could set `REFRAME_SCALE_TAG=1_8_node`, this would lead to undefined behavior for the number of GPUs allocated (it may be 1, it may be 0). For CPU-based nodes, this could however be a reasonable approach.

Note that if you use e.g. quarter nodes for _some_ partitions and full nodes for others, you'll have to set the `REFRAME_SCALE_TAG` conditionally based on the node architecture. You could e.g. do this in a `.bashrc` that contains some conditional logic to determine the node type and set the corresponding scale, as sketched below. Alternatively, you could use Approach 2.
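
A minimal sketch of such conditional logic in a `.bashrc` (the hostname pattern `gpunode*` and the chosen scales are hypothetical; adapt them to your cluster):
```
# Pick the ReFrame scale based on the node type (hypothetical hostname pattern)
case "$(hostname -s)" in
    gpunode*)
        export REFRAME_SCALE_TAG=1_4_node  # bot allocates a quarter of these nodes
        ;;
    *)
        export REFRAME_SCALE_TAG=1_node    # bot allocates full nodes elsewhere
        ;;
esac
```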

### Complete example config
In this example, we assume a node with 4 A100 GPUs (compute capability `cc80`), 72 CPU cores (Intel Skylake), and 512 GB of memory (of which 491520 MiB is usable by SLURM jobs; on this system the rest is reserved for the OS):
```
from eessi.testsuite.common_config import common_logging_config
from eessi.testsuite.constants import *  # noqa: F403


site_configuration = {
    'systems': [
        {
            'name': 'BotBuildTests',  # The system HAS to have this name, do NOT change it
            'descr': 'Software-layer bot',
            'hostnames': ['.*'],
            'modules_system': 'lmod',
            'partitions': [
                {
                    'name': 'x86_64_intel_skylake_avx512_nvidia_cc80',
                    'scheduler': 'local',
                    'launcher': 'mpirun',
                    'environs': ['default'],
                    'features': [
                        FEATURES[GPU]  # We want this to run GPU-based tests from the EESSI test suite
                    ] + list(SCALES.keys()),
                    'resources': [
                        {
                            'name': 'memory',
                            'options': ['--mem={size}'],
                        }
                    ],
                    'extras': {
                        # Make sure to round down, otherwise a job might ask for more mem than is available
                        # per node
                        'mem_per_node': 491520,  # in MiB (512 GB minus some reserved for the OS)
                        GPU_VENDOR: GPU_VENDORS[NVIDIA],
                    },
                    'devices': [
                        {
                            'type': DEVICE_TYPES[GPU],
                            'num_devices': 4,
                        }
                    ],
                    'max_jobs': 1
                },
            ]
        }
    ],
    'environments': [
        {
            'name': 'default',
            'cc': 'cc',
            'cxx': '',
            'ftn': ''
        }
    ],
    'general': [
        {
            'purge_environment': True,
            'resolve_module_conflicts': False,  # avoid loading the module before submitting the job
        }
    ],
    'logging': common_logging_config(),
}
```

## Approach 2: describing a virtual node
In this approach, we describe a virtual node configuration for which the size matches exactly what is allocated by the bot (through the `slurm_params` and `arch_target_map`). In this example, we'll assume that this node has 4 GPUs and 72 cores, distributed over 2 sockets each consisting of 1 NUMA domain. We also assume our bot is configured with `slurm_params = --hold --nodes=1 --export=None --time=0:30:0` and `arch_target_map = {"linux/x86_64/intel/skylake_avx512" : "--partition=gpu --cpus-per-task=18 --gpus-per-node 1"}`, i.e. it effectively allocates a quarter node. We describe a virtual partition for ReFrame as if this quarter node is a full node, i.e. we pretend it is a partition with 18 cores and 1 GPU per node, with 1 socket.

First, we have to hardcode the CPU configuration:
```
'partition': {
    ...
    'processor': {
        "num_cpus": 18,
        "num_cpus_per_core": 1,
        "num_cpus_per_socket": 18,
        "num_sockets": 1,
        "topology": {
            "numa_nodes": [
                # As stated, the 18 cores are on a single NUMA domain. Thus, the bitmask should be
                # a sequence of 18 ones, which is 0x3ffff in hexadecimal representation
                "0x3ffff",  # a bit mask of 111111111111111111, i.e. cores 0-17 are on this NUMA domain
            ],
        },
    }
}
```

Note that if this node instead had 8 NUMA domains (4 per socket), the 18 cores would correspond to 2 NUMA domains, and we would have to define:
```
"numa_nodes": [
"0x001ff", # a bit mask of 000000000111111111, i.e. cores 0-8 are on this NUMA domain
"0x3fe00", # a bit mask of 111111111000000000, i.e. cores 9-17 are on this NUMA domain
]
```
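
A quick way to derive such bitmasks is with shell arithmetic (a sketch; the values match the examples above):
```
# 18 cores on a single NUMA domain: 18 ones -> 0x3ffff
printf '0x%x\n' $(( (1 << 18) - 1 ))

# 9 cores per NUMA domain: cores 0-8, and cores 9-17 (shifted by 9 bits)
printf '0x%05x\n' $(( (1 << 9) - 1 ))
printf '0x%05x\n' $(( ((1 << 9) - 1) << 9 ))
```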

Note that the `topology` dictionary in a ReFrame configuration file can contain more information, such as the bitmasks for the CPU sockets and cores, as well as information on the caches (see [here](https://reframe-hpc.readthedocs.io/en/stable/config_reference.html#config.systems.partitions.processor.topology)). Currently, that information is not needed by the EESSI test suite, but that may change if tests are added that utilize such information to execute efficiently.

For the GPU configuration, we simply put:
```
'partition': {
    ...
    'extras': {
        GPU_VENDOR: GPU_VENDORS[NVIDIA],
    },
    'devices': [
        {
            'type': DEVICE_TYPES[GPU],
            'num_devices': 1,
        }
    ]
}
```
This matches the fact that we allocate 1 GPU in the `arch_target_map`.

### Complete example config
In this example, we assume a node with 4 A100 GPUs (compute capability `cc80`), 72 CPU cores (Intel Skylake), and 512 GB of memory (of which 491520 MiB is usable by SLURM jobs; on this system the rest is reserved for the OS). We also assume that the bot configuration for this partition allocates 1/4th of such a node for a build job:
```
from eessi.testsuite.common_config import common_logging_config
from eessi.testsuite.constants import *  # noqa: F403


site_configuration = {
    'systems': [
        {
            'name': 'BotBuildTests',  # The system HAS to have this name, do NOT change it
            'descr': 'Software-layer bot',
            'hostnames': ['.*'],
            'modules_system': 'lmod',
            'partitions': [
                {
                    'name': 'x86_64_intel_skylake_avx512_nvidia_cc80',
                    'scheduler': 'local',
                    'launcher': 'mpirun',
                    'environs': ['default'],
                    'features': [
                        FEATURES[GPU]  # We want this to run GPU-based tests from the EESSI test suite
                    ] + list(SCALES.keys()),
                    'resources': [
                        {
                            'name': 'memory',
                            'options': ['--mem={size}'],
                        }
                    ],
                    'extras': {
                        # Make sure to round down, otherwise a job might ask for more mem than is available
                        # per node
                        'mem_per_node': 122880,  # in MiB (1/4th of 491520 MiB)
                        GPU_VENDOR: GPU_VENDORS[NVIDIA],
                    },
                    'devices': [
                        {
                            'type': DEVICE_TYPES[GPU],
                            'num_devices': 1,
                        }
                    ],
                    'processor': {
                        "num_cpus": 18,
                        "num_cpus_per_core": 1,
                        "num_cpus_per_socket": 18,
                        "num_sockets": 1,
                        "topology": {
                            "numa_nodes": [
                                # As stated, the 18 cores are on a single NUMA domain. Thus, the bitmask
                                # should be a sequence of 18 ones, which is 0x3ffff in hexadecimal representation
                                "0x3ffff",
                            ],
                        },
                    },
                    'max_jobs': 1
                },
            ]
        }
    ],
    'environments': [
        {
            'name': 'default',
            'cc': 'cc',
            'cxx': '',
            'ftn': ''
        }
    ],
    'general': [
        {
            'purge_environment': True,
            'resolve_module_conflicts': False,  # avoid loading the module before submitting the job
        }
    ],
    'logging': common_logging_config(),
}
```

# Step 7: Instructions to run the bot components

The bot consists of three components:
* the Smee client;
* the event handler;
* the job manager.

Running the Smee client was explained in [Step 1](#step1).

## <a name="step6.1"></a>Step 6.1: Running the event handler
## <a name="step7.1"></a>Step 7.1: Running the event handler
As the event handler may run for a long time, it is advised to run it in a `screen` or `tmux` session.

The event handler is provided by the [`eessi_bot_event_handler.py`](https://github.com/EESSI/eessi-bot-software-layer/blob/main/eessi_bot_event_handler.py) Python script.
The event handler writes log information to the files `pyghee.log` and

Note: if you run the bot on a frontend of a cluster with multiple frontends, make sure that both the Smee client and the event handler run on the same system!

## <a name="step6.2"></a>Step 6.2: Running the job manager
## <a name="step7.2"></a>Step 7.2: Running the job manager
As the job manager may run for a long time, it is advised to run it in a `screen` or `tmux` session.

The job manager is provided by the [`eessi_bot_job_manager.py`](https://github.com/EESSI/eessi-bot-software-layer/blob/main/eessi_bot_job_manager.py) Python script. You can run the job manager from the directory `eessi-bot-software-layer` simply by: