
Great Lakes Cluster (UMich) #4869

Merged: 6 commits, Apr 25, 2024

Conversation

@ax3l (Member) commented Apr 17, 2024

Start documenting how to use the Great Lakes Cluster at University of Michigan.

User guide: https://arc.umich.edu/greatlakes/user-guide

Action Items

  • complete profile
  • add install dependency script for missing modules: C-Blosc2, ADIOS2, BLAS++, LAPACK++
  • complete and test job script template
  • document small A100 partition and larger CPU partition, too?

@ax3l ax3l added component: documentation Docs, readme and manual machine / system Machine or system-specific issue labels Apr 17, 2024
@ax3l ax3l force-pushed the doc-greatlakes-umich branch from 3a14f3c to bd7d7f7 Compare April 17, 2024 22:00
@archermarx (Contributor):
Oh great, I can help with this as I have a working install on this cluster.

@archermarx (Contributor):
Here's my profile file, which we can use as a start

# please set your project account
export proj=#####
# remember the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z "${proj-}" ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load cmake
module load gcc
module load openmpi
module load phdf5
module load git
module load cuda
module load python/3.10.4

# compiler environment hints
export CC="$(which gcc)"
export CXX="$(which g++)"
export CUDACXX="$(which nvcc)"
export CUDAHOSTCXX="$(which g++)"
export FC="$(which gfortran)"
export SRC_DIR=${HOME}/src

and here's my install script

# Load required modules
source ~/warpx.profile
module load python/3.10.4

# uninstall old versions
rm -rf build
rm -rf *.whl

# Build warpx
cmake -S . -B build \
        -DWarpX_LIB=ON \
        -DWarpX_APP=ON \
        -DWarpX_MPI=ON \
        -DWarpX_COMPUTE=CUDA \
        -DWarpX_DIMS="1;2;RZ;3" \
        -DWarpX_PYTHON=ON \
        -DWarpX_PRECISION=DOUBLE \
        -DWarpX_PARTICLE_PRECISION=SINGLE \
        -DGPUS_PER_SOCKET=4 \
        -DGPUS_PER_NODE=8

cmake --build build -j 8
cmake --build build --target pip_install -j 8

@ax3l (Member Author) commented Apr 18, 2024

Thank you @archermarx, that is great! I am working to get this documented mainline with Brendan Stassel ✨

Awesome, there is a parallel HDF5 module, phdf5 - I must have overlooked that today :) I will update with your recipe included.

@ax3l ax3l force-pushed the doc-greatlakes-umich branch from bd7d7f7 to 5498e2b Compare April 18, 2024 04:56
@ax3l ax3l changed the title [WIP] Great Lakes Cluster (UMich) Great Lakes Cluster (UMich) Apr 18, 2024
@archermarx (Contributor):

> Thank you @archermarx, that is great! I am working to get this documented mainline with Brendan Stassel ✨

Oh nice! I know Brendan

@bstassel:

Hey @archermarx! Thanks for sharing your profile and install script.

@bstassel commented Apr 18, 2024

Draft of the docs for testing :) https://warpx--4869.org.readthedocs.build/en/4869/install/hpc/greatlakes.html

@ax3l following the doc you linked, git clone https://github.com/ECP-WarpX/WarpX.git doesn't load a great-lakes folder in the machines dir.

@bstassel left a comment:

c-blosc currently fails because the script looks for a c-blosc2 dir, then only uses a c-blosc dir. Either works, but it needs to be consistent.

@ax3l (Member Author) commented Apr 18, 2024

> @ax3l following the doc you linked, git clone https://github.com/ECP-WarpX/WarpX.git doesn't load a great-lakes folder in the machines dir.

Oh yes, because this PR is not yet merged. You could do this in ~/src/warpx/:

cd ~/src/warpx

git remote add ax3l https://github.com/ax3l/WarpX.git
git fetch --all
git checkout -b doc-greatlakes-umich ax3l/doc-greatlakes-umich

@bstassel:

> > @ax3l following the doc you linked, git clone https://github.com/ECP-WarpX/WarpX.git doesn't load a great-lakes folder in the machines dir.
>
> Oh yes, because this PR is not yet merged. You could do this in ~/src/warpx/:
>
> cd ~/src/warpx
>
> git remote add ax3l https://github.com/ECP-WarpX/WarpX.git
> git checkout -b doc-greatlakes-umich ax3l/doc-greatlakes-umich

I can add the remote, but when I go to check out the branch it fails:

fatal: 'ax3l/doc-greatlakes-umich' is not a commit and a branch 'doc-greatlakes-umich' cannot be created from it

@ax3l (Member Author) commented Apr 20, 2024

Sorry, I also forgot to write:

git fetch --all

and I posted the wrong git URL. Editing the message above now.

@ax3l ax3l marked this pull request as ready for review April 22, 2024 22:05
@ax3l ax3l force-pushed the doc-greatlakes-umich branch from 40b071b to b80a6e0 Compare April 22, 2024 22:14
.. code-block:: bash

bash $HOME/src/warpx/Tools/machines/greatlakes-umich/install_v100_dependencies.sh
source ${HOME}/sw/greatlakes/v100/venvs/warpx-v100/bin/activate

Above, the guide informs the user to always source $HOME/greatlakes_v100_warpx.profile.

Is the activate line copied into greatlakes_v100_warpx.profile, or are we loading a different source here? If so, why?

@ax3l (Member Author) Apr 24, 2024:

This extra line is only needed once, when we set up the dependencies, so we can continue in the same terminal.

The reason for the extra line in this step is that we had already sourced the profile before the install script created the venv, so the venv was not yet activated.
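For reference, the one-time sequence that explanation describes can be sketched as follows (paths taken from the guide excerpt above):

```shell
# one-time dependency setup: source the profile first, run the installer,
# then activate the venv the installer just created in the same terminal
source $HOME/greatlakes_v100_warpx.profile
bash $HOME/src/warpx/Tools/machines/greatlakes-umich/install_v100_dependencies.sh
source ${HOME}/sw/greatlakes/v100/venvs/warpx-v100/bin/activate
```

On later logins, per the explanation above, the extra activation line is not needed again.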

@ax3l (Member Author) commented Apr 24, 2024

Status: We are still working on the job script, to ensure the GPU visibility is set correctly (one unique GPU per MPI rank, pinned to the closest CPU). @bstassel opened a support ticket for this.

@archermarx what job script template are you using for the V100 GPUs? Did you solve this already?

@archermarx (Contributor) commented Apr 24, 2024

@ax3l This job script appeared to work for me for running a 2-GPU job after some discussion with ARC-TS

#!/bin/bash
#SBATCH --job-name=#####
#SBATCH --account=#####
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-gpu=1
#SBATCH --gpus-per-task=v100:1
#SBATCH --gpu-bind=single:1
#SBATCH --mem=32000m
#SBATCH --time=00:05:00
#SBATCH --output=output.log
#SBATCH --mail-type=END,FAIL

# Load required modules
module load gcc hdf5 openmpi/4.1.6-cuda cmake git cuda python/3.10.4

srun python PICMI_inputs_2d.py

@ax3l (Member Author) commented Apr 24, 2024

Oh,

#SBATCH --gpu-bind=single:1

could be what I missed.

@bstassel commented Apr 24, 2024

> Oh,
>
> #SBATCH --gpu-bind=single:1
>
> could be what I missed.

I will give this a try and report the result.

EDIT: this resolved the error thrown by WarpX about the MPI mapping.

@bstassel commented Apr 24, 2024

For future reference, the complete documentation for SLURM is here: https://slurm.schedmd.com/srun.html#OPT_gres-flags

There is an interesting combination between --gpu-bind=single:<numtasks> and --gres-flags=allow-task-sharing that allows each task to see each GPU within the job allocation that is on the same node as the task, which allows for inter-GPU communication.
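A hedged sketch of what that combination could look like in a batch header (the node, task, and GPU counts here are placeholders, not the final greatlakes_v100.sbatch):

```shell
#!/bin/bash
# Hypothetical fragment: each task is bound to one GPU for compute,
# while allow-task-sharing keeps all node-local GPUs visible to each task,
# e.g. for peer-to-peer access within the node.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --gpus-per-task=v100:1
#SBATCH --gpu-bind=single:1
#SBATCH --gres-flags=allow-task-sharing
```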

@bstassel commented Apr 25, 2024

I want to make sure I comprehend the SBATCH commands used in greatlakes_v100.sbatch.

#SBATCH -N 1                   -> request only 1 node
#SBATCH --exclusive            -> do not allow SLURM to share the node it gives me with other users
#SBATCH --ntasks-per-node=2    -> run only two processes on the 1 node we requested (I assume because there are 2x 2.4 GHz Intel Xeon Gold 6148 per node)
#SBATCH --cpus-per-task=20     -> allocate 20 CPUs per process (for a total of 40 CPUs on the node, the max for a node on the Great Lakes gpu partition)
#SBATCH --gpus-per-task=v100:1 -> give each process its own V100 GPU, for a total of 2 GPUs
#SBATCH --gpu-bind=single:1    -> 1 process is bound to 1 GPU

Do I have that right?

@bstassel:

I was comparing the output.txt files between the job that had --gpu-bind=single:1 and the one that didn't. It appears that forcing 1 GPU per 1 MPI rank gives worse performance? Maybe it is only because the simulation is so short that I don't see the benefits.

Without --gpu-bind:

STEP 100 starts ...
--- INFO    : re-sorting particles
--- INFO    : Writing openPMD file diags/diag1000100
--- INFO    : Writing openPMD file diags/openPMDfw000100
--- INFO    : Writing openPMD file diags/openPMDbw000100
STEP 100 ends. TIME = 1.083064693e-14 DT = 1.083064693e-16
Evolve time = 2.07016534 s; This step = 0.346136823 s; Avg. per step = 0.0207016534 s


**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ THE END ]
*
* No recorded warnings.
********************************************************************************

Total Time                     : 4.078974041


TinyProfiler total time across processes [min...avg...max]: 4.08 ... 4.088 ... 4.096

With --gpu-bind=single:1:

STEP 100 starts ...
--- INFO    : re-sorting particles
--- INFO    : Writing openPMD file diags/diag1000100
--- INFO    : Writing openPMD file diags/openPMDfw000100
--- INFO    : Writing openPMD file diags/openPMDbw000100
STEP 100 ends. TIME = 1.083064693e-14 DT = 1.083064693e-16
Evolve time = 8.21666101 s; This step = 1.493229092 s; Avg. per step = 0.0821666101 s


**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ THE END ]
*
* No recorded warnings.
********************************************************************************

Total Time                     : 16.89200922


TinyProfiler total time across processes [min...avg...max]: 16.89 ... 16.9 ... 16.91

@archermarx (Contributor) commented Apr 25, 2024 via email

@ax3l (Member Author) commented Apr 25, 2024

Awesome progress!

Starting to answer individual Qs from above.

#4869 (comment)

> There is an interesting combination between --gpu-bind=single:<numtasks> and --gres-flags=allow-task-sharing that allows each task to see each GPU within the job allocation that is on the same node as the task, which allows for inter-GPU communication.

The last part, not exactly: we can do inter-GPU communication with direct MPI.
We indeed want only one GPU visible per task (aka MPI rank), so they have a 1:1 relation. We use GPU-aware MPI to do direct GPU-to-GPU communication.

#4869 (comment)

> #SBATCH --ntasks-per-node=2 -> run only two processes on the 1 node we requested (I assume because there are 2x 2.4 GHz Intel Xeon Gold 6148 per node)

All correct. This in particular just says: we want two MPI processes per node. The reason is that we have 2 GPUs per node and want a 1:1 mapping. (We would do the same even if there were only one Intel Xeon Gold per node, because they are multi-core CPUs anyway.)

#4869 (comment)

> I was comparing the output.txt files between the job that had --gpu-bind=single:1 and the one that didn't. It appears that forcing 1 GPU per 1 MPI rank gives worse performance? Maybe it is only because the simulation is so short that I don't see the benefits.

That is totally ok.
The reason is likely that your simulation is too small (not too short). The domain decomposition probably cuts your simulation into such small pieces that every GPU is barely busy and mostly spends time talking to other GPUs.

Solution: use fewer GPUs or solve a bigger problem :)

General guidance for 16 GB V100 GPUs: try to have about 128^3 to 256^3 cells per GPU, as fits with your number of particles per cell.
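That rule of thumb can be turned into a quick back-of-the-envelope check (a hypothetical helper, not part of WarpX):

```python
# Quick check of the ~128^3 to 256^3 cells-per-GPU rule of thumb
# for 16 GB V100 GPUs (hypothetical helper, not part of WarpX).

def cells_per_gpu(nx, ny, nz, n_gpus):
    """Average number of cells each GPU works on."""
    return nx * ny * nz / n_gpus

def gpu_load_hint(nx, ny, nz, n_gpus, lo=128**3, hi=256**3):
    """Classify the per-GPU load against the rule of thumb."""
    c = cells_per_gpu(nx, ny, nz, n_gpus)
    if c < lo:
        return "underloaded: use fewer GPUs or a bigger problem"
    if c > hi:
        return "may exceed 16 GB: use more GPUs"
    return "in the sweet spot"

# Example: a 256^3 grid on 2 GPUs gives 8,388,608 cells per GPU,
# which falls inside [128^3, 256^3].
print(gpu_load_hint(256, 256, 256, 2))  # -> in the sweet spot
```

The exact upper bound also depends on the number of particles per cell, so treat the thresholds as rough defaults.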

Looks like the GPU bindings work! 🎉

@bstassel:

> Do you have AMReX GPU-aware MPI enabled?

Yeah, I think so.

# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"
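As context, a sketch of how such a variable is typically passed along on the run line (the executable and inputs file names here are hypothetical):

```shell
# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"

# hypothetical run line: the AMReX flag is appended to the inputs arguments
srun ./warpx.3d inputs_3d ${GPU_AWARE_MPI}
```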

ax3l added 6 commits April 24, 2024 20:16
  • Document how to use the Great Lakes Cluster at the University of Michigan.
  • New module added on the system.
  • Ensure one MPI rank sees exactly one, unique GPU.
@ax3l ax3l force-pushed the doc-greatlakes-umich branch from 3997ba7 to 3b0fbd6 Compare April 25, 2024 03:16
@ax3l ax3l requested review from archermarx and roelof-groenewald and removed request for archermarx April 25, 2024 03:27
@archermarx (Contributor) commented Apr 25, 2024 via email

@roelof-groenewald (Member) left a comment:

Looks good to me!

@ax3l (Member Author) commented Apr 25, 2024

@archermarx thanks a lot!!

I will preliminarily merge this to development, so the docs render and refresh and you can use the RTD page without having to switch branches:
https://warpx.readthedocs.io/en/latest/install/hpc/greatlakes.html

Please report back here if this worked - otherwise we can do a follow-up PR.

@ax3l ax3l merged commit ed7e824 into ECP-WarpX:development Apr 25, 2024
43 of 45 checks passed
@ax3l ax3l deleted the doc-greatlakes-umich branch April 25, 2024 06:48
#SBATCH --cpus-per-task=20
#SBATCH --gpus-per-task=v100:1
#SBATCH --gpu-bind=single:1
#SBATCH -o WarpX.o%j
@bstassel Apr 25, 2024:

Recommend explicitly putting

#SBATCH --mem=0

to signify that this request allocates all the memory on the node. This should happen dynamically, since --exclusive is set, but I think it is a good idea for users to include it so they have some reference to what is implicitly happening with the job request.

@ax3l (Member Author) Apr 29, 2024:
Wait, --mem=0 means all? o.0
I think --exclusive is a bit clearer for now / avoids duplication, unless it does not work to reserve all host memory.

Haavaan pushed a commit to Haavaan/WarpX that referenced this pull request Jun 5, 2024
* Great Lakes Cluster (UMich)

Document how to use the Great Lakes Cluster at the
University of Michigan.

* Fix c-blosc2 typos

* Parallel HDF5 for CUDA-aware MPI

New module added on the system.

* Fix GPU Visibility

Ensure one MPI rank sees exactly one, unique GPU.

* Add `#SBATCH --gpu-bind=single:1`

* Add clean-up message.