# clima

`clima.gps.caltech.edu` is a GPU node with 8× NVIDIA A100 GPUs. Email [email protected] to request access.
Unlike `central`, `clima` has only a handful of modules available. The recommended approach is to install software in your home directory.
Add the following to your local `~/.ssh/config` file:

```
Host clima
    HostName clima.gps.caltech.edu
    User [username]
```

To access from outside the Caltech network, either use the Caltech VPN, or add the following to route the connection through `ssh.caltech.edu` as a jump host:

```
Match final host !ssh.caltech.edu,*.caltech.edu !exec "nc -z -G 1 login.hpc.caltech.edu 22"
    ProxyJump ssh.caltech.edu
```
- `/home/[username]` (capped at 1TB): mounted from `sampo`, and is backed up.
- `/net/sampo/data1` (200TB): mounted from `sampo`. Not backed up, but somewhat protected by a redundant RAID partition.
- `/scratch` (70TB): fast SSD; not backed up, and no RAID redundancy.
`clima` has 8× NVIDIA 80GB A100 GPUs, connected via NVLink.
- `nvidia-smi` gives a summary of all the GPUs.
- `nvidia-smi topo -m` shows the connections between GPUs and CPUs.
- `nvtop` gives a live-refreshing view of current GPU usage.
`clima` has a single-node installation of Slurm. While the GPUs can be used directly, it is recommended to always schedule jobs through Slurm: this prevents multiple jobs being allocated to the same GPU, which can cause significant performance degradation.
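As a sketch, a minimal batch script requesting a single GPU might look like the following (the job name, time limit, and workload are placeholders; `--gres=gpu:1` is the standard Slurm syntax for requesting one GPU):

```shell
#!/bin/bash
#SBATCH --job-name=my-gpu-job   # placeholder job name
#SBATCH --gres=gpu:1            # request a single GPU
#SBATCH --time=01:00:00         # placeholder time limit

# Replace with your actual workload:
nvidia-smi
```

Submit with `sbatch script.sh` and check the queue with `squeue`.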
We have set up a common environment, which you can load with

```
module load common
```

This will set the appropriate Julia preferences, so you should not need to, e.g., call `MPIPreferences.use_system_binary()`.