Julia Sloan edited this page Jan 16, 2025 · 10 revisions

clima is a GPU node with 8x NVIDIA A100 GPUs.

Getting access

Email [email protected] and request access

Setting up

Unlike central, clima has only a handful of modules available. The recommended approach is to install any other software you need in your home directory.

SSH config

Add the following to your local ~/.ssh/config file:

Host clima
  User [username]

To access from outside the network, either use the Caltech VPN or add a Match rule to the same file, along these lines:

Match final host !,* !exec "nc -z -G 1 22"

About the machine


  • /home/[username] (capped at 1TB): mounted from sampo and backed up
  • /net/sampo/data1 (200TB): mounted from sampo. Not backed up, but has some protection from RAID redundancy
  • /scratch (70TB): fast SSD. Not backed up and has no RAID redundancy

CPU usage

  • top shows live per-process CPU and memory usage


clima has 8 × NVIDIA A100 GPUs (80GB each), connected via NVLink.

  • nvidia-smi gives a summary of all the GPUs
    • nvidia-smi topo -m shows the connections between GPUs and CPUs
  • nvtop gives a live-updating view of current GPU usage
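For a scriptable view of the same information, nvidia-smi can emit CSV. The sketch below lists the GPUs whose memory use is under a threshold. The helper name idle_gpus and the 100 MiB default are illustrative assumptions, not a site convention, and low memory use is only a rough hint that a GPU is free; allocation should still go through Slurm.

```shell
#!/usr/bin/env bash
# Sketch: print the indices of GPUs that look idle, judged by memory in use.
# The threshold (MiB) is an arbitrary assumption, not a site policy.

# Reads "index, memory.used" CSV rows on stdin and prints each index
# whose memory use is below the threshold given as $1 (default 100).
idle_gpus() {
    local threshold_mib="${1:-100}"
    awk -F', *' -v t="$threshold_mib" '$2 + 0 < t { print $1 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # noheader,nounits yields rows such as "3, 1024"
    nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | idle_gpus 100
else
    echo "nvidia-smi not found; run this on clima" >&2
fi
```

Inside a Slurm allocation, nvidia-smi typically reports only the GPUs assigned to the job, so this is most informative when run from a login shell.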


clima has a single-node installation of Slurm.

We have set up a common environment, which you can load with

module load common

which currently loads

openmpi/4.1.5-cuda julia/1.9.3 cuda/julia-pref

This sets the appropriate Julia preferences, so you should not need to call, for example, MPIPreferences.use_system_binary() yourself.
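To confirm that the environment loaded as expected, a quick check along the following lines may help. This is a minimal sketch: check_tool is a hypothetical helper, and the assumption that the modules put julia and mpiexec on PATH is ours rather than something the page states.

```shell
#!/usr/bin/env bash
# Sketch: load the common environment and report which tools are on PATH.

# Prints "ok: <name> -> <path>" if the command is found, "missing: <name>" otherwise.
check_tool() {
    local path
    if path="$(command -v "$1")"; then
        echo "ok: $1 -> $path"
    else
        echo "missing: $1"
        return 1
    fi
}

# `module` is a shell function supplied by the module system;
# skip this step if it is not defined in the current shell.
if type module >/dev/null 2>&1; then
    module load common
    module list
fi

# Assumed to be provided by julia/1.9.3 and openmpi/4.1.5-cuda respectively.
for tool in julia mpiexec; do
    check_tool "$tool" || true
done
```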

Usage etiquette

Please avoid using clima for long-running CPU-only jobs. The Resnick HPC cluster is better for that.

GPUs can be used directly, but we recommend always scheduling jobs through Slurm: this prevents multiple jobs from being placed on the same GPU, which can degrade performance significantly.

For example:

$ srun --gpus=2 --pty bash -l # request an interactive session with 2 GPUs

$ nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-80GB (UUID: GPU-1768fcec-d945-7435-1f8e-85d30cdf310e)
GPU 1: NVIDIA A100-SXM4-80GB (UUID: GPU-6420b6b9-bb34-a58d-8090-61887fd97931)
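For non-interactive work, the same kind of request can be placed in a batch script and submitted with sbatch. The sketch below writes such a script and submits it if sbatch is available. The file name gpu_job.sh, the job name, the one-hour time limit, and my_script.jl are all hypothetical placeholders; whether clima needs further options (for example a partition) is not stated on this page.

```shell
#!/usr/bin/env bash
# Sketch: write a one-GPU batch job and submit it through Slurm.
# All names and limits below are illustrative assumptions.

cat > gpu_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=example
#SBATCH --gpus=1
#SBATCH --time=01:00:00
#SBATCH --output=example_%j.log

# Load the shared environment, then show which GPU Slurm assigned.
module load common
nvidia-smi -L

# Hypothetical Julia entry point; replace with your own script.
julia --project=. my_script.jl
EOF

if command -v sbatch >/dev/null 2>&1; then
    sbatch gpu_job.sh
else
    echo "sbatch not found; submit gpu_job.sh from clima" >&2
fi
```

In the --output pattern, %j expands to the job ID, so each run writes to its own log file.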

See also notes on interactive jobs via Caltech-HPC:

Weekend scheduled runs on clima

(updated Jan 16, 2025)

ClimaAtmos longruns

  • Friday 10pm PST - (est.) Saturday 6pm PST
  • runs use 18 x 1 GPU (up to 12h each)

ClimaCoupler benchmarks

  • Saturday 9pm PST - (est.) Sunday 12am PST
  • runs use 4 x 4 GPUs (10-15 mins each)

ClimaCoupler longruns

  • Sunday 12am PST - (est.) Monday 12am PST
  • runs use 2 x 1 GPU on clima (22h each)

ClimaCoupler AMIP

  • Sunday 12am PST - (est.) Wednesday 12am PST
  • run uses 1 GPU (3 days)