RCC Computing Guide
Here we provide the link for signing up for an account, along with appropriate answers to each question on the application.
- Obtain a new account. Visit this link: RCC Website Link
- Use the following responses to answer the application questions:
- Principal Investigator account name: `pi-nord`
- Software and system tools that you anticipate using for computational research at the RCC: We will use scientific Python and deep learning codebases.
- A brief summary of your work that will use RCC resources: We will perform research at the intersection of physics, cosmology, and artificial intelligence.
If you want multiple accounts, apply as above and then submit a second application that lists the first account (there's a space to list existing account affiliations).
RCC access does not require a VPN, but using one can make remote notebook access (see here) easier.
- Download Cisco AnyConnect Secure Mobility Client here
- Log in to the VPN using the address `vpn.uchicago.edu` in the Cisco AnyConnect Secure Mobility Client dialog box
- Authenticate with your Duo multi-factor authentication application
- Log in with SSH at the command line: `ssh <cnetid>@midway2.rcc.uchicago.edu`
- Authenticate your ID with the Duo multi-factor authentication application.
- Create an alias on your local machine to simplify your login:
  - Open `~/.bash_profile` locally (on your home machine).
  - Add this line: `alias sshrcc='> ~/.ssh/known_hosts; ssh midway2.rcc.uchicago.edu'`
  - Save and exit the file.
  - Test at the command line: `sshrcc`
- RCC's login nodes sometimes change IP addresses, which can cause host-key errors on your local machine. This is why we recommend the alias: it clears `~/.ssh/known_hosts` before each connection.
- `<cnetid>` is your UChicago username.
- If the username on your computer is the same as your `<cnetid>`, you can use `ssh midway2.rcc.uchicago.edu` instead; otherwise, include `<cnetid>@` explicitly (see the sketch below).
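For reference, here is a minimal sketch of the local `~/.bash_profile`, assuming your local username differs from your `<cnetid>` (so the alias includes it explicitly); the alias name `sshrcc` is the one used above:

```bash
# Local ~/.bash_profile (on your home machine)
# Clear known_hosts first to avoid host-key errors when RCC's IPs change,
# then SSH to the default Midway2 login node.
alias sshrcc='> ~/.ssh/known_hosts; ssh <cnetid>@midway2.rcc.uchicago.edu'
```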
- Login nodes are where you land -- you always log in to a login node.
- Compute nodes can be accessed from the login nodes -- use these for memory-intensive computations.
- Functioning:
  - Default: `ssh midway2.rcc.uchicago.edu`. This is the most robust option because it appears to assign you to the least-used login node when both are up, or to the live one when one is down (this is our best interpretation; we haven't confirmed with RCC staff).
  - login1 (if you know you want to be on this particular node): `ssh midway2-login1.rcc.uchicago.edu`
  - login2 (if you know you want to be on this particular node): `ssh midway2-login2.rcc.uchicago.edu`
- Non-functioning:
  - `midway2-login3.rcc.uchicago.edu` is not typically available
  - `midway.rcc.uchicago.edu` is decommissioned
- The most common (and recommended) way to access compute nodes is by running `sinteractive` (with optional flags, as described on the RCC website here and in this guide here).
- If you want to run a job for longer than you wish to be actively logged in, you can use `tmux` (see the sketch after this list).
- If you want to submit many jobs at once and don't want a separate `tmux` session for each, consider `batch` computing instead, as described here.
- `tmux` functionality is described in the user guide here.
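For example, here is a minimal sketch of a `tmux` workflow for keeping an interactive job alive after you disconnect; the session name `mysession` is an arbitrary placeholder, and the `sinteractive` flags are the KICP/A&A ones shown below:

```bash
# On a login node: load tmux and start a named session
module load tmux
tmux new -s mysession

# Inside the tmux session: request an interactive compute node
sinteractive -A kicpaa -p kicpaa

# Detach with Ctrl-b d, then log out; the session keeps running.
# Note: tmux sessions live on the specific login node where you started them,
# so log back in to that same node to reattach:
tmux attach -t mysession
```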
- RCC recommends `sinteractive` for most use cases to select a node for compute.
- If `kicpaa` is your primary affiliation (rather than `pinord`), it might work without the flags.
- There is a separate partition for the KICP/A&A GPU allotment -- to access it, use a different partition than above but the same account name:
  - To use a CPU: `sinteractive -A kicpaa -p kicpaa`
  - To use a GPU: `sinteractive -p kicpaa-gpu -A kicpaa`
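As a hedged sketch (not verified against RCC defaults), `sinteractive` also accepts common Slurm-style resource flags, so you can request a walltime, memory, and a GPU explicitly; the values here are arbitrary examples:

```bash
# Example: 2-hour interactive session with 16 GB of memory and one GPU
# on the KICP/A&A GPU partition (values are placeholders, not defaults)
sinteractive -A kicpaa -p kicpaa-gpu --time=02:00:00 --mem=16G --gres=gpu:1
```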
- Create a `.bash_profile` in your home directory on RCC and add any commands you want to run by default immediately upon logging in (a minimal sketch follows this list).
- I always load tmux in case I wind up running a job for longer than I want to remain actively logged in, so I include `module load tmux`.
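A minimal sketch of such an RCC `~/.bash_profile`; the Anaconda module is the one loaded later in this guide, and both lines are optional:

```bash
# ~/.bash_profile in your RCC home directory -- runs on every login
module load tmux                       # keep long-running jobs alive across logouts
module load python/anaconda-2021.05    # Anaconda python (see the environments section)
```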
- Activate a virtual environment (described more here): `source activate <env_name>`
- RCC uses `conda` for package management.
- RCC provides many prebuilt conda modules.
- For guidance on creating virtual environments, see the second warning in this list of "mistakes to avoid".
- After creating a virtual environment named `env_name`, RCC prefers `source activate <env_name>`, as described here.
- Load the latest Anaconda python (as of March 2023): `module load python/anaconda-2021.05`
- Use `source activate` (see the sketch after this list).
- Never use `conda init` (it has been known to break things like ThinLinc).
- Never use `conda activate`.
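Putting these together, a minimal end-to-end sketch; the environment name `env_name` and the packages are placeholders:

```bash
module load python/anaconda-2021.05     # load Anaconda python
conda create -n env_name numpy scipy    # create the environment (example packages)
source activate env_name                # activate it -- never conda init / conda activate
source deactivate                       # deactivate when finished
```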
There is an RCC guide for running Jupyter notebooks available here. Usage depends on whether or not you're on VPN.
The basic approach is as follows:
- Write a script: `mybatchscript.sh`
- Submit the script: `sbatch mybatchscript.sh <arguments>`
The batch file can use different flags. For instance, to run on the `kicpaa` partition with the `kicpaa` account, passing two arguments to a script `mypyscript.py`, the contents would look like:
```bash
#!/bin/bash
#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G
# $1 and $2 are the two arguments passed on the sbatch command line
python mypyscript.py $1 $2
```
and the two arguments get passed to `mypyscript.py` via `sys.argv` (for example). With these tools, you can iterate over the two arguments from 10 to 15 and 0 to 5, respectively, with

```bash
for i in {0..5}; do for j in {10..15}; do sbatch <file name>.sh $j $i; done; done
```
This might be unnecessarily slow if SLURM doesn't want to dispatch that many independent jobs. An alternative would be to use the `array` flag of `sbatch`, so that `mybatchscript.sh` instead looks like:
```bash
#!/bin/bash
#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --array=10-15
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G
# $SLURM_ARRAY_TASK_ID ranges over 10-15; $1 is the argument passed to sbatch
python mypyscript.py $SLURM_ARRAY_TASK_ID $1
```
The `array` flag will ensure that `mypyscript.py` is run with the first argument taking on values from 10-15 (inclusive) for any value of the second argument. To iterate over the second argument from 0-5 (inclusive), use the following at the CLI:

```bash
for i in {0..5}; do sbatch mybatchscript.sh $i; done
```
These two sets of scripts and commands will result in the same output, but they are handled by the scheduler differently (e.g., the job names will have suffixes according to their `$SLURM_ARRAY_TASK_ID` value, so the output files will be `slurm-123456789_10.out`, `slurm-123456789_11.out`, etc., rather than `slurm-123456789.out`, `slurm-123456790.out`, `slurm-123456791.out`, etc.).
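To check on submitted jobs and their output, the standard Slurm commands (not RCC-specific) suffice:

```bash
squeue -u $USER     # list your queued and running jobs (array tasks show as jobid_taskid)
ls slurm-*.out      # default Slurm output files appear in the submission directory
```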
Compute nodes aren't connected to the internet, so if you want to clone a GitHub repository hosted at `https://github.com/<myrepo>`, do the following on the login node:
- Log in to a login node
- At the CLI: `git clone https://github.com/<myrepo>.git`
- Provide your username and either your password (which you will need to re-enter every time) or a token (as documented by git here)
- If you need compilers to install a development package, RCC has `gcc`, though it must be loaded with `module load gcc` (see the sketch below).
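For instance, a hedged sketch of cloning and installing a package that needs compilation; the repository placeholder `<myrepo>` is from above, and the `pip` invocation is one common choice, not an RCC requirement:

```bash
# On a login node (compute nodes have no internet access)
module load gcc                            # compilers for building extensions
git clone https://github.com/<myrepo>.git
cd <myrepo>
pip install --user .                       # install into your user site-packages
```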
- DeepBench
- DeepGotData
- DeepUtils
- Google Colaboratory
- Elastic Analysis Facility (EAF; Fermilab)
- Research Computing Center (UChicago)
- coming soon.