
RCC Computing Guide

samueldmcdermott edited this page May 22, 2023 · 15 revisions

Getting a New Account

Here, we show the link for signing up for an account, followed by the appropriate answer to each question in the application.

  1. Obtain a new account. Visit this link: RCC Website Link
  2. Use the following responses to answer the application questions:
    1. Principal Investigator account name:
      pi-nord
      
    2. Software and system tools that you anticipate using for computational research at the RCC:
      We will use scientific Python and deep learning codebases.
      
    3. A brief summary of your work that will use RCC resources:
      We will perform research at the intersection of physics, cosmology, and artificial intelligence.
      

If you want multiple accounts, apply as above, then apply a second time and list the first account on that application (there's a space to list existing account affiliations).

   

VPN

RCC access does not require the use of a VPN, but it can make remote notebook access (see here) easier.

  1. Download Cisco AnyConnect Secure Mobility Client here
  2. Log into the VPN using the address vpn.uchicago.edu in the Cisco AnyConnect Secure Mobility Client dialog box
  3. Authenticate with your Duo multi-factor authentication application

Logging In

Instructions

  1. Log in with SSH at the command line:
    ssh <cnetid>@midway2.rcc.uchicago.edu
  2. Authenticate with the Duo multi-factor authentication application.
  3. Create an alias on your local machine to simplify your login:
    1. Open ~/.bash_profile locally (on your home machine).
    2. Add this line:
      alias sshrcc='> ~/.ssh/known_hosts; ssh midway2.rcc.uchicago.edu'
    3. Save and exit the file.
    4. Test at the command line:
      sshrcc

Notes

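  1. RCC often changes its IP address, which may cause errors on your local machine from stale entries in ~/.ssh/known_hosts. The alias empties that file before connecting, which is why we recommend creating it. Note that this removes the stored keys for every host, not just RCC.
  2. <cnetid> is your UChicago username.
  3. If the username on your computer is the same as your <cnetid>, you can use ssh midway2.rcc.uchicago.edu instead.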
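
As a narrower alternative to the alias above, the sketch below removes only the stored key for the RCC host instead of emptying all of ~/.ssh/known_hosts. It is not from the original guide: the function body, the CNETID placeholder, and the RCC_HOST variable are illustrative assumptions, while ssh-keygen -R is the standard OpenSSH option for deleting a host's entries.

```shell
# Sketch for ~/.bash_profile on your LOCAL machine (not on RCC).
# CNETID is a placeholder: replace it with your own UChicago username.
CNETID="your_cnetid"
RCC_HOST="midway2.rcc.uchicago.edu"

sshrcc() {
    # Drop only the saved host key(s) for the RCC host; entries for all
    # other hosts in ~/.ssh/known_hosts are left untouched.
    ssh-keygen -R "$RCC_HOST" >/dev/null 2>&1
    # Connect; ssh will ask you to accept the host key again.
    ssh "${CNETID}@${RCC_HOST}"
}
```

After reloading your profile, typing sshrcc behaves like the alias. Either approach discards the previously trusted host key, so ssh can no longer warn you if the key has changed unexpectedly.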

   



Nodes on RCC

  1. Login nodes are your landing nodes: you always log in to a login node.
  2. Compute nodes can be accessed from the login nodes. Use these for memory-intensive computations.

Login Nodes

  1. Functioning:
    1. Default node. This is the most robust option: as far as we can tell, it assigns you to the least-used node if both are up, or to the live one if one is down. We haven't confirmed this with RCC staff.
    ssh midway2.rcc.uchicago.edu
    2. login1 (if you know you want to be on this particular node)
    ssh midway2-login1.rcc.uchicago.edu
    3. login2 (if you know you want to be on this particular node)
    ssh midway2-login2.rcc.uchicago.edu
  2. Non-functioning:
    1. midway2-login3.rcc.uchicago.edu is not typically available.
    2. midway.rcc.uchicago.edu is decommissioned.

Compute Nodes

  1. The most common (and recommended) way to access compute nodes is by running sinteractive (with optional flags as described on the RCC website here, and in this guide here).
  2. If you want to run a job for longer than you wish to be actively logged in, you can use tmux.
    1. tmux is a terminal multiplexer.
    2. There are many guides online, e.g. here.
    3. tmux is not loaded by default, so it must be loaded before you start an interactive job.
    4. We recommend loading it upon login as described here.
  3. If you want to submit many jobs at once and don't want a separate tmux session for each one, consider batch computing instead, as described here.
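
The tmux workflow in item 2 can be sketched with a small helper that attaches to a named session if it exists and creates it otherwise. The function name rcc_tmux and the default session name are made up for illustration; has-session, attach-session, and new-session are standard tmux subcommands.

```shell
# Hypothetical helper, e.g. for ~/.bash_profile on RCC.
# Run `module load tmux` first so that the tmux command is available.
rcc_tmux() {
    session="${1:-rcc}"    # "rcc" is an arbitrary default session name
    if tmux has-session -t "$session" 2>/dev/null; then
        # The session survived an earlier logout: reconnect to it.
        tmux attach-session -t "$session"
    else
        # First use: start a fresh named session.
        tmux new-session -s "$session"
    fi
}
```

A typical use would be to run rcc_tmux myrun, start sinteractive inside the session, launch the job, and detach with Ctrl-b d. A tmux session lives on the machine where it was created, so to reattach you would need to log in to the same login node (for example midway2-login1 explicitly) before running rcc_tmux myrun again.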

   


Usage

  1. Functionality is described in the user guide here.

  2. RCC recommends sinteractive for most use cases to select a node for compute.

  3. If kicpaa is your primary affiliation (rather than pi-nord), it might work without the flags.

  4. There is a separate partition for the KICP/A&Ap GPU allotment. To access it, use a different partition from the CPU case but the same account name.

  5. To use a CPU:

    sinteractive -A kicpaa -p kicpaa
  6. To use a GPU:

    sinteractive -p kicpaa-gpu -A kicpaa
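
To avoid retyping the account and partition flags, the two commands above could be wrapped in a small helper. This is only a sketch: the function name rcc_interactive_cmd is hypothetical, and it prints the command rather than running it so that it can be checked on any machine.

```shell
# Hypothetical helper: print the sinteractive command for a resource type.
# The account and partition names are the ones listed in this guide.
rcc_interactive_cmd() {
    case "$1" in
        cpu) printf '%s\n' "sinteractive -A kicpaa -p kicpaa" ;;
        gpu) printf '%s\n' "sinteractive -A kicpaa -p kicpaa-gpu" ;;
        *)   printf '%s\n' "usage: rcc_interactive_cmd cpu|gpu" >&2
             return 1 ;;
    esac
}

# Show the command that would be used for a GPU session.
rcc_interactive_cmd gpu
```

On a login node, the printed command can be pasted at the prompt, or run directly with eval "$(rcc_interactive_cmd gpu)".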

Bash profile

  1. Create a .bash_profile in your home directory on RCC and add any commands you want to run by default immediately upon logging in
  2. I always load tmux in case I wind up running a job for longer than I want to remain actively logged in, so I include
    module load tmux
  3. Activate a virtual environment (described more here):
    source activate <env_name>
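
Putting these pieces together, a .bash_profile on RCC might look like the sketch below. The ordering is an assumption on our part: source activate relies on the Anaconda module from the Conda section having been loaded first. ENV_NAME is a placeholder, and the guard around module is only there so the file stays harmless on machines without environment modules.

```shell
# Sketch of ~/.bash_profile on RCC.
# ENV_NAME is a placeholder: replace it with the name of your environment.
ENV_NAME="my_env"

# `module` exists only on systems with environment modules (such as RCC).
if command -v module >/dev/null 2>&1; then
    module load tmux                       # make tmux available before any job starts
    module load python/anaconda-2021.05    # provides conda and its activate script
    source activate "$ENV_NAME"            # RCC-preferred form; avoid `conda activate`
fi
```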

Conda

Notes

  1. RCC uses conda for package management
  2. RCC provides many prebuilt conda modules.
  3. For guidance on creating virtual environments, see the second warning in this list of “mistakes to avoid”
  4. After creating a virtual environment named env_name, RCC prefers source activate <env_name> as described here

Instructions

  1. Load the latest Anaconda python (as of March 2023):
    module load python/anaconda-2021.05
  2. Use
    source activate
  3. Never use the following (it has been known to break things like ThinLinc):
    conda init
  4. Never use:
    conda activate

Notebooks

There is an RCC guide for running Jupyter notebooks available here. Usage depends on whether or not you're on VPN.

Submitting Batch jobs

Notes:

  1. There is a guide for submitting batch jobs.
  2. There are also lecture notes from a KICP class here.

The basic approach is as follows:

  1. Write a script: mybatchscript.sh
  2. Submit a script:
    sbatch mybatchscript.sh <arguments>

The batch file can contain different flags. For instance, to run on the kicpaa partition with the kicpaa account while passing two arguments to a script mypyscript.py, the contents would look like

#!/bin/bash

#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G

python mypyscript.py $1 $2

and the two arguments get passed to mypyscript.py via sys.argv (for example). With these tools, you can iterate over the two arguments from 10 to 15 and 0 to 5, respectively, with

for i in {0..5}; do for j in {10..15}; do sbatch <file name>.sh $j $i; done; done

This might be unnecessarily slow if SLURM doesn’t want to dispatch that many independent jobs. An alternative is to use the array flag of sbatch, in which case mybatchscript.sh looks like

#!/bin/bash

#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --array=10-15
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G

python mypyscript.py $SLURM_ARRAY_TASK_ID $1

The array flag ensures that mypyscript.py is run with the first argument taking on values from 10 to 15 (inclusive) for any value of the second argument. To iterate over the second argument from 0 to 5 (inclusive), use the following at the CLI:

for i in {0..5}; do sbatch mybatchscript.sh $i; done

These two sets of scripts and commands produce the same output, but the scheduler handles them differently. For example, array jobs carry a suffix given by their $SLURM_ARRAY_TASK_ID value, so with SLURM's default naming the output files will be slurm-123456789_10.out, slurm-123456789_11.out, etc. rather than slurm-123456789.out, slurm-123456790.out, slurm-123456791.out, etc.
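
A further variation, not part of the original recipe, is to cover both arguments with a single array so that only one sbatch call is needed. The sketch below assumes a flat index from 0 to 35 and a hypothetical helper task_to_args that maps it onto the 6 x 6 grid of argument values. The final line only echoes the command, so the script can be dry-run outside of SLURM.

```shell
#!/bin/bash

#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --array=0-35
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G

# Hypothetical helper: map a flat task index (0..35) onto the grid.
# Prints "<first> <second>" with first in 10..15 and second in 0..5.
task_to_args() {
    n=6    # number of distinct values of the first argument
    echo "$(( 10 + $1 % $n )) $(( $1 / $n ))"
}

# Outside of SLURM the variable is unset, so fall back to task 0.
args=$(task_to_args "${SLURM_ARRAY_TASK_ID:-0}")

# Dry run: on the cluster, replace this echo with
#   python mypyscript.py $args
echo "would run: python mypyscript.py $args"
```

With this layout, sbatch mybatchscript.sh is submitted once, and each of the 36 array tasks picks its own pair of arguments.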

Using Git Repositories

Notes

Compute nodes aren't connected to the internet, so if you want to clone a GitHub repository hosted at https://github.com/<myrepo> do the following on the login node.

  1. Log in to a login node
  2. CLI:
    git clone https://github.com/<myrepo>.git
  3. Provide your username and either your password (which you will need to reenter every time) or a token (as documented by git here)
  4. If you need compilers to install a development package, RCC has gcc, though it must be loaded with module load gcc.

DeepSkies Toolbox

Currently Available Software

  1. DeepTemplate-Tools
  2. DeepTemplate-Science

Coming soon

  1. DeepBench
  2. DeepGotData
  3. DeepUtils

Computational Facilities

  1. Google Colaboratory
  2. Elastic Analysis Facility (EAF; Fermilab)
    1. EAF ReadtheDocs
    2. Quick Onboarding Guide
  3. Research Computing Center (UChicago)

Computing Guides:

  1. Coming soon.