Sep 10, 2019


QuantGen/HPCC

Group storage spaces

Research space:

  • /mnt/research/quantgen

Scratch spaces:

  • /mnt/gs18/scratch/groups/quantgen
  • /mnt/ls15/scratch/groups/quantgen

Directory structure within research space

  • datasets
  • projects

Shared configuration

Slurm Basics

How to submit a job?

sbatch submission_script

The submission script must start with #!/bin/bash --login; otherwise our shared configuration files will not be loaded and you won't have access to our shared installation of R.
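A wrong first line is easy to miss, so a quick check before submitting can help. The sketch below is only an illustration: the helper name check_shebang and the file names good_job.sb and bad_job.sb are made up, and only the required first line comes from the text above.

```shell
# Hypothetical helper: succeeds only if the script's first line is
# exactly the login-shell shebang required above.
check_shebang() {
    local first_line
    first_line=$(head -n 1 "$1")
    [ "$first_line" = '#!/bin/bash --login' ]
}

# Two throwaway scripts to demonstrate the check (file names are made up).
printf '%s\n' '#!/bin/bash --login' 'Rscript r_code.R' > good_job.sb
printf '%s\n' '#!/bin/bash' 'Rscript r_code.R' > bad_job.sb

check_shebang good_job.sb && echo "good_job.sb: ok"
check_shebang bad_job.sb || echo "bad_job.sb: missing --login"
```

The check is deliberately strict: anything other than the exact line, including a missing --login, is reported.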

How to specify sbatch parameters?

Parameters can be specified on the command line (sbatch {params} submission_script) or within the submission script using specially formatted comments (#SBATCH param).
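To make the two styles concrete, here is a small sketch. The file name demo_params.sb and the parameter values are assumptions chosen for illustration. The command is only printed, not run, so the snippet also works on a machine without Slurm.

```shell
# Style 1: parameters inside the script as #SBATCH comments.
cat > demo_params.sb <<'EOF'
#!/bin/bash --login
#SBATCH --job-name=demo
#SBATCH --time=0:10:00
#SBATCH --mem=1G

echo "Running on $(hostname)"
EOF

# Style 2: parameters on the command line. Here only the time limit is
# given, the remaining parameters still come from the #SBATCH lines.
submit_cmd=(sbatch --time=0:30:00 demo_params.sb)

# Print the command instead of running it.
echo "${submit_cmd[*]}"
```

When the same parameter appears in both places, sbatch uses the command-line value, so the job in this sketch would get a 30-minute limit.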

Basic parameters

Assuming you are using R:

Long name      Short name   Notes
--job-name     -J
--mem                       Units: K|M|G|T
--time         -t           Format: hours:minutes:seconds
--mail-type                 BEGIN,FAIL,END

Advanced parameters

Assuming you are using R:

Long name         Short name   Notes
--cpus-per-task   -c           Only set if you want to use multiple cores
--array           -a           Format: 1-n; use Sys.getenv("SLURM_ARRAY_TASK_ID")
                               to find out which task in the array you are running
--tmp                          Units: K|M|G|T; reserves space on the local
                               file system (/mnt/local/job_id)
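As an illustration of --array, the sketch below writes an array submission script and then mimics, outside Slurm, how a task could pick its input from the task ID. The names array_job.sb, analyze_chunk.R, and chunk_N.txt are hypothetical, as are the parameter values.

```shell
# Array job sketch: 10 tasks, each started with its own SLURM_ARRAY_TASK_ID.
cat > array_job.sb <<'EOF'
#!/bin/bash --login
#SBATCH --job-name=array_demo
#SBATCH --time=1:00:00
#SBATCH --mem=2G
#SBATCH --array=1-10

# Inside R, read the index with Sys.getenv("SLURM_ARRAY_TASK_ID").
Rscript analyze_chunk.R
EOF

# Outside Slurm the variable is not set, so fall back to 3 to try the
# file-selection logic by hand.
task_id="${SLURM_ARRAY_TASK_ID:-3}"
input_file="chunk_${task_id}.txt"
echo "Task ${task_id} would read ${input_file}"
```

Note that Sys.getenv returns a character string, so R code that uses the ID as a number needs as.integer() around it.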

Irrelevant parameters

Assuming you are using R:

Long name   Short name   Notes
--nodes     -N           Always 1
--ntasks    -n           Always 1

Example submission script for R code

#!/bin/bash --login
#SBATCH --job-name=my_job
#SBATCH --time=4:00:00
#SBATCH --mem=10G

Rscript r_code.R
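One possible way to go from the example above to a submitted job is sketched here. The file name my_job.sb is an assumption, and the submission step is guarded so that the snippet does nothing harmful on a machine without Slurm.

```shell
# Save the example above under a hypothetical file name.
cat > my_job.sb <<'EOF'
#!/bin/bash --login
#SBATCH --job-name=my_job
#SBATCH --time=4:00:00
#SBATCH --mem=10G

Rscript r_code.R
EOF

# Catch shell syntax errors before the job waits in the queue.
bash -n my_job.sb && echo "syntax ok"

# Submit only where Slurm is available. --parsable prints just the job ID
# (followed by the cluster name on multi-cluster setups).
if command -v sbatch >/dev/null 2>&1; then
    job_id=$(sbatch --parsable my_job.sb)
    echo "Submitted job ${job_id}"
else
    echo "sbatch not found; run this on an HPCC node"
fi
```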

Why not R CMD BATCH?

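  • Crazy defaults: --restore --save --no-readline (an existing .RData in the working directory is loaded at startup, and the workspace is saved on exit)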
  • If you really want to use it, be sure to specify --no-restore --no-save

How to check on currently running jobs?

squeue -u ${USER} or squeue --user=${USER}

I don't like the default output format (job ID, partition, job name, user name, job state, time used, number of nodes, node list or reason why pending) and prefer --Format=jobarrayid:20,name:40,state:10,timeused:15,nodelist.
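To avoid retyping the format string, it could be wrapped in a small shell function, for example in ~/.bashrc. This is only a sketch: the names my_jobs and QG_SQUEUE_FIELDS are made up, and the field list is the one given above.

```shell
# Preferred squeue fields from the text above (variable name is hypothetical).
QG_SQUEUE_FIELDS='jobarrayid:20,name:40,state:10,timeused:15,nodelist'

# Hypothetical helper: list my jobs in the preferred format.
# Extra arguments are passed through to squeue.
my_jobs() {
    squeue --user="${USER}" --Format="${QG_SQUEUE_FIELDS}" "$@"
}

# Call it only where squeue exists.
if command -v squeue >/dev/null 2>&1; then
    my_jobs
else
    echo "squeue not found; run this on an HPCC node"
fi
```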

How to check on past jobs?

sacct -u ${USER} or sacct --user=${USER}

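Note that --starttime defaults to midnight of the current day, so jobs that finished before today are not listed unless you pass an earlier --starttime.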

I don't like the default output format (jobs, job steps, status, and exitcodes) and prefer --format=jobid%20,jobname%30,elapsed,timelimit,maxrss,reqmem,exitcode.
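Because of the --starttime default, a helper that takes the start date as an argument can be handy. The sketch below assumes the made-up names past_jobs and QG_SACCT_FIELDS; the field list is the one given above, and the date is passed in YYYY-MM-DD form.

```shell
# Preferred sacct fields from the text above (variable name is hypothetical).
QG_SACCT_FIELDS='jobid%20,jobname%30,elapsed,timelimit,maxrss,reqmem,exitcode'

# Hypothetical helper: show my jobs since a given date.
# $1: start date as YYYY-MM-DD (defaults to today).
past_jobs() {
    local start="${1:-$(date +%F)}"
    sacct --user="${USER}" --starttime="${start}" --format="${QG_SACCT_FIELDS}"
}

# Call it only where sacct exists.
if command -v sacct >/dev/null 2>&1; then
    past_jobs
else
    echo "sacct not found; run this on an HPCC node"
fi
```

For example, past_jobs 2019-09-01 would list everything since September 1, 2019.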

How to cancel jobs?

scancel job_id
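Besides a single job ID, scancel can also select jobs by user and state. The following is a sketch with a made-up helper name, cancel_pending; the job ID 123456 in the comments is a placeholder, not a real job.

```shell
# Other forms, shown as comments (123456 is a placeholder job ID):
#   scancel 123456      cancel one job, or a whole array job
#   scancel 123456_7    cancel only task 7 of array job 123456

# Hypothetical helper: cancel only my jobs that are still waiting in the
# queue and leave running jobs alone.
cancel_pending() {
    scancel --user="${USER}" --state=PENDING
}
```

The helper is only defined here, not called, since cancelling is not reversible.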

How to run an interactive job?

srun {params} --pty /bin/bash
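As a concrete instance of {params}, the sketch below wraps the command in a function with a time limit and a memory request. The helper name interactive and the default values (1 hour, 4G) are assumptions for illustration, not group recommendations.

```shell
# Hypothetical helper: start an interactive shell on a compute node.
# $1: time limit as hours:minutes:seconds (default 1:00:00)
# $2: memory request (default 4G)
interactive() {
    local time_limit="${1:-1:00:00}"
    local memory="${2:-4G}"
    srun --time="${time_limit}" --mem="${memory}" --pty /bin/bash
}
```

Calling interactive 2:00:00 8G would then request two hours and 8G of memory; the shell opens once the resources have been allocated.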
