Please review this article about batch computing before trying to submit SLURM jobs to the clusters!
Here are links to recordings that explain this topic in detail:
1. Introduction to submitting jobs on the CCR clusters (Password: CCR-HPC2020)
2. Job Submission Strategies & Utilizing Idle Nodes (Password: CCR-HPC2020)
We highly recommend viewing the "Introduction to submitting jobs on the CCR clusters" recording first.
NOTE: the term "script" is used throughout this documentation to mean an executable file that you create and submit to the job scheduler to run on a node or collection of nodes. The script will include a list of SLURM directives (or commands) to tell the job scheduler what to do. Details and options for these scripts are below.
The job flags are used with the sbatch command. The syntax for a SLURM directive in a script is "#SBATCH <flag>". Some of the flags are also used with the srun and salloc commands, as well as the fisbatch wrapper script for interactive jobs.
| Resource | Flag Syntax | Description | Notes |
| --- | --- | --- | --- |
| partition | --partition=general-compute | Partition is a queue for jobs; see the list of available partitions. | default on ub-hpc is general-compute |
| qos | --qos=general-compute | QOS is the quality of service value (limits or priority boost). | default on ub-hpc is general-compute |
| time | --time=01:00:00 | Time limit for the job. | the example requests 1 hour; default is 72 hours |
| nodes | --nodes=2 | Number of compute nodes for the job. | default is 1 |
| cpus/cores | --ntasks-per-node=8 | Corresponds to the number of cores on the compute node. | default is 1 |
| node type | --constraint=IB or --constraint=IB&CPU-E564 | Node type feature; IB requests nodes with InfiniBand. | default is no node type specified |
| resource feature | --gres=gpu:2 | Request use of GPUs on compute nodes. | default is no feature specified |
| memory | --mem=24000 | Memory limit per compute node for the job. Do not use with the --mem-per-cpu flag. | memory in MB; default limit is 3000MB per core |
| memory | --mem-per-cpu=4000 | Per-core memory limit. Do not use with the --mem flag. | memory in MB; default limit is 3000MB per core |
| account | --account=group-slurm-account | Users may belong to groups or accounts. | default is the user's primary group |
| job name | --job-name="hello_test" | Name of the job. | default is the JobID |
| output file | --output=test.out | Name of the file for stdout. | default is slurm-<JobID>.out |
| email address | --mail-user=username@buffalo.edu | User's email address. | required |
| email notification | --mail-type=ALL or --mail-type=END | When email is sent to the user. | omit for no email |
| access | --exclusive | Exclusive access to compute nodes. | default is sharing nodes |
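For illustration, the same flags can appear either as #SBATCH directives in a script or directly on the command line with srun or salloc; the values below are example settings only, not recommendations:
#SBATCH --partition=general-compute --qos=general-compute
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
The equivalent interactive request with salloc:
salloc --partition=general-compute --qos=general-compute --time=01:00:00 --nodes=1 --ntasks-per-node=8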
Helpful Hints:

Requesting Nodes with InfiniBand
MPI jobs should request nodes that have InfiniBand.

Using the Debug Nodes
There are 7 compute nodes in the debug partition. The maximum time for a job is 1 hour. Submit a job to this partition using "--partition=debug" (a combined example is shown after these hints).

Using Large Memory Nodes
#SBATCH --ntasks-per-node=32
#SBATCH --mem=250000

Using GPUs
#SBATCH --gres=gpu:2

Requesting Specific Nodes
#SBATCH --nodelist=cpn-f16-35,cpn-f16-37
(-w is the short form of --nodelist)

Excluding Specific Nodes
#SBATCH --exclude=cpn-f16-35,cpn-f16-37

Creating a Node List
export NODELIST=nodelist.$
srun -l bash -c 'hostname' | sort | awk '{print $2}' > $NODELIST
cat $NODELIST
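Putting a few of these hints together, a short test run on the debug partition might start with a header like the sketch below; the node, task, and time values are illustrative (remember the 1 hour limit), and depending on the partition configuration a matching --qos value may also be required:
#SBATCH --partition=debug
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8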
Create a SLURM script with an editor such as vi or emacs, following steps 1 through 3 below. The script (or file) can be named anything you want but should end in .sh (e.g., myscript.sh). If you are unfamiliar with the UNIX commands to edit files, please read this article.
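For example, to open a new script file with vi (the filename is only a placeholder):
vi myscript.sh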
Step 1: Resource Specification
#!/bin/sh
#SBATCH --partition=general-compute --qos=general-compute
#SBATCH --time=00:15:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --constraint=IB
#SBATCH --mem=23000
# Memory per node specification is in MB. It is optional.
# The default limit is 3000MB per core.
#SBATCH --job-name="hello_test"
#SBATCH --output=test-srun.out
#SBATCH --mail-user=username@buffalo.edu
#SBATCH --mail-type=ALL
##SBATCH --requeue
# Specifies that the job will be requeued after a node failure.
# The default is that the job will not be requeued.
Step 2: Variables, Paths and Modules
echo "SLURM_JOBID="$SLURM_JOBID
echo "SLURM_JOB_NODELIST"=$SLURM_JOB_NODELIST
echo "SLURM_NNODES"=$SLURM_NNODES
echo "SLURMTMPDIR="$SLURMTMPDIR
echo "working directory = "$SLURM_SUBMIT_DIR
module load intel/13.1
module load intel-mpi/4.1.3
module list
ulimit -s unlimited
#
Step 3: Launch Application
# The initial srun will trigger the SLURM prologue on the compute nodes.
NPROCS=`srun --nodes=${SLURM_NNODES} bash -c 'hostname' |wc -l`
echo NPROCS=$NPROCS
echo "Launch helloworld with srun"
#The PMI library is necessary for srun
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
srun ./helloworld
#
echo "All Done!"
Step 4: Submit Job
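From a login node, submit the script with sbatch; this assumes the script from steps 1 through 3 was saved as myscript.sh:
sbatch myscript.sh
If the submission is accepted, sbatch prints a line such as "Submitted batch job <JobID>". Note the JobID; it is used to check on or cancel the job.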
Step 5: Check Status of Job
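Job status can be checked with squeue. A minimal sketch, assuming you want to list all of your own jobs (replace username with your CCR username, or use -j <JobID> for a single job):
squeue -u username
The ST column of the squeue output shows the job state codes described below.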
R indicates that the job is running.
PD indicates that the job is pending. The job is waiting in the queue.
Cancel a Job
A queued or running job can be cancelled.
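Use scancel with the JobID reported at submission time; the JobID below is a placeholder:
scancel 1234567
To cancel all of your own jobs at once, scancel -u username also works.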