Annotated SLURM¶
This is a walk-through of a basic SLURM scheduler job script for the common case of a multi-threaded analysis. If the program you run is single-threaded (can use only one CPU core), use only the '--ntasks=1' line for the CPU request instead of all three lines (nodes, ntasks, and cpus-per-task). Annotations are marked with bullet points.
The full script is provided at the end of the article. Values in angle brackets (<>) are placeholders. Replace them with your own values (e.g. change <JOBNAME> to something like "blast_proj22"). We will write additional documentation on more complex job layouts for MPI jobs and other situations where a simple count of processor cores is not sufficient.
Set the shell to use¶
#!/bin/bash
Common SBATCH options¶
job-name¶
job-name is used to name the job to make it easier to see in the job queue.
#SBATCH --job-name=<JOBNAME>
mail-user and mail-type¶
mail-user allows users to provide one or more (comma-separated) email addresses to use for batch system communications.
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-user=<EMAIL-ONE>,<EMAIL-TWO>
mail-type allows users to control when emails are sent to the provided addresses by giving one or more (comma-separated) of the following options:
- NONE: No emails for this job.
- ALL: All emails for this job.
- BEGIN: Email when this job starts.
- END: Email a summary at the end of the job.
- FAIL: Email if this job fails.
#SBATCH --mail-type=FAIL,END
output and error¶
output and error are the paths of the files to which the script's stdout and stderr are written when the job runs.
Some common file patterns:
- %j: Job ID
- %A-%a: Array job ID (A) and task ID (a)
#SBATCH --output <my_job_%j.out>
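As a minimal sketch of how the two streams are separated, the script below sends one message to stdout and one to stderr. Under the directives shown, they would land in the .out and .err files respectively. The file names and the report function are illustrative assumptions, not a required layout.

```shell
#!/bin/bash
# stdout goes to the --output file; %j is replaced by the job ID.
#SBATCH --output=my_job_%j.out
# stderr goes to the --error file.
#SBATCH --error=my_job_%j.err

# report is a hypothetical helper used only for this illustration.
report() {
    echo "progress: $1"        # written to stdout
    echo "warning: $1" >&2     # written to stderr
}

report "step 1"
```

Keeping the two streams in separate files makes it easier to spot failures without scrolling through normal program output.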
nodes¶
nodes indicates the number of nodes to use when running this script.
For all non-MPI jobs, this number will be equal to 1.
#SBATCH --nodes=1
ntasks¶
ntasks indicates the number of tasks associated with this script.
For all non-MPI jobs, this number will be equal to 1.
#SBATCH --ntasks=1
cpus-per-task¶
cpus-per-task indicates the number of CPU cores to use when running this script.
This number must match the thread-count argument passed to the program being run.
#SBATCH --cpus-per-task=4
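One way to keep the two numbers in sync, sketched below, is to read the core count from SLURM_CPUS_PER_TASK, an environment variable that SLURM sets inside the job when --cpus-per-task is given, rather than typing the number twice. The fallback to 1 and the program name my_program are assumptions for illustration.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Use the SLURM-provided value; fall back to 1 when run outside a job.
THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using ${THREADS} thread(s)"

# my_program and its --threads flag are hypothetical placeholders:
# my_program --threads "${THREADS}" input.dat
```

With this pattern, changing the --cpus-per-task line is enough to change the number of threads the program uses.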
mem¶
mem indicates the total memory limit for the job.
Units can be specified with mb (megabytes) or gb (gigabytes).
The default memory is 2 gigabytes.
#SBATCH --mem=4gb
time¶
time indicates the job's total run time limit.
The run time is formatted days-hours:minutes:seconds. The days part is optional, but should be used when convenient.
The default run time is 10 minutes.
#SBATCH --time=72:00:00
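To check that two ways of writing a limit are equivalent, a small helper like the one below can convert a time string to seconds. This is only a sketch: to_seconds is a hypothetical function, and it assumes the hours:minutes:seconds form with an optional leading "days-" part, not every format SLURM accepts.

```shell
#!/bin/bash
# to_seconds (hypothetical helper): convert [days-]hours:minutes:seconds to seconds.
# The function body runs in a subshell so its variables do not leak.
to_seconds() (
    spec="$1"
    days=0
    case "$spec" in
        # Split off the optional "days-" prefix.
        *-*) days="${spec%%-*}"; spec="${spec#*-}" ;;
    esac
    # awk reads fields as decimal numbers, so values like "08" are safe.
    echo "$spec" | awk -F: -v d="$days" '{ print (d * 24 + $1) * 3600 + $2 * 60 + $3 }'
)

to_seconds 72:00:00      # 72 hours
to_seconds 3-00:00:00    # 3 days, the same limit
```

Both calls print the same number, so --time=72:00:00 and --time=3-00:00:00 request the same limit.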
account¶
account indicates what group this script belongs to.
#SBATCH --account=<GROUP>
array¶
array will create a job array for the script, which creates many jobs (called array tasks) that differ only in their $SLURM_ARRAY_TASK_ID.
#SBATCH --array=<BEGIN-END>
#SBATCH --array=<1-5>
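Since the array tasks differ only in $SLURM_ARRAY_TASK_ID, a common pattern is to use that value to select each task's input. The sketch below assumes input files named sample_1.fa through sample_5.fa, which is a hypothetical naming scheme, and falls back to task 1 when run outside SLURM.

```shell
#!/bin/bash
#SBATCH --array=1-5

# SLURM sets SLURM_ARRAY_TASK_ID separately for each array task.
# Default to 1 so the script can be tested outside a job.
TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per array task.
INPUT="sample_${TASK_ID}.fa"
echo "Array task ${TASK_ID} will process ${INPUT}"
```

Each of the five tasks runs the same script but works on a different file.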
Recommended convenient shell code¶
Host/time/directory information¶
It is recommended to add host, time, and directory information to your job script using hostname, date, and pwd.
hostname;date;pwd
It is also recommended to add an extra call of date to the end of a script.
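Put together, the opening and closing calls bracket the work so the job log shows where and when it ran. The sketch below also prints an elapsed time computed from two date +%s calls. That addition is an assumption for illustration, not part of the recommendation above.

```shell
#!/bin/bash
# Record where and when the job started.
hostname; date; pwd
START=$(date +%s)

# ... the actual analysis commands would go here ...

# Record when the job finished and how long it took.
date
END=$(date +%s)
echo "Elapsed: $((END - START)) seconds"
```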
module load¶
To ensure your script can run correctly, use module load to load the environment modules necessary to access the software needed for the script.
module load ncbi_blast
Full shell script example¶
#!/bin/bash
#SBATCH --job-name=<JOBNAME>
#SBATCH --mail-user=<ACCOUNT>@ufl.edu
#SBATCH --mail-type=END,FAIL
#SBATCH --output=/blue/<ACCOUNT>/job-outs/blastn.out
#SBATCH --error=/blue/<ACCOUNT>/job-outs/blastn.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH --time=2:00:00
hostname;pwd;date
module load ncbi_blast
blastn -db nt -query input.fa -outfmt 6 -out results.xml -num_threads 4
date