Submitting Array Jobs
A job array can be submitted simply by adding
#SBATCH --array=x-y
to the job script, where x and y are the array bounds. A job array can also be specified on the command line with
sbatch --array=x-y job_script.sbatch
A job array will then be created, consisting of independent jobs (a.k.a. array tasks), one for each index in the defined range.
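As a minimal sketch of what such a job script might look like: each array task receives its own index in the SLURM_ARRAY_TASK_ID environment variable, which the script can use to select its work. The job name, the array bounds, and the input_N.dat naming scheme below are assumptions made purely for illustration.

```shell
#!/bin/bash
#SBATCH --job-name=array_demo
#SBATCH --array=1-10

# SLURM sets SLURM_ARRAY_TASK_ID for every array task.
# Fall back to 1 so the script can be dry-run outside of SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per array task.
input_file="input_${task_id}.dat"

echo "Array task ${task_id} would process ${input_file}"
```

Replace the final echo with the actual command for your workload; every task runs the same script and differs only in its task ID.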
SLURM's job array handling is very versatile. Instead of a task range, a comma-separated list of task numbers can be provided. This is useful, for example, for quickly rerunning a few failed tasks from a previously completed job array:
sbatch --array=4,8,15,16,23,42 job_script.sbatch
Command-line options override the options in the script, so the script itself can be left unchanged.
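If the failed task numbers are already collected in a shell array, the comma-separated list can be built rather than typed by hand. The following is only a sketch: the failed_tasks list and the join_by_comma helper are hypothetical, and the sbatch command is printed instead of executed so that it can be checked first.

```shell
#!/bin/bash
# Hypothetical list of task IDs that need to be rerun.
failed_tasks=(4 8 15 16 23 42)

# Join all arguments with commas by setting IFS locally.
join_by_comma() {
    local IFS=,
    echo "$*"
}

array_spec="$(join_by_comma "${failed_tasks[@]}")"

# Print the command for inspection instead of submitting it.
echo "sbatch --array=${array_spec} job_script.sbatch"
```

Once the printed command looks right, it can be run as-is.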
Limiting the number of tasks that run at once
To throttle a job array by keeping only a certain number of tasks active at a time, use the %N suffix, where N is the number of active tasks. For example
#SBATCH -a 1-200%5
will produce a 200-task job array with only 5 tasks active at any given time.
Note that although the symbol used is the % sign, N is not a percentage: it is the actual number of tasks allowed to run at once.
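When choosing a throttle value, it can help to estimate how long the whole array will take. The sketch below does this arithmetic for the 200-task, 5-at-a-time example. It assumes, purely for illustration, that every task takes the same 30 minutes and that free slots are filled immediately, so the result is a rough lower bound rather than a prediction.

```shell
#!/bin/bash
tasks=200            # size of the job array (1-200)
limit=5              # throttle value N from %N
minutes_per_task=30  # hypothetical runtime of a single task

# Ceiling division: number of "waves" of at most $limit tasks.
waves=$(( (tasks + limit - 1) / limit ))

# Rough lower bound on the total wall time, ignoring queue wait.
min_minutes=$(( waves * minutes_per_task ))

echo "${waves} waves, at least ${min_minutes} minutes in total"
```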
Using scontrol to modify throttling of running array jobs
If you want to change the number of simultaneous tasks of an active job, you can use scontrol:
scontrol update ArrayTaskThrottle=<count> JobId=<jobid>
For example:
scontrol update ArrayTaskThrottle=50 JobId=12345
Set ArrayTaskThrottle=0 to eliminate any limit.
Naming output and error files
SLURM uses the %A and %a replacement strings for the master job ID and task ID, respectively.
For example:
#SBATCH --output=Array_test.%A_%a.out
#SBATCH --error=Array_test.%A_%a.error
The error log is optional: if --error is not given, error messages are written to the output log as well. Note: if you use only %A in the log file name, all array tasks will try to write to a single file, and performance will degrade severely as the tasks contend for it. Make sure to use both %A and %a in the log file name specification, for example:
#SBATCH --output=Array_test.%A_%a.log
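Inside a running array task, the values that SLURM substitutes for %A and %a are also available as the environment variables SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID. The sketch below uses a hypothetical helper, array_log_name, to show how a file name of the form above is put together. The fallback values 12345 and 7 are made up so that the snippet also runs outside of SLURM.

```shell
#!/bin/bash
# Hypothetical helper: build the per-task log name from a master
# job ID (what %A expands to) and a task ID (what %a expands to).
array_log_name() {
    local job_id="$1"
    local task_id="$2"
    echo "Array_test.${job_id}_${task_id}.log"
}

# Inside a job, SLURM provides both IDs in the environment;
# the defaults are placeholders for a dry run outside of SLURM.
log_file="$(array_log_name "${SLURM_ARRAY_JOB_ID:-12345}" "${SLURM_ARRAY_TASK_ID:-7}")"

echo "This task logs to ${log_file}"
```

Because the task ID is part of the name, each task gets its own file and no two tasks write to the same log.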