Slurm scrontab
On HiPerGator you can use scrontab (Slurm crontab) to schedule periodically recurring jobs in Slurm. scrontab uses a syntax similar to the traditional Unix/Linux cron utility. See the Slurm scrontab documentation for full details.
scrontab combines the functionality of cron with the resiliency of the batch system. Jobs run on a cluster of nodes, so unlike with regular cron, a single node going down won't keep your scrontab job from running. You can also find and modify your scrontab jobs from any login node.
scrontab jobs are Slurm jobs and must request resources (CPUs, memory, time, etc.), just like any other job.
scrontab jobs are managed by editing the scrontab for each user. You do not use sbatch at all to submit these jobs.
List your current scrontab
You can view your existing tasks (if any) in scrontab with:
scrontab -l
Set up or edit your scrontab
Run scrontab -e to create or edit your scrontab file. The default editor for scrontab is vi, but you can specify your favorite editor. For example, if you prefer to use nano to edit files, run:
EDITOR=nano scrontab -e
You can also set the environment variable EDITOR before launching scrontab -e to change the default editor, for example:
export EDITOR=/usr/bin/nano
In scrontab, lines that start with #SCRON work like the #SBATCH directives for regular Slurm batch jobs. Slurm will ignore #SBATCH directives in scripts you run as scrontab jobs. You can use most of the common sbatch options just as you would on the sbatch command line. The first line after your #SCRON directives specifies the schedule for your job and the command to run.
Note
By default, scrontab jobs start in your home directory. You can change that with the --chdir Slurm option.
scrontab uses the same syntax for date and time specifiers as cron. Each line has five fields with the following meanings:
Field | Allowed values |
---|---|
Minute | 0-59 |
Hour | 0-23 |
Day of month | 1-31 |
Month | 1-12 (or name) |
Day of week | 0-7 (0 and 7 are Sunday, or use name) |
A field can contain an asterisk (*), which matches every allowed value for that field. Ranges are allowed: a range is two numbers separated by a hyphen, and the second number must be greater than the first. Lists are allowed, with commas separating the numbers or ranges. Step values are specified by a slash (/) followed by the step value, so the job runs at that interval within the field; for example, */15 in the minute field means every 15 minutes.
Some common examples:
Time string | When it runs |
---|---|
0 * * * * | Every hour |
0 2 * * * | Every day at 2:00 am |
0 0 * * FRI | Every Friday at midnight |
0 0 1 */2 * | Every other month |
0 0 1 */3 * | Every quarter |
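To check how a field expression with ranges, lists, or steps expands, a small Bash helper can enumerate the values it matches. This is only an illustrative sketch: expand_field is a hypothetical function, not part of Slurm or cron, and it handles plain numeric fields only (no month or weekday names, no leading zeros).

```shell
#!/usr/bin/env bash
# expand_field FIELD MIN MAX
# Print the values matched by one numeric cron field.
# Hypothetical helper for illustration; not part of Slurm or cron.
expand_field() {
    local field=$1 min=$2 max=$3
    local part range step lo hi v
    local -a parts out=()

    # A field is a comma-separated list of parts.
    IFS=',' read -r -a parts <<< "$field"

    for part in "${parts[@]}"; do
        step=1
        range=$part
        # An optional "/N" suffix sets the step value.
        if [[ $part == */* ]]; then
            step=${part##*/}
            range=${part%%/*}
        fi
        if [[ $range == '*' ]]; then
            lo=$min; hi=$max            # asterisk: every allowed value
        elif [[ $range == *-* ]]; then
            lo=${range%-*}; hi=${range#*-}   # range: first-last
        else
            lo=$range; hi=$range        # single number
        fi
        for ((v = lo; v <= hi; v += step)); do
            out+=("$v")
        done
    done
    echo "${out[*]}"
}
```

For example, expand_field '*/3' 1 12 prints 1 4 7 10, the months matched by the month field of 0 0 1 */3 *, and expand_field '0,30' 0 59 prints 0 30.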
Especially at first, you may find it easiest to generate your cron date fields with a helper application such as crontab-generator or cronhub.io.
You can also use the shorthand syntax @hourly, @daily, @weekly, etc. instead of the five separate fields. These shortcuts include the following:
Shortcut | Interval |
---|---|
@yearly or @annually | Job will become eligible at 00:00 Jan 01 each year. |
@monthly | Job will become eligible at 00:00 on the first day of each month. |
@weekly | Job will become eligible at 00:00 Sunday of each week. |
@daily or @midnight | Job will become eligible at 00:00 each day. |
@hourly | Job will become eligible at the first minute of each hour. |
@elevenses | Job will become eligible at 11:00 each day. (This is a non-standard extension.) |
@fika | Job will become eligible at 15:00 each day. (This is a non-standard extension.) |
@teatime | Job will become eligible at 16:00 each day. (This is a non-standard extension.) |
Some notes
- All recurrences of a scrontab job use the same jobID. This can result in overwriting the job's --output file if %j is used in its name, for example. This can be avoided by using a timestamp such as:
  #SCRON --output myjob_$(date +%Y%m%d%H%M).out
- If you run a script from scrontab, it must be executable (e.g. chmod u+x my_script.sh).
- Jobs handled by scrontab do not run in a full login shell, so if you need customizations from your .bashrc file, add the following to your script to ensure that your environment is set up correctly:
  source ~/.bashrc
- To see Slurm accounting for a job handled by scrontab, use sacct. For example, for job 12345 run:
  sacct --duplicates --jobs 12345
  or, with short options:
  sacct -Dj 12345
- If Slurm detects an error in your configuration when you save the file, it will let you know and mark the problematic lines with #BAD. You will need to correct the error before the file can be saved and the scrontab updated.
- If Slurm resources are not available when your job is scheduled to run, your job may be delayed. There is no guarantee that your job will start at the specified time.
The command you specify in the scrontab is executed via bash, NOT sbatch
Since the commands in a scrontab job are executed by the bash shell, you can list multiple commands separated by a semicolon (;) and use other shell features, such as redirects.
Any #SBATCH directives in executed scripts will be ignored, as the # starts a comment in a shell.
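Because every recurrence shares one jobID and the command runs under bash, one option is to handle logging inside the script itself rather than relying on %j. The following is a minimal sketch under that assumption; timestamped_log and run_logged are hypothetical helper names, and the log directory and prefix are placeholders you would choose yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a script started from scrontab.
# Nothing here is provided by Slurm; names and paths are illustrative.

# Print a log path with a minute-resolution timestamp,
# e.g. <dir>/<prefix>_YYYYmmddHHMM.out
timestamped_log() {
    local dir=$1 prefix=$2
    printf '%s/%s_%s.out\n' "$dir" "$prefix" "$(date +%Y%m%d%H%M)"
}

# run_logged DIR PREFIX COMMAND [ARGS...]
# Run COMMAND with stdout and stderr appended to a timestamped log in DIR.
run_logged() {
    local dir=$1 prefix=$2
    shift 2
    mkdir -p "$dir"
    local log
    log=$(timestamped_log "$dir" "$prefix")
    "$@" >> "$log" 2>&1
}
```

A script started from scrontab could then run source ~/.bashrc followed by run_logged "$HOME/scron/logs" myjob ./mytest.sh, so that runs started in different minutes write to separate files.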
scrontab examples
Note that unlike sbatch scripts, scrontab files do not include #!/bin/sh lines.
This example submits a 6-hour test job, requesting four cores, eligible to start every day at 12:00 AM:
#SCRON --time 6:00:00
#SCRON --cpus-per-task 4
#SCRON --job-name "daily_test"
#SCRON --chdir /home/myusername/test
#SCRON -o myoutput/%j-out.txt
@daily ./mytest.sh
The following example runs a test script eligible to start every Wednesday at 8:00 PM. The job will have one core and can run for an hour. (Note that the log file will be overwritten each time because the output file name uses %j, and all runs use the same jobID):
#SCRON --time 1:00:00
#SCRON --chdir /home/myusername/test
#SCRON -o test_log_%j.txt
0 20 * * 3 ./mytest.sh
The example below checks every hour whether an instance of the test job is running and, if not, starts it. The --dependency=singleton option in the scrontab prevents multiple instances of the same job from running at once:
#SCRON --qos=mygroup
#SCRON --account=myaccount
#SCRON --time=30-00:00:00
#SCRON --dependency=singleton
#SCRON --job-name=mytest
0 * * * * ./mytest.sh
Multiple tasks
If you have multiple tasks to run that use the same resource requests, you can simply add multiple lines with their schedule as in a normal crontab entry.
If each task needs different resources (time, CPUs, memory, etc.), you can add additional blocks of #SCRON resource requests one after another in the scrontab.
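As an illustration, a scrontab with two entries that request different resources might look like the sketch below. The script names, job names, schedules, and resource values are placeholders chosen for the example, not HiPerGator recommendations; each entry carries its own block of #SCRON lines directly above its schedule line.

```
# Hourly lightweight check: 1 core, 10 minutes (placeholder values)
#SCRON --job-name=hourly_check
#SCRON --time=00:10:00
#SCRON --cpus-per-task=1
#SCRON --mem=1G
0 * * * * ./hourly_check.sh

# Weekly heavier report: 4 cores, 2 hours, Mondays at 6:00 am (placeholder values)
#SCRON --job-name=weekly_report
#SCRON --time=02:00:00
#SCRON --cpus-per-task=4
#SCRON --mem=8G
0 6 * * 1 ./weekly_report.sh
```

Giving each entry a distinct job name also makes the two tasks easier to tell apart in squeue and sacct output.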
Monitor your scrontab jobs
You can monitor your scrontab jobs with:
squeue --me -q cron -O JobID,EligibleTime
This will show the next time the batch system will run your job. If the scrontab job is set to repeat, the system will automatically reschedule the next job. Additionally, if you modify your scrontab job, Slurm will automatically cancel the old job and resubmit a new one.
Email notifications
There are two kinds of email notifications you can configure: Slurm job emails (BEGIN, END, FAIL, etc.) and emails sent from within scripts.
Slurm job emails
Slurm emails can be enabled in the same way as for other Slurm jobs (e.g. with --mail-user and --mail-type).
However, if the job runs frequently, this can generate a large number of messages, which may cause problems with some email providers. To reduce notifications for frequently running jobs, restrict the condition under which email is sent (e.g. #SCRON --mail-type=FAIL).
Email from a job script
The email tool msmtp is installed on the HiPerGator compute nodes. This tool can be used to send emails from scripts. Again, it is important to consider rate limiting.
As an example, in a bash script, define the variables mail_from, mail_to, subject, and body. Also set mail_host='smtp.ufhpc' and mail_port='25' (suggested values).
Then, to send an email:
printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' "$mail_from" "$mail_to" "$subject" "$body" \
| msmtp --host="$mail_host" --port="$mail_port" --from="$mail_from" "$mail_to"
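One way to apply rate limiting from within a script is to record when the last message was sent and skip sending if that was too recent. The sketch below assumes a writable stamp file; should_notify, the stamp file path, and the interval are hypothetical names and values, not features of msmtp or Slurm.

```shell
#!/usr/bin/env bash
# should_notify STAMP_FILE MIN_INTERVAL_SECONDS
# Succeed (and record the current time) only if no notification was
# recorded in STAMP_FILE within the last MIN_INTERVAL_SECONDS.
# Hypothetical helper for illustration.
should_notify() {
    local stamp=$1 min_interval=$2
    local now last
    now=$(date +%s)
    if [[ -f $stamp ]]; then
        last=$(<"$stamp")
        if (( now - last < min_interval )); then
            return 1    # too soon since the last message; skip this one
        fi
    fi
    echo "$now" > "$stamp"
    return 0
}

# Example use (commented out), at most one message per hour:
# if should_notify "$HOME/scron/last_mail" 3600; then
#     ... send the message with msmtp here ...
# fi
```

Keeping the stamp file in a location shared by all nodes, such as your home directory, means the limit holds no matter which compute node a given run lands on.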
Monitor runs with a log file
A simpler option is to set up a job log file and write the information you want to it, using date or other timestamps. For example, to log every time your script runs, you can append a line to a log file:
echo "scrontab ran at timestamp: `date`" >> $HOME/scron/logs/testlogs.txt
Cancel a scrontab job
To remove a scrontab job from your running jobs, edit the scrontab file with scrontab -e and comment out all the lines associated with the entry.
Using the scancel command to remove a job started with scrontab will fail with an error like:
$ scancel 12345
scancel: error: Kill job error on job id 12345: Cannot scancel a scrontab job
without the --hurry flag, or modify scrontab jobs through scontrol
If you cancel a scrontab job with the --hurry flag, the entry in the scrontab file will be prepended with #DISABLED. These comments will need to be removed before the job can start again.