Slurm scrontab
On HiPerGator you can use scrontab (Slurm Crontab) to schedule periodically recurring jobs in Slurm.
scrontab uses a syntax similar to the traditional Unix/Linux cron utilities. See the Slurm
scrontab documentation for full details.
scrontab combines the functionality of cron with the resiliency of the batch
system. Jobs are run on a cluster of nodes, so unlike with regular cron, a
single node going down won't keep your scrontab job from running. You can also
find and modify your scrontab jobs on any login node.
scrontab jobs are Slurm jobs and must request resources (CPUs, memory, time, etc.), just like any job.
scrontab jobs are managed by editing the scrontab for each user. You do not use sbatch at all to submit these jobs.
List your current scrontab
You can view your existing tasks (if any) in scrontab with:
scrontab -l
Set up or edit your scrontab
Run scrontab -e to create or edit your scrontab file. The default editor for
scrontab is vi, but you can specify your favorite editor. For example, if you
prefer to use nano to edit files, run:
EDITOR=nano scrontab -e
You can also define the environment variable EDITOR to change the default
editor prior to launching scrontab -e, for example:
export EDITOR=/usr/bin/nano
In scrontab the lines that start with #SCRON work like the #SBATCH directives for regular Slurm batch jobs.
Slurm will ignore #SBATCH directives in scripts you run as scrontab jobs. You
can use most of the common sbatch options just as you would using sbatch on the
command line. The first line after your #SCRON directives specifies the schedule for your job and the command to run.
Note
By default, scrontab jobs will start in the home directory. You can change that using the --chdir Slurm option.
scrontab uses the same syntax for date and time specifiers as cron. Each line has five fields that have the following meanings:
| Field | Allowed values |
|---|---|
| Minute | 0-59 |
| Hour | 0-23 |
| Day of month | 1-31 |
| Month | 1-12 (or name) |
| Day of week | 0-7 (0 and 7 are Sunday, or use name) |
A field can contain an asterisk (*), which matches every allowed value for that field. Ranges are allowed: a range is two numbers separated by a hyphen, and the second number must be greater than the first. Lists are allowed: numbers or ranges separated by commas. Step values are written as a slash (/) followed by the step value, so the job runs at that interval within the field. For example, */15 in the minute field means every 15 minutes.
Some common examples:
| Time string | When it runs |
|---|---|
| 0 * * * * | Every hour |
| 0 2 * * * | Every day at 2:00 am |
| 0 0 * * FRI | Every Friday at midnight |
| 0 0 1 */2 * | Every other month |
| 0 0 1 */3 * | Every quarter |
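To make the asterisk, range, list, and step rules described above concrete, here is a minimal Bash sketch that expands one numeric field into the values it matches. The function name expand_field and its arguments are hypothetical and exist only for illustration; this is not part of Slurm or cron, and it handles numbers only (no month or weekday names).

```shell
#!/bin/bash
# Hypothetical helper: expand_field FIELD MIN MAX
# Prints the values matched by one numeric cron-style field.
expand_field() {
    local field=$1 min=$2 max=$3
    local part range step start end v
    local -a parts out=()

    # Split a list such as "0,30" or "1-5,10" on commas (read does not glob "*").
    IFS=',' read -ra parts <<< "$field"

    for part in "${parts[@]}"; do
        step=1
        range=$part
        # A step such as "*/15" or "10-20/5" is written after a slash.
        if [[ $part == */* ]]; then
            step=${part##*/}
            range=${part%%/*}
        fi
        if [[ $range == "*" ]]; then
            # An asterisk covers every allowed value of the field.
            start=$min; end=$max
        elif [[ $range == *-* ]]; then
            # A range is two numbers separated by a hyphen.
            start=${range%%-*}; end=${range##*-}
        else
            # A single number matches only itself.
            start=$range; end=$range
        fi
        for ((v = start; v <= end; v += step)); do
            out+=("$v")
        done
    done
    echo "${out[*]}"
}

expand_field '*/15' 0 59   # minute field, prints: 0 15 30 45
expand_field '*/3' 1 12    # month field, prints: 1 4 7 10
expand_field '1-5' 0 7     # day-of-week field, prints: 1 2 3 4 5
```

The second call shows why 0 0 1 */3 * in the table above runs once per quarter: the month field matches months 1, 4, 7, and 10.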
Especially at first you may find it easiest to use a helper application to generate your cron date fields, such as crontab-generator or cronhub.io.
You can also use the short-hand syntax @hourly, @daily, @weekly, etc. instead of the five separate columns. These shortcuts include the following:
| Shortcut | Interval |
|---|---|
| @yearly or @annually | Job will become eligible at 00:00 Jan 01 each year. |
| @monthly | Job will become eligible at 00:00 on the first day of each month. |
| @weekly | Job will become eligible at 00:00 Sunday of each week. |
| @daily or @midnight | Job will become eligible at 00:00 each day. |
| @hourly | Job will become eligible at the first minute of each hour. |
| @elevenses | Job will become eligible at 11:00 each day. (This is a non-standard extension.) |
| @fika | Job will become eligible at 15:00 each day. (This is a non-standard extension.) |
| @teatime | Job will become eligible at 16:00 each day. (This is a non-standard extension.) |
Some notes
- All recurrences of the scrontab job use the same jobID. This can result in overwriting the job --output if %j is used, for example. This can be avoided using a timestamp like: #SCRON --output myjob_$(date +%Y%m%d%H%M).out
- If you are running a script in scrontab, its permissions must be set to be executable (e.g. chmod u+x my_script.sh).
- Jobs handled by scrontab do not run in a full login shell, so if customizations in your .bashrc file are needed, add source ~/.bashrc to your script to ensure that your environment is set up correctly.
- To see Slurm accounting of a job handled by scrontab, use sacct. For example, for job 12345 run sacct --duplicates --jobs 12345 or, with short arguments, sacct -Dj 12345
- If Slurm detects an error in your configuration when saving the file, it will let you know and mark the problematic lines with #BAD. You will need to correct the error before the file can be saved and the scrontab updated.
- If Slurm resources are not available when your job is scheduled to run, your job may be delayed. There is no guarantee that your job will start at the specified time.
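Because every recurrence reuses the same job ID, another way to keep per-run output is to build the file name inside the script itself. The sketch below shows one possible approach; the function name timestamped_name and the prefix myjob are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build a per-run file name such as myjob_202501310200.out
# (year, month, day, hour, minute), so repeated runs do not overwrite each other.
timestamped_name() {
    local prefix=$1
    printf '%s_%s.out\n' "$prefix" "$(date +%Y%m%d%H%M)"
}

# Print the name that a run starting now would use.
timestamped_name myjob
```

Inside a script started from scrontab, the name could then be used for redirection, for example ./my_program > "$(timestamped_name myjob)" 2>&1, where my_program is a placeholder. Two runs that start within the same minute would still share a file name.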
The command you specify in the scrontab is executed via bash, NOT sbatch
Since the commands in a scrontab job are executed by the bash shell you can list
multiple commands separated by a semicolon (;), and use other shell features,
such as redirects.
Any #SBATCH directives in executed scripts will be ignored
as the # starts a 'comment' in a shell.
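As an illustration of these shell features, a single scrontab entry can chain commands and redirect their output. This is only a sketch: the script name, log path, and time limit are placeholders.

```shell
# Every day at 2:00 am: run the script, append its stdout and stderr to a log,
# then append the completion time to the same log.
#SCRON --time=0:10:00
0 2 * * * ./mytest.sh >> $HOME/mytest.log 2>&1; date >> $HOME/mytest.log
```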
scrontab examples
Note that unlike sbatch scripts, scrontab files do not include #!/bin/sh lines.
This example submits a 6-hour test job, requesting four cores, eligible to start every day at 12:00 AM:
#SCRON --time 6:00:00
#SCRON --cpus-per-task 4
#SCRON --name "daily_test"
#SCRON --chdir /home/myusername/test
#SCRON -o myoutput/%j-out.txt
@daily ./mytest.sh
The following example runs a test script eligible to start every Wednesday at 8:00 PM. The job will have one core and can run for an hour. (Note that the log file will be overwritten each time as it uses the %j naming for the output file, and all jobs use the same jobID):
#SCRON --time 1:00:00
#SCRON --chdir /home/myusername/test
#SCRON -o test_log_%j.txt
0 20 * * 3 ./mytest.sh
The example below checks every hour whether an instance of the test job is
running and, if not, starts one. The "--dependency=singleton" option in the
scrontab prevents multiple instances of the same job from running at the same
time:
#SCRON --qos=mygroup
#SCRON --account=myaccount
#SCRON --time=30-00:00:00
#SCRON --dependency=singleton
#SCRON --name=mytest
0 * * * * ./mytest.sh
Multiple tasks
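If you have multiple tasks to run, add one schedule line per task, as in a normal crontab. According to the Slurm scrontab documentation, #SCRON options apply only to the single entry that follows them and are reset afterward, so repeat the #SCRON lines before each entry even when the tasks use the same resource requests.
If each task needs different resources (time, CPUs, memory, etc.), add a separate block of #SCRON resource requests before each entry, one block after another in the scrontab.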
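A scrontab with two tasks that need different resources might look like the sketch below. The script names, job names, and resource values are hypothetical placeholders; adjust them, and add any account or QOS options your group requires.

```shell
# Hypothetical task 1: a short single-core job, eligible every hour
#SCRON --time=0:30:00
#SCRON --cpus-per-task=1
#SCRON --job-name=hourly_check
0 * * * * ./hourly_check.sh

# Hypothetical task 2: a longer four-core job, eligible every Sunday at 00:00
#SCRON --time=6:00:00
#SCRON --cpus-per-task=4
#SCRON --mem=8G
#SCRON --job-name=weekly_report
@weekly ./weekly_report.sh
```

Each block of #SCRON lines is placed directly before the entry it describes.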
Monitor your scrontab jobs
You can monitor your scrontab jobs with:
squeue --me -q cron -O JobID,EligibleTime
This shows the next time the batch system will run your job. If the
scrontab job is set to repeat, the system will automatically reschedule the
next job. Additionally, if you modify your scrontab job, Slurm will
automatically cancel the old job and resubmit a new one.
Email notifications
There are two kinds of email notifications you can configure: Slurm job emails (BEGIN, END, FAIL, etc), and emails sent from within scripts.
Slurm job emails
Slurm emails can be enabled in the same way as for other Slurm jobs (e.g. with --mail-user and --mail-type).
However, if the job runs frequently, this can generate a large number of messages, which some email providers may throttle or reject. To reduce email notifications for frequently running jobs, restrict the conditions under which email is sent (e.g. #SCRON --mail-type=FAIL).
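For instance, an hourly entry that only sends mail when a run fails could be sketched as follows. The address, script name, and time limit are placeholders.

```shell
# Hypothetical hourly job that sends email only on failure
#SCRON --time=0:10:00
#SCRON --mail-type=FAIL
#SCRON --mail-user=your@email.address
@hourly ./mytest.sh
```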
Email from a job script
The email tool msmtp is installed on the HiPerGator compute nodes. This tool can be used to send emails from scripts. It is important to consider rate limiting (keep it simple, no bulk emailing, etc.).
As an example, in a bash script, define the variables mail_from, mail_to, subject, and body.
Also set mail_host='smtp.ufhpc' and mail_port='25', which are the suggested values.
Then, to send an email:
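#!/bin/bash
# Mail server settings (suggested values)
mail_host="smtp.ufhpc"
mail_port="25"
# Sender and recipient addresses (replace with your own)
mail_from="your@email.address"
mail_to="recipient@email.address"
subject="Test Run Scron - SUCCESS!"
# Use the output of the blue_quota command as the message body
body="$(blue_quota)"
# Build the headers and body, then pipe the message to msmtp
printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' \
"$mail_from" "$mail_to" "$subject" "$body" \
| msmtp --host="$mail_host" --port="$mail_port" --from="$mail_from" "$mail_to"
echo "This is a test."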
Monitor runs with a log file
A simpler option is to set up a job log file and write the information you want to it, using date or another timestamp. For example, to log every time your script runs, append a line to a log file:
echo "scrontab ran at timestamp: $(date)" >> "$HOME/scron/logs/testlogs.txt"
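Appending with >> fails if the log directory does not exist yet, so a script may want to create it first. The sketch below wraps that in a small function; the name log_run and the SCRON_LOG_DIR variable are hypothetical and are not Slurm settings.

```shell
#!/bin/bash
# Hypothetical helper: append one timestamped line to a run log.
# SCRON_LOG_DIR is an illustrative override; the default matches the path above.
log_run() {
    local log_dir=${SCRON_LOG_DIR:-$HOME/scron/logs}
    # Create the directory on first use so the append below cannot fail for that reason.
    mkdir -p "$log_dir" || return 1
    printf '%s scrontab run: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" \
        >> "$log_dir/testlogs.txt"
}
```

A script could call log_run "started" at the top and log_run "finished" at the end, which makes runs that never completed easy to spot in the log.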
Cancel a scrontab job
To remove a scrontab job from your running jobs, edit the scrontab file
with scrontab -e and comment out all the lines associated with the entry.
Using the scancel command to remove a job started with scrontab
gives an error like:
$ scancel 12345
scancel: error: Kill job error on job id 12345: Cannot scancel a scrontab job
without the --hurry flag, or modify scrontab jobs through scontrol
Canceling a scrontab job with the "--hurry" flag prepends #DISABLED to the
entry in the scrontab file. These comments must be removed
before the job can start again.