R¶
Description¶
R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering) and graphical techniques. It is highly extensible. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed.

Note: This module's environment is compatible with rstudio/1.1.419, so personal packages installed under either module will work with the other. The default installation directory is ~/R/x86_64-pc-linux-gnu-library/3.4/. The module system sets up the environment variables listed below for this module.
Environment Modules¶
Run module spider R
to find out what environment modules are available for this application.
Environment Variables¶
- HPC_R_DIR - installation directory
- HPC_R_BIN - executable directory
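
Once an R module is loaded, both variables can be inspected from the shell. The following is a minimal sketch, not an official recipe: the helper variables r_dir and r_bin and the "unset" placeholder are illustrative names that do not come from the module itself.

```shell
# Read the variables set by the R module.
# "unset" is a placeholder shown when no R module has been loaded.
r_dir="${HPC_R_DIR:-unset}"
r_bin="${HPC_R_BIN:-unset}"

echo "HPC_R_DIR=${r_dir}"   # installation directory
echo "HPC_R_BIN=${r_bin}"   # executable directory
```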
Additional Usage Information¶
R can be run on the command line (or through the batch system) with either the Rscript myscript.R
or the R CMD BATCH myscript.R
command. For script development or visualization, the RStudio GUI
application can be used; see the Open OnDemand documentation for
details. Alternatively, an instance of RStudio_Server can be started in a job,
and you can then connect to it through an SSH tunnel from a web browser on your local computer.
Notes and Warnings¶
- The parallel::detectCores() function returns the total number of cores on a compute node, not the number of cores assigned to your job by the scheduler. Instead, use something like
  numCores = as.integer(Sys.getenv("SLURM_CPUS_ON_NODE"))
  to find out the number of CPU cores 'X' requested in your job script with #SBATCH --cpus-per-task=X.
- Default RData format: In R-3.6.0 the default serialization format used to save RData files was changed to version 3 (RDX3), so R versions prior to 3.5.0 will not be able to open them. Keep this in mind if you copy RData files from HiPerGator to an external system with an older R installed.
- Java: rJava users need to load the java module manually with
  module load java/1.7.0_79
  Use the correct java module version for your case.
- TMPDIR: If temporary files are produced, they may fill up the memory disks on HPG2 nodes and cause node and job failures. To prevent this, use something like
  mkdir -p tmp
  export TMPDIR=$(pwd)/tmp
  in your job script, and launch your job from the respective directory and not from your home directory.
- For users of PHI and FERPA data: It is particularly important to set your working and TMPDIR directories to be in your project's PHI/FERPA configured directory in /blue when working with R. Writing files to $HOME or to the default $TMPDIR could expose restricted data to unauthorized users.
- Tasks vs Cores for parallel runs: Parallel threads in an R job will be bound to the same CPU core even if multiple ntasks are specified in the job script. Use cpus-per-task to use the R 'parallel' package correctly. For example, for an 8-thread parallel job use the following resource request in your job script:
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=8
- See the single-threaded and multi-threaded examples in the Sample SLURM Scripts page for more details.
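
The notes above can be combined into one job. The sketch below writes a small R script and a matching job script to the current directory. It is an illustration under stated assumptions, not a site-provided template: the file names myscript.R and r_job.sh, the job name, the toy computation, the fallback to one core, and the plain module load R line are all hypothetical choices, and time and memory requests are left out.

```shell
#!/bin/bash
# Sketch: generate an R script and a SLURM job script that follow the notes above.
# File names, the job name, and the toy computation are hypothetical.

cat > myscript.R <<'EOF'
# Use the cores SLURM assigned to the job rather than parallel::detectCores().
# Falls back to 1 core when the variable is unset (for example, outside a job).
numCores <- as.integer(Sys.getenv("SLURM_CPUS_ON_NODE", unset = "1"))
squares <- parallel::mclapply(1:8, function(i) i^2, mc.cores = numCores)
print(unlist(squares))
EOF

cat > r_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=r_parallel
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# Keep temporary files in the launch directory, not on the node's memory disk.
mkdir -p tmp
export TMPDIR=$(pwd)/tmp

# Assumed module name; check "module spider R" for the available versions.
module load R
Rscript myscript.R
EOF

# Submit from the directory that should hold tmp/ (not your home directory):
# sbatch r_job.sh
```

The quoted heredoc delimiter ('EOF') keeps $(pwd) unexpanded, so the job script resolves it in the directory the job is launched from.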
Categories¶
statistics