GPU Access¶
GPUs Per Group¶
Normalized Graphics Processor Units (NGUs) include all of the infrastructure (memory, network, rack space, cooling) necessary for GPU-accelerated computation. Each NGU is currently equivalent to 1 GPU; however, newer GPUs such as the B200s may require more than 1 NGU to access in the future.
To use GPU resources, HiPerGator groups need an active NGU investment. To check whether your group has GPUs allocated and available, load the "ufrc" module and run the command slurmInfo -g group_name
Researchers can add NGUs to their allocations by filling out the Purchase Form or requesting a Trial Allocation.
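As a rough sketch of the allocation check described above, the loop below runs slurmInfo -g for every group the current account belongs to. It assumes that the Unix group names reported by id match HiPerGator group names, which may not hold for every account. The variable checked_groups is only illustrative, and the module and slurmInfo calls are skipped where those commands are not available.

```shell
# Sketch: report GPU allocations for each of the current user's groups.
# Assumption: Unix group names from `id -Gn` correspond to HiPerGator groups.

# Load the "ufrc" module if the module command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load ufrc || echo "could not load the ufrc module" >&2
fi

checked_groups=0
for group_name in $(id -Gn); do
    if command -v slurmInfo >/dev/null 2>&1; then
        slurmInfo -g "$group_name" || echo "slurmInfo failed for $group_name" >&2
    else
        echo "slurmInfo not found; would run: slurmInfo -g $group_name"
    fi
    checked_groups=$((checked_groups + 1))
done
```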
Open On Demand Access¶
Note
Interactive Open OnDemand Jobs in the GPU partition are limited to 12 hrs. Computational GPU jobs are limited to 14 days. Each GPU job requires at least one CPU core.
To access GPUs using Open OnDemand, you need to set the partition and a GRES (generic resource) with the number and (optionally) type of GPU.
GPU partitions:
hpg-ai
: The remaining A100 GPUs. These will be retired on Jun 24, 2025.
hpg-b200
: The 2025 NVIDIA DGX B200 SuperPod.
gpu
: Currently the 2080 Ti GPUs.
hwgui
: Hardware accelerated GPU partition for visualization applications.
GRES string format:
- To request one GPU: gpu:1
- To request multiple GPUs (where n is the number of GPUs you need): gpu:n
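The GRES string can also be assembled programmatically. The helper below is a minimal sketch: gres_string is a hypothetical function name, and the typed form gpu:type:n follows Slurm's general name:type:count GRES syntax, with the type taken from the "GRES GPU type" values used on this page.

```shell
# gres_string COUNT [TYPE]: print a GRES string for COUNT GPUs, optionally of TYPE.
# gres_string is a hypothetical helper, not a HiPerGator or Slurm command.
gres_string() {
    count="$1"
    type="${2:-}"
    # Reject anything that is not a positive integer without a leading zero.
    case "$count" in
        ''|*[!0-9]*|0*)
            echo "gres_string: GPU count must be a positive integer" >&2
            return 1
            ;;
    esac
    if [ -n "$type" ]; then
        echo "gpu:${type}:${count}"
    else
        echo "gpu:${count}"
    fi
}

gres_string 1          # prints gpu:1
gres_string 4 b200     # prints gpu:b200:4
```

The printed value is what goes in the GRES field of an Open OnDemand form or after --gres= on the command line.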
GPU Hardware Specifications¶
We have the following types of NVIDIA GPU nodes available:
GPU Specs | GeForce 2080Ti | GeForce 2080Ti | Quadro RTX 6000 SLI | NVIDIA A100 | NVIDIA B200 |
---|---|---|---|---|---|
Host Quantity | 32 | 38 | 6 | 70 | 31 |
Host Architecture | Intel Skylake | Intel Cascade Lake | Intel Cascade Lake | AMD EPYC ROME | Intel Xeon 8570 |
Host Memory | 187 GB | 187 GB | 187 GB | 2 TB | 2 TB |
Host Interconnect | EDR IB | EDR IB | EDR IB | HDR IB | ConnectX-7 |
CPUs per Host | 32 | 32 | 32 | 128 | 112 |
CPUs per Socket | 16 | 16 | 16 | 16 | 56 |
GPUs per Host | 8 | 8 | 8 | 8 | 8 |
CPUs per GPU | 4 | 4 | 4 | 16 | 14 |
Memory per GPU | 11 GB | 11 GB | 23 GB | 80 GB | 180 GB |
Slurm partition | gpu | gpu | gpu | gpu or hpg-ai | hpg-b200 |
Slurm Feature | 2080ti | 2080ti | rtx6000 | a100 | b200 |
GRES GPU type | geforce | geforce | quadro | a100 | b200 |
Technical Ref | Specifications |
For a list of additional node features, see the Available Node Features page.
To select a specific type of GPU within a partition, use either a SLURM constraint (e.g. --constraint=rtx6000) or a GRES that names the needed GPU type (e.g. --gres=gpu:a100:1 or --gpus=a100:1).
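A batch script that requests a specific GPU type might look like the sketch below. The partition and GRES type come from the tables on this page; the job name, memory request, and run time are hypothetical placeholders, and the time request must stay within the GPU job limit noted earlier. The script only reports which GPUs the job can see.

```shell
#!/bin/bash
#SBATCH --job-name=gpu_example     # hypothetical job name
#SBATCH --partition=hpg-b200       # GPU partition from the list on this page
#SBATCH --gpus=b200:1              # one GPU of GRES type b200
#SBATCH --cpus-per-task=1          # each GPU job needs at least one CPU core
#SBATCH --mem=8G                   # hypothetical memory request
#SBATCH --time=01:00:00            # hypothetical run time

# Slurm normally sets CUDA_VISIBLE_DEVICES for jobs that were granted GPUs;
# fall back to "none" when the variable is unset (e.g. outside a GPU job).
visible_gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPUs visible to this job: $visible_gpus"

# Show driver and device details when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi could not query a GPU" >&2
fi
```

To pick hardware by node feature instead, the --gpus line could be replaced with a plain GPU count plus a --constraint line, for example --constraint=rtx6000 in the gpu partition.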
Compiling CUDA Enabled Programs¶
The most direct way to develop a custom GPU-accelerated algorithm is CUDA programming; please refer to the Nvidia CUDA Toolkit page. The current CUDA environment is cuda/12. Alternatively, C++ or the Python packages numba and PyCUDA can be used to program GPU algorithms.
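As an illustration of the CUDA route, the sketch below writes a small vector-addition program and compiles it with nvcc. The file name vector_add.cu and the kernel are hypothetical examples, and the module name cuda/12 is assumed from the environment named above. Compilation is skipped where nvcc is not on the PATH.

```shell
# Load the CUDA environment named in the text, when modules are available.
if command -v module >/dev/null 2>&1; then
    module load cuda/12 || echo "could not load cuda/12" >&2
fi

# Write a minimal CUDA source file (vector_add.cu is a hypothetical name).
cat > vector_add.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float *a, *b, *c;
    // Unified memory keeps the example short.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }
    add<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();
    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
EOF

# Compile only where nvcc is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o vector_add vector_add.cu
else
    echo "nvcc not found; skipping compilation"
fi
```

The resulting vector_add binary needs to run inside a job or session that has a GPU allocated.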
Conda Environments with GPU¶
To make sure your code will run on GPUs, install a recent cudatoolkit package that works with the NVIDIA drivers on HPG (currently 12.x, but older versions are still supported) alongside the pytorch or tensorflow package(s). See the RC-provided tensorflow or pytorch installs for examples if needed. Mamba can detect whether there is a GPU in the environment, so the easiest approach is to run the mamba install command in a GPU session.
You can also visit Conda for more information.
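Before running an install, it can help to confirm that the current session actually has a GPU. The check below is a sketch: gpu_session_check is a hypothetical helper name, and it relies only on nvidia-smi -L, which lists the GPUs visible to the session.

```shell
# gpu_session_check: print "gpu" when nvidia-smi can list at least one GPU,
# and "no-gpu" otherwise. This is a hypothetical helper, not an HPG command.
gpu_session_check() {
    if command -v nvidia-smi >/dev/null 2>&1 \
        && nvidia-smi -L 2>/dev/null | grep -q '^GPU '; then
        echo "gpu"
    else
        echo "no-gpu"
    fi
}

if [ "$(gpu_session_check)" = "gpu" ]; then
    echo "GPU detected: run the mamba install command in this session."
else
    echo "No GPU detected: start a GPU session before installing."
fi
```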
Slurm and GPU Use¶
View instructions for using GPUs and scheduling GPU jobs with SLURM at Slurm and GPU Use.
Hardware Accelerated GUI¶
GPUs in the hardware-accelerated GUI servers are used to accelerate rendering for graphical applications. These servers are in the SLURM "hwgui" partition.
There are several preset applications available in the Open OnDemand drop-down list (e.g. Freeview, Unreal Engine). You can run additional GUI applications by starting a Console or HiPerGator Desktop session, loading the application module and running the application.
To do this:
- Select the 'hwgui' partition for an Open OnDemand Console or HiPerGator Desktop Application. See Open OnDemand for details on using OOD.
- Once connected to the session, open a terminal, load the appropriate environment module, and launch the application.