
Storage

UF Research Computing maintains several shared storage systems that are intended for different user activities. A summary of the filesystems and their use can be found in the RC storage policy. Here we discuss practical use of the filesystems on HiPerGator. See our FAQ to see what to do when you run out of storage space.

Home Storage

Quick notes on Home storage

  • Quota in home directory is 40GB.
  • Do not use /home for job input and output (reading or writing files).
  • There is one week of daily snapshots maintained at ~/.snapshot/.
  • Check your quota with the home_quota command.

Your home directory is the first directory you see when you log into HiPerGator. It is always found at ~, /home/$USER, or $HOME; these shell variables can be used in scripts.

The home directories are the smallest storage areas available to our users. They contain files important for setting up your shell environment and secure shell connections. Do not remove the ~/.bashrc or ~/.bash_profile files or the ~/.ssh directory; without them you will have problems using your HiPerGator account. If you do run into issues, open a support request and we can reset the files to standard versions.

The first rule of using the home directory is to not use it for reading or writing data files in any analyses run on HiPerGator. It is permissible to keep software builds, conda environments, text documents, and valuable scripts in $HOME as it is somewhat protected by daily snapshots.
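The daily snapshots make it possible to recover a recently deleted file yourself. A minimal sketch, in which the snapshot name and filename are placeholders and the guard lets it run on systems without ~/.snapshot:

```shell
#!/bin/bash
# Sketch: recovering a deleted file from a daily home snapshot.
# Snapshot directory names vary; list them first and pick one.
if [ -d ~/.snapshot ]; then
    ls ~/.snapshot                            # pick a snapshot to restore from
    # 'SNAPNAME' and 'myscript.sh' are placeholders
    cp ~/.snapshot/SNAPNAME/myscript.sh ~/myscript.sh
else
    echo "no ~/.snapshot here (Home filesystem feature only)"
fi
```

Snapshots are read-only, so copying the file back out is the only operation needed.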

Blue Storage

Quick notes on Blue storage

  • Blue is our main high-performance parallel filesystem
  • Blue is the primary location that should be used for all files read or written during job execution.
  • Each group should have a Blue folder at /blue/groupname/.
  • Quotas are based on investment and are at the group level.
  • Check your quota with the blue_quota command.

Blue Storage is our main high-performance parallel filesystem. This is where all job input/output ('job I/O', i.e. reading and writing files) must happen. By default, your personal directory tree will start at /blue/groupname/username. That directory cannot be modified by other group members. There is a shared directory at /blue/groupname/share for groups that prefer to share all their data between group members.

UFIT-RC only creates the user directory for your primary group. If you have secondary groups, you will need to create your own folder in that group's directory.
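Creating that folder is a one-line operation. A sketch, where 'othergroup' is a placeholder for your secondary group name and the fallback to a temporary directory only exists so the sketch also runs off-cluster:

```shell
#!/bin/bash
# Sketch: create your personal directory under a secondary group's Blue tree.
BLUE_ROOT=/blue
[ -d "$BLUE_ROOT" ] || BLUE_ROOT=$(mktemp -d)   # off-cluster stand-in for /blue
me="${USER:-$(id -un)}"
mkdir -p "$BLUE_ROOT/othergroup/$me"
chmod 700 "$BLUE_ROOT/othergroup/$me"   # keep it private, like your primary-group directory
```

The chmod is optional; it mirrors the private permissions of the pre-created primary-group user directories.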

The parallel nature of the Blue Storage makes it very efficient at reading and writing large files, which can be 'striped' or broken into pieces to be stored on different storage servers.

Blue does not deal well with directories that contain a large number of very small files. If a job produces such files, it is advisable to make use of the Temporary Directories to alleviate the burden on Blue Storage and keep it responsive and performant for everyone.
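One common pattern is to write the small files to local scratch and ship a single archive back to Blue. A sketch, in which mktemp stands in for $SLURM_TMPDIR when run outside a job:

```shell
#!/bin/bash
# Sketch: avoid writing thousands of tiny files to Blue by creating them
# in local scratch and copying back one archive.
if [ -z "${SLURM_TMPDIR:-}" ]; then
    scratch=$(mktemp -d); made_tmp=1      # off-cluster stand-in
else
    scratch=$SLURM_TMPDIR; made_tmp=""
fi
for i in $(seq 1 100); do
    echo "record $i" > "$scratch/part_$i.txt"   # many small files
done
tar -czf results.tar.gz -C "$scratch" .         # one archive goes back to Blue
[ -z "$made_tmp" ] || rm -rf "$scratch"
```

Blue then stores one large, stripe-friendly file instead of a hundred tiny ones.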

For groups that purchased separate storage for additional projects the default path to the project directories is /blue/PROJECT. That directory is set up similarly to the 'share' directory in the primary group directory tree.

Orange Storage

Quick notes on Orange storage

  • The Orange filesystem is primarily intended for archival purposes.
  • If an investment has been made, your group's Orange folder will be at /orange/groupname.
  • Quotas are based on investment and are at the group level.
  • Check your quota with the orange_quota command.

Orange storage is cheaper than Blue, but its hardware is also more limited, so Orange cannot support the full brunt of the applications running on HiPerGator. Limit its use to long-term storage of data that is not currently in use, or to very gentle access such as serially reading raw data for QC/filtering, with the output of that first step going to your Blue Storage directory tree.

We only create the /orange/groupname directory when a new quota is added (no user or share directories like the ones pre-created for a group in /blue). Users in a group are expected to work out their own approach to storing and sharing data in their /orange directory tree.
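The recommended flow reads raw data from Orange once and writes the result to Blue. A sketch, where all paths are placeholders, grep stands in for a real QC/filtering step, and the guard lets it run where /orange is not mounted:

```shell
#!/bin/bash
# Sketch: serial read from Orange, filtered output to Blue.
RAW=/orange/mygroup/raw/reads.txt              # placeholder raw-data path
OUT=/blue/mygroup/$USER/qc/reads.filtered.txt  # placeholder output path
if [ -r "$RAW" ]; then
    mkdir -p "$(dirname "$OUT")"
    grep -v '^#' "$RAW" > "$OUT"   # stand-in for a real QC/filter step
else
    echo "raw input not present on this machine"
fi
```

All subsequent steps of the workflow then run against the copy on Blue.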

Red Storage

Red storage is fully flash-based and can support high rates of I/O. The point to remember about Red storage is that allocations are short-term and the data is removed within 24 hours of the allocation's end date. See the policy page for how to request an allocation.

Storage Backup

Unless purchased separately, NOTHING is backed up!

Storage backup is available as an option, but Home, Blue, Orange and Red do not provide any backup mechanism.

It is unfortunate, but users regularly accidentally delete important files.

Either invest in tape backup or keep your own backups of important files!

Storage Automounting

Many directories on HiPerGator make use of automounting, which means a directory is not mounted on a given server until it is accessed. As a result, if you get a listing of /blue or /orange, your group's directory may not appear! This can be concerning, but fear not: change into the group directory (for example, cd /blue/groupname) and it will be mounted and ready for use.

If you are using Jupyter Notebook or other GUI or web applications that make it difficult to browse to a specific path, you can create a symlink (shortcut) as shown in Create the Link.
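A minimal version of such a link, where 'mygroup' is a placeholder for your group name (off-cluster this creates a dangling link, which is harmless for demonstration):

```shell
#!/bin/bash
# Sketch: create a 'blue' shortcut in your home directory so GUI tools
# such as Jupyter's file browser can reach your Blue directory.
me="${USER:-$(id -un)}"
ln -sfn "/blue/mygroup/$me" "$HOME/blue"   # -n replaces an existing link cleanly
ls -ld "$HOME/blue"
```

Because opening the link accesses the target, it also triggers the automount.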

Local Scratch Storage or Temporary Directories

Quick notes on temporary directories

  • Use $SLURM_TMPDIR for a job's temporary directory.

All HiPerGator compute nodes have local storage. That storage is flash-based on HPG3 and newer nodes and can support high I/O rates; older nodes use spinning disks with lower I/O rates.

Using local scratch storage (temporary directories) on HiPerGator compute nodes is a way to insulate an analysis from most of the other jobs running on HiPerGator, which generally use Blue storage. It may therefore be possible to use $SLURM_TMPDIR to get much higher I/O rates, since the job only competes for local scratch I/O with the limited number of jobs on the same compute node that also chose to use local scratch.

The caveat is that using local scratch requires staging in the input data (copying it from /blue to $SLURM_TMPDIR within the job) and staging out the results (copying result files back to /blue), since the job's temporary scratch directory is automatically removed at the end of the job and any files left in it are irretrievably lost.
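The stage-in/stage-out pattern can be sketched as a job script. All paths are placeholders; outside a job $SLURM_TMPDIR is unset, so the sketch falls back to a temporary directory and skips the copies it cannot perform:

```shell
#!/bin/bash
#SBATCH --job-name=stage_demo
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
# Sketch of the stage-in / stage-out pattern for local scratch.
scratch="${SLURM_TMPDIR:-$(mktemp -d)}"
input=/blue/mygroup/$USER/data.in       # placeholder input file
outdir=/blue/mygroup/$USER/results      # placeholder results directory

# Stage in: copy input from Blue to fast local scratch.
cp "$input" "$scratch/" 2>/dev/null || echo "stage-in skipped: no $input"

# ... run the analysis against the files in $scratch ...

# Stage out: copy results back to Blue before the job ends, because
# $SLURM_TMPDIR is wiped automatically when the job finishes.
mkdir -p "$outdir" 2>/dev/null && cp "$scratch"/* "$outdir"/ 2>/dev/null \
    || echo "stage-out skipped on this machine"
```

The stage-out step must run inside the job script; once the job ends, the scratch directory is gone.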

Checking Quotas and Managing Space

The ufrc environment module has several tools useful for checking storage use and quotas as well as exploring directories and their space use.

  • home_quota - show your HiPerGator Home directory quota usage.
  • blue_quota - show HiPerGator Blue Storage (/blue) quota usage for your user and group.
  • orange_quota - show HiPerGator Orange Storage (/orange) quota usage for your project(s).
  • ncdu - an interactive program for showing directory sizes, browsing a directory tree, and removing files and directories in a terminal (ssh session).
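Since these tools live in the ufrc module, load it before calling them. A sketch, guarded so it also runs where the module system is absent:

```shell
#!/bin/bash
# Sketch: load the ufrc module, then run the quota tools.
if command -v module >/dev/null 2>&1; then
    module load ufrc
    home_quota
    blue_quota
else
    echo "not on HiPerGator: 'module' command unavailable"
fi
```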

Shared Work and Storage Management

Note that HiPerGator is a Red Hat Enterprise Linux based cluster and its main shared filesystems are based on Lustre. All filesystem management limitations follow from this setup.

The sponsor of a group on HiPerGator has the ultimate authority over any data produced by their group members, within the limitations of the Linux kernel, the Linux filesystem permission model, and the Lustre implementation of POSIX Access Control Lists.

  • In practical terms, the sponsor can decide on any action pertaining to the disposition of the files under their control, but how a particular change can be implemented, if it is possible at all, can vary in scope, in the amount of initial and ongoing effort required, and in the support request timeline.
  • It's important to understand both the security model and limitations imposed by the system when considering what approaches to data management within a group or a project directory are possible.

In a default setup, each primary or project group with a Blue storage quota will have a /blue/groupname top-level directory, which contains individual user directories and a share sub-directory for collaborative projects. The initial permissions are set such that only individual users have write access to their personal directories, and the share directory is group-writable. By default, group members have read access to each other's folders.

For the Orange filesystem, the /orange/groupname directory is group-writable, with no other directories or permissions created by default. Access to files in the /orange/groupname or /blue/groupname/share directories depends on individual umask settings or on chmod commands used to change permissions, and is within the purview of the group members with no RC staff involvement.

  • It is not expected that other users are given write access to another group member's files outside of the share directory.
  • When an account is deleted because of inactivity or an explicit sponsor support request, the default action is to move the user's personal directory to /blue/groupname/share and to chown all files in the Blue or Orange group directory trees that were owned by that account to the group sponsor's username.
  • A sponsor opening a support request can ask for a different disposition of the former group member's files.

If the default directory and permission configuration and the account removal procedure fit how the group operates, no further changes are necessary. However, we have observed additional approaches to group data management, as well as support requests for changes that run contrary to system limitations or are difficult to implement and maintain, and that therefore may not be advisable even if somewhat feasible.

Collaborative Approaches

In a fully collaborative research group on HiPerGator all members are encouraged to set umask 007 in their ~/.bashrc file to make sure that write permissions are set on all files and directories created by group members.

  • This will allow all users in the group to manage (read, write, execute) all group files. If security against accidental file deletion is desirable the group is advised to purchase Tivoli Backup from UFIT ICT.
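The effect of umask 007 can be demonstrated directly. On HiPerGator you would put the umask line in ~/.bashrc; the sketch below applies it to the current shell and creates a file in a temporary directory:

```shell
#!/bin/bash
# Sketch: umask 007 makes new files group-readable and group-writable.
umask 007
d=$(mktemp -d)
touch "$d/shared_file"
ls -l "$d/shared_file"    # mode is rw-rw----: group members can read and write
```

Without the umask change, the default of 022 on many systems would leave group members with read-only access.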

If there is a need to share files with members of other groups or external collaborators, multiple approaches are possible. The most straightforward is to share a directory via a Globus Collection. This approach lets both HPG and external collaborators make a copy of the data, with permissions controlled in Globus.

  • If a selected set of users from multiple HPG groups must work on a project, the preferred approach is for the sponsor or sponsors of the project to request creation of a project group and purchase a storage quota for the project. All members of the project will then be added to the project group as secondary members and will be able to manage project files in a manner similar to how the Blue share or Orange directory is managed.

If it is necessary to give a member of a different HPG group access to a directory, the most straightforward solution from a system administration viewpoint is to add that user as a secondary member of the sharing group. However, this change gives the user access to all group-readable files and the ability to use the group's computational resources on HiPerGator, which may not be desirable. The request to add a user to a group should be made by, and will require approval of, the group's sponsor via a support request.

  • There are more complex situations. Unfortunately, there is no general mechanism in Linux to allow hierarchical access permissions on filesystems.
  • It may be possible to set Lustre filesystem ACLs on /blue and /orange directories to allow more complex permissions, for example allowing a user to manage another user's files, meaning they will be able to modify, rename, move, or remove files and directories they do not own and do not have group-level write access to. Using Lustre ACLs is not a straightforward approach, and the interaction of ACLs with the Linux filesystem permissions may still preclude write access to some files and directories.
  • We are willing to make such changes or work with the group to determine a usable setfacl command. However, success is not guaranteed, and the ACL permissions may be lost or not applied to new files. Please use the RC Support System to get in touch if you really need filesystem ACLs and are having trouble, or if you need group-wide settings.
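A minimal sketch of the ACL mechanism, where 'nobody' stands in for the collaborator's username and a temporary directory stands in for a /blue or /orange path (on the real filesystems, coordinate with RC support first, since ACL behavior on Lustre has the caveats described above):

```shell
#!/bin/bash
# Sketch: granting one outside user access to a directory with POSIX ACLs.
d=$(mktemp -d)
if command -v setfacl >/dev/null 2>&1 \
   && setfacl -m u:nobody:rwx "$d" 2>/dev/null; then
    setfacl -d -m u:nobody:rwx "$d"   # default ACL: applies to new files created inside
    getfacl "$d"                      # inspect the resulting entries
else
    echo "ACLs not supported here"
fi
```

The -d (default) entry is what propagates access to newly created files; without it, the ACL applies only to the directory itself.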