Using Data on HiPerGator

A big part of using HiPerGator involves interacting with data: transferring, managing, and accessing your research files efficiently and securely. Because HiPerGator supports large-scale, high-performance computing workloads, moving data to and from the system is a critical step in the research workflow. Whether you are preparing input files for simulations, uploading large datasets for analysis, or downloading results for further processing, having reliable and efficient data transfer methods is essential to your success.

See NIH GDS Data if your data falls under the current NIH Genomic Data Sharing agreement.

Data Transfer

HiPerGator provides multiple tools and protocols designed to streamline the data transfer process. From command-line utilities to graphical interfaces, these options help you work with data in a way that fits your workflow and technical comfort level.

Key topics covered on this page include:

Globus: A high-performance, secure file transfer service optimized for moving large datasets quickly and reliably across networks and institutions.
Samba: An implementation of the SMB file-sharing protocol that lets Windows systems access files on HiPerGator, allowing seamless integration with Windows file-sharing environments.
Rsync: A command-line utility for efficient file synchronization and transfer, minimizing data transfer by copying only changed parts of files.
Sharing Within A Cluster: Methods to share data and collaborate with other users within the HiPerGator cluster environment, facilitating team-based research workflows.
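
As a rough illustration of the Rsync entry above, the sketch below builds (but does not run) an rsync command line in Python. The user name, host name, and destination path are hypothetical placeholders rather than HiPerGator's actual values; substitute the details for your own account. The flags used (-a, -v, --partial, --dry-run) are standard rsync options.

```python
import shlex


def build_rsync_command(source, user, host, dest_path, dry_run=True):
    """Return an rsync argument list that copies `source` to a remote path.

    All arguments are illustrative; the host and path are assumptions,
    not values taken from the HiPerGator documentation.
    """
    # -a preserves permissions and timestamps, -v lists transferred files,
    # --partial keeps partially transferred files so a retry can resume.
    cmd = ["rsync", "-av", "--partial"]
    if dry_run:
        # --dry-run reports what would be copied without changing anything.
        cmd.append("--dry-run")
    cmd += [source, f"{user}@{host}:{dest_path}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical user, host, and destination directory.
    cmd = build_rsync_command(
        "results/", "alice", "hpg.example.edu", "/blue/mygroup/alice/results/"
    )
    # Print a shell-ready command for review before running it by hand.
    print(shlex.join(cmd))
```

Starting with a dry run is a cautious default: it lets you check which files rsync would copy before any data moves. Pass dry_run=False once the file list looks right.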

Reference Data

The UFIT Research Computing (UFRC) team provides centralized access to a curated repository of reference data on HiPerGator. This shared resource is designed to enhance researcher efficiency, reduce redundant data storage, and optimize filesystem usage. By hosting common reference datasets, UFRC eliminates the need for individual research groups to use their Blue or Orange storage quotas for frequently used data. Users can request new reference data or shared directories by submitting a support ticket.

We also provide reference AI datasets available to all HiPerGator users. This centralized collection streamlines workflows, saves storage space, and reduces costs by eliminating the need for research groups to host their own local copies. As with general reference data, users may request additions or shared directories by submitting a support ticket.

See the NIH GDS Data on HiPerGator documentation if you are looking for information on how to manage restricted-access data from NIH.