
AI Models

The UFIT Research Computing AI Support Team maintains a suite of commonly used AI models on HiPerGator. Users may copy these models to their own space, make modifications, and follow the instructions provided to run jobs on HiPerGator. Each model directory includes a README file with additional information.
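As a minimal sketch of the copy step, assuming Python is available in your environment, something like the following could stage a model into your own space. The destination path is a placeholder; substitute your actual group, username, and storage allocation on HiPerGator.

```python
import shutil
from pathlib import Path

# Shared, read-only copy maintained by the AI Support Team (see the listing below).
src = Path("/data/ai/models/multimodel/clip/clip-vit-base-patch32")

# Placeholder destination: replace "mygroup" and "myuser" with your own
# group and username (or any directory you have write access to).
dst = Path("/blue/mygroup/myuser/models/clip-vit-base-patch32")

# Copy the model directory so it can be modified without touching the shared copy.
shutil.copytree(src, dst, dirs_exist_ok=True)
print(f"Copied {src} -> {dst}")
```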

For assistance with these models or AI-related questions, submit a ticket via UFIT Research Computing Support.

AI Models on HiPerGator

Each entry below lists the model's directory path on HiPerGator, its size on disk, version (where applicable), source URL, license, date, category, and a short description.

Ultralytics YOLO (v8)
Path: /data/ai/models/computer_vision/ultralytics_yolov8
Size: 605.2 MiB
URL: https://github.com/ultralytics/ultralytics
License: AGPL-3.0, an OSI-approved open-source license for students and enthusiasts; an Enterprise License is available for commercial use (https://github.com/ultralytics/ultralytics?tab=readme-ov-file#license)
Date: 5-May-24
Category: Computer vision
Description: YOLOv8 Detect, Segment, and Pose models pretrained on the COCO dataset are available here, as well as YOLOv8 Classify models pretrained on the ImageNet dataset. Track mode is available for all Detect, Segment, and Pose models.

AlphaFold (v2.0.0)
Path: /data/ai/models/healthcare_life_science/proteinfolding/alphafold
Size: 8.7 GiB
URL: https://github.com/deepmind/alphafold
License: Apache License 2.0
Date: 6-Jul-22
Category: Healthcare and life science
Description: Predicts protein structures. If you publish research using AlphaFold, the original paper must be cited.

RoseTTAFold (v1.0.0)
Path: /data/ai/models/healthcare_life_science/proteinfolding/rosettafold
Size: 1.0 GiB
URL: https://github.com/RosettaCommons/RoseTTAFold
License: MIT License (https://github.com/RosettaCommons/RoseTTAFold/blob/main/LICENSE)
Date: 11-Mar-21
Category: Healthcare and life science
Description: Predicts protein structures. If you publish research using RoseTTAFold, the original paper must be cited (https://www.biorxiv.org/content/10.1101/2021.06.14.448402v).

StyleGAN 3
Path: /data/ai/models/nvidia/stylegan3
Size: 7.2 GiB
URL: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/research/models/stylegan3
License: Nvidia Source Code License (https://github.com/NVlabs/stylegan3/blob/main/LICENSE.txt)
Date: 29-Apr-24
Category: Imaging
Description: StyleGAN3 is a generative model for high-quality image synthesis that offers fine-grained control over image style and content, making it well suited to creative and enterprise applications.

CLIP (openai/clip-vit-base-patch32)
Path: /data/ai/models/multimodel/clip/clip-vit-base-patch32
Size: 3.4 GiB
URL: https://huggingface.co/openai/clip-vit-base-patch32
License: Apache License 2.0
Date: 17-Jul-23
Category: Multimodal
Description: The clip-vit-base-patch32 model uses a ViT-B/32 Transformer architecture as its image encoder and a masked self-attention Transformer as its text encoder. The encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss.

BiomedCLIP (microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224)
Path: /data/ai/models/multimodel/clip/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
Size: 1.5 GiB
URL: https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
License: MIT License (https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md)
Date: 17-Jul-23
Category: Multimodal
Description: BiomedCLIP is a biomedical vision-language foundation model pretrained with contrastive learning on PMC-15M, a dataset of 15 million figure-caption pairs extracted from biomedical research articles in PubMed Central. It uses PubMedBERT as the text encoder and a Vision Transformer as the image encoder, with domain-specific adaptations. It can perform various vision-language processing (VLP) tasks such as cross-modal retrieval, image classification, and visual question answering.

Gemma
Path: /data/ai/models/nlp/gemma
Size: 250.1 GiB
URL: https://ai.google.dev/gemma
License: Gemma terms of use (https://ai.google.dev/gemma/terms)
Date: 9-Apr-24
Category: NLP
Description: Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning "precious stone."

Llama (Llama 2, Llama 3)
Path: /data/ai/models/nlp/llama
Size: 6.7 TiB
URL: https://llama.meta.com/
License: Llama community license (https://llama.meta.com/llama2/license/)
Date: 19-Apr-24
Category: NLP
Description: Llama models are powerful language models developed by Meta AI. The latest version, Llama 3, significantly improves performance and accessibility for a wide range of natural language processing tasks.

Meditron
Path: /data/ai/models/nlp/meditron
Size: 564.2 GiB
URL: https://github.com/epfLLM/meditron
License: Llama 2 license (https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/blob/main/LICENSE.txt)
Date: 3-May-24
Category: NLP
Description: Meditron is a suite of open-source medical large language models (LLMs). The team provides Meditron-7B and Meditron-70B, fine-tuned for medical tasks on a diverse medical dataset. Meditron-70B outperforms models such as Llama-2-70B, GPT-3.5, and Flan-PaLM across multiple medical reasoning tasks.

Megatron-LM (versions 2.2, 2.5, 3.0)
Path: /data/ai/models/nlp/megatron
Size: 22.3 GiB
URL: https://github.com/NVIDIA/Megatron-LM
License: Apache License 2.0 (https://github.com/NVIDIA/Megatron-LM/blob/main/LICENSE)
Category: NLP
Description: Megatron-LM is a large transformer language model and training framework developed by the Applied Deep Learning Research team at NVIDIA.

Mistral AI
Path: /data/ai/models/nlp/mistral_ai
Size: 875.6 GiB
URL: https://mistral.ai/
License: Apache License 2.0
Date: 9-Apr-24
Category: NLP
Description: Mistral AI offers a variety of language models, including open-weights models such as Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B, as well as optimized commercial models such as Mistral Small, Mistral Medium, Mistral Large, and Mistral Embeddings.

DNABERT (1.2)
Path: /data/ai/models/nvidia/bionemo/dnabert
Size: 64.3 GiB
URL: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/dnabert
Date: 28-Feb-24
Category: NLP
Description: DNABERT is a DNA sequence model trained on sequences from the human reference genome Hg38.p13. It generates a dense representation of a genome sequence by identifying contextually similar sequences in the human genome.

Nemo_24.01_Gemma
Path: /data/ai/models/nvidia/nemo/nemo_24.01.gemma
Size: 21.4 GiB
URL: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags
License: NVIDIA AI Product Agreement (https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/)
Date: 18-Apr-24
Category: NLP
Description: NeMo framework container with the pre-trained model Gemma.

Nemo_24.03_StarCoder2 (v2)
Path: /data/ai/models/nvidia/nemo/nemo_24.01.starcoder2
Size: 22.6 GiB
URL: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags
License: NVIDIA AI Product Agreement (https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/)
Date: 18-Apr-24
Category: NLP
Description: NeMo framework container with the pre-trained model StarCoder2.

Nemo_24.03_CodeGemma
Path: /data/ai/models/nvidia/nemo/nemo_24.03.codegemma
Size: 20.2 GiB
URL: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags
License: NVIDIA AI Product Agreement (https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/)
Date: 18-Apr-24
Category: NLP
Description: NeMo framework container with the pre-trained model CodeGemma.

GatorTron
Path: /data/ai/models/nlp/gatortron
Size: 50.2 GiB
URL: https://huggingface.co/UFNLP
License: Apache License 2.0
Date: 3-May-24
Category: NLP
Description: GatorTron is a large clinical language model developed by researchers at University of Florida Health in collaboration with NVIDIA. It is designed to accelerate research and medical decision-making by extracting insights from massive volumes of clinical data.
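For checkpoints stored in Hugging Face format, such as CLIP above, it is often possible to load the model directly from the shared directory path instead of downloading it from the Hub. The sketch below assumes the clip-vit-base-patch32 directory contains the standard Hugging Face files (config, weights, processor) and that the transformers and Pillow packages are available in your environment; consult the model's README for the recommended workflow.

```python
# Hedged example: load the shared CLIP checkpoint with Hugging Face transformers.
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

# Shared directory on HiPerGator (or your own copy, if you modified it).
model_dir = "/data/ai/models/multimodel/clip/clip-vit-base-patch32"

model = CLIPModel.from_pretrained(model_dir)
processor = CLIPProcessor.from_pretrained(model_dir)

image = Image.open("example.jpg")  # placeholder image path
inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True,
)

# Score the image against each text prompt.
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)
print(probs)
```

Other Hugging Face-format checkpoints in the list can typically be loaded the same way, with the appropriate model and tokenizer/processor classes substituted.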