llama

Description

Llama website

Compared to Llama 2, Llama 3 makes several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, leading to substantially improved model performance. To improve inference efficiency, the Meta team has adopted grouped-query attention (GQA) across both the 8B and 70B model sizes.
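The idea behind grouped-query attention can be sketched as follows: several query heads share a single key/value head, which shrinks the KV cache and speeds up inference. This is a minimal NumPy illustration of the mechanism, not Llama 3's actual implementation; all names and shapes here are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(q, k, v, num_kv_heads):
    """Minimal grouped-query attention sketch (illustrative, not Llama 3's code).

    q has shape (num_heads, seq, d); k and v have shape (num_kv_heads, seq, d).
    Each group of num_heads // num_kv_heads query heads shares one K/V head.
    """
    num_heads, seq, d = q.shape
    group = num_heads // num_kv_heads
    # Broadcast each K/V head across its group of query heads
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    # Standard scaled dot-product attention per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 8 query heads sharing 2 K/V heads (4 heads per group)
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))
k = rng.standard_normal((2, 4, 16))
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v, num_kv_heads=2)
```

With `num_kv_heads` equal to the number of query heads this reduces to standard multi-head attention; with `num_kv_heads=1` it becomes multi-query attention. GQA sits between the two, trading a small quality cost for a much smaller KV cache.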

Environment Modules

Run module spider llama to find out what environment modules are available for this application.
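A typical session on an Lmod-based cluster might look like the sketch below; the exact module versions reported by `module spider` vary by system, so the version-free load shown here is an assumption:

```shell
# List the available versions of the llama module
module spider llama

# Load the module (append a specific version string from the
# spider output if your site requires one)
module load llama
```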

Environment Variables

Categories

library, math