llama¶
Description¶
Compared to Llama 2, Llama 3 makes several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve inference efficiency, Meta adopted grouped-query attention (GQA) for both the 8B and 70B model sizes.
Environment Modules¶
Run `module spider llama` to find out what environment modules are available for this application.
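A typical Lmod session might look like the sketch below. The specific module version shown is an assumption; use whatever versions `module spider llama` reports on your system.

```shell
# Discover which llama modules are available on the cluster
module spider llama

# Load the module (version string is hypothetical; substitute one
# reported by "module spider llama")
module load llama

# Confirm the module is now active in your environment
module list
```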
Environment Variables¶
Categories¶
library, math