llama¶
Description¶
Compared to Llama 2, Llama 3 makes several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve inference efficiency, Meta adopted grouped-query attention (GQA) for both the 8B and 70B model sizes.
Environment Modules¶
Run `module spider llama` to find out what environment modules are available for this application.
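A typical Lmod session might look like the sketch below. The specific module version shown is an assumption; use whatever versions `module spider llama` reports on your system.

```shell
# Discover which llama modules are available on the cluster
module spider llama

# Load the module (version string is hypothetical; substitute one
# reported by "module spider llama")
module load llama

# Confirm the module is now active in your environment
module list
```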
Environment Variables¶
Categories¶
library, math