Cactus
Description
Cactus is a reference-free whole-genome multiple alignment program. Please cite the Progressive Cactus paper when using Cactus. Additional descriptions of the core algorithms can be found here and here.
Environment Modules
Run module spider cactus to find out what environment modules are available for this application.
Environment Variables
- HPC_CACTUS_DIR - installation directory
- HPC_CACTUS_BIN - executable directory
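A minimal sketch of how these variables might be checked before running a job. The helper name show_var is hypothetical and not part of the module; the sketch only assumes that both variables are set once a cactus module is loaded and unset otherwise.

```shell
#!/bin/bash
# Print one module variable, or a hint if it is empty or unset.
# show_var is an illustrative helper, not something the cactus module provides.
show_var() {
    # $1 = variable name, $2 = its current value (may be empty)
    if [ -n "$2" ]; then
        echo "$1=$2"
    else
        echo "$1 is not set (is the cactus module loaded?)"
    fi
}

# ${VAR:-} expands to an empty string when VAR is unset, so this is safe
# to run both before and after "module load cactus".
show_var HPC_CACTUS_DIR "${HPC_CACTUS_DIR:-}"
show_var HPC_CACTUS_BIN "${HPC_CACTUS_BIN:-}"
```

Running this after loading the module should show the installation and executable directories; running it without the module loaded prints the hint instead of failing.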
Job Script Examples
Using the "step-by-step" procedure with "cactus-prepare" has proven to be the most successful way to run the Cactus workflow on HiPerGator.
See below for a cactus job script example that was used for testing:
#!/bin/bash
#SBATCH --job-name=cactus_test
#SBATCH --mail-type=NONE
#SBATCH --cpus-per-task=12
#SBATCH --mem-per-cpu=10gb
#SBATCH --partition=gpu
#SBATCH --gpus=a100:1
#SBATCH --time=8:00:00
#SBATCH --output=cactus_test.log
echo "Setting up test environment..."
TEST_PWD=/data/apps/tests/cactus
TEST_DATADIR=${TEST_PWD}/example_data
TEST_WORKDIR=${TEST_PWD}/test_output
cd "${TEST_PWD}" || exit 1
module load cactus/2.0.5
# Remove any previous test results and re-create a working directory
if [ -d "${TEST_WORKDIR}" ]; then rm -rf "${TEST_WORKDIR:?}/"; fi
mkdir -p "${TEST_WORKDIR}"
mkdir -p "${TEST_WORKDIR}/workdir"
mkdir -p "${TEST_WORKDIR}/steps-output"
cd "${TEST_WORKDIR}" || exit 1
echo "Starting test run at $(date) on $(hostname)..."
# export TOIL_SLURM_ARGS="--partition gpu --gpus=a100:1"
# unset XDG_RUNTIME_DIR
# The "step-by-step" method below will produce a file with a list of cactus
# commands, each of which can be modified (if needed) and placed in its own
# separate job
cactus-prepare \
${TEST_DATADIR}/evolverMammals.txt \
--jobStore ${TEST_WORKDIR}/jobstore \
--outDir ${TEST_WORKDIR}/steps-output \
--outHal ${TEST_WORKDIR}/steps-output/evolverMammals.hal \
--outSeqFile ${TEST_WORKDIR}/steps-output/evolverMammals.txt \
--gpu \
> ${TEST_WORKDIR}/steps.sh
# For testing the installation, the following command blindly runs the steps
# generated by cactus-prepare. In real life, the user would likely modify the
# "steps.sh" file options and break up the rounds into separate SLURM jobs.
/bin/bash "${TEST_WORKDIR}/steps.sh"
echo "Test complete at $(date)."
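One way the generated command list might be broken up for separate jobs is sketched below. It assumes that steps.sh marks each section with a comment line beginning with "##"; check the generated file before relying on that. The function name split_steps and the chunk_NN.sh file names are hypothetical choices made for this illustration.

```shell
#!/bin/bash
# Split a command list into one file per section.
# ASSUMPTION: each section starts with a comment line beginning with "##".
# split_steps and the chunk_NN.sh naming are illustrative, not part of Cactus.
split_steps() {
    # $1 = steps file, $2 = output directory for the chunks
    mkdir -p "$2"
    awk -v dir="$2" '
        /^##/ { if (file) close(file); n++ }   # a "##" line opens a new chunk
        NF    { file = sprintf("%s/chunk_%02d.sh", dir, n); print > file }
    ' "$1"
}

# Run only when a steps file is given, e.g.:
#   bash split_steps.sh steps.sh chunks
if [ -f "${1:-}" ]; then
    split_steps "$1" "${2:-chunks}"
fi
```

Blank lines are dropped, and any commands that appear before the first "##" line end up in chunk_00.sh. Each chunk still needs its own #SBATCH header and resource requests before submission, and the chunks must run in their original order because later steps depend on the output of earlier ones.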
Categories
phylogenetics