SHAPEIT4

Description

shapeit4 website

SHAPEIT4 is a fast and accurate method for estimating haplotypes (also known as phasing) from SNP array and high-coverage sequencing data. Version 4 is a refactored and improved version of the SHAPEIT algorithm.

Environment Modules

Run module spider shapeit4 to find out what environment modules are available for this application.
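The lookup and load steps can be sketched as a small shell function. This is only an illustration: `load_shapeit4` is a hypothetical helper name, and the guard assumes that `module` may be missing in shells outside the cluster environment.

```shell
# Hypothetical helper: list the available SHAPEIT4 modules, then load the default one.
load_shapeit4() {
    # The 'module' command is normally defined only on cluster nodes.
    if ! command -v module >/dev/null 2>&1; then
        echo "The 'module' command was not found in this shell." >&2
        return 1
    fi
    module spider shapeit4   # show which versions are installed
    module load shapeit4     # load the default version
}

load_shapeit4 || echo "SHAPEIT4 module was not loaded."
```

To load a specific version instead of the default, use the full module name as reported by `module spider shapeit4`.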

Environment Variables

  • HPC_SHAPEIT4_DIR - installation directory
  • HPC_SHAPEIT4_BIN - executable directory
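A minimal sketch of how these variables might be inspected after loading the module. `show_shapeit4_env` is a hypothetical helper name, and the "not set" fallback is there only so the snippet also runs in a shell where the module has not been loaded.

```shell
# Hypothetical helper: print the locations exported by the shapeit4 module.
show_shapeit4_env() {
    echo "HPC_SHAPEIT4_DIR=${HPC_SHAPEIT4_DIR:-not set}"
    echo "HPC_SHAPEIT4_BIN=${HPC_SHAPEIT4_BIN:-not set}"
}

show_shapeit4_env

# List the executables only if the directory variable points somewhere real.
if [ -d "${HPC_SHAPEIT4_BIN:-}" ]; then
    ls "${HPC_SHAPEIT4_BIN}"
fi
```

If both lines report "not set", run `module load shapeit4` first.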

Job Script Examples

Below is a sample job script using SHAPEIT4:

#!/bin/bash
#SBATCH --job-name=shapeit4_test
#SBATCH --mail-type=NONE
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem-per-cpu=4gb
#SBATCH --time=24:00:00
#SBATCH --output=shapeit4_test.log

echo "Setting up test environment..."
TEST_PWD=/data/apps/tests/shapeit4
TEST_SAMPLEDIR=${TEST_PWD}/example_data
TEST_WORKDIR=${TEST_PWD}/output

cd "${TEST_PWD}" || exit 1
module load shapeit4

# Remove any previous test results and create a fresh working directory
if [ -d "${TEST_WORKDIR}" ]; then rm -rf "${TEST_WORKDIR}"; fi
mkdir -p "${TEST_WORKDIR}"

echo "Starting test run at $(date) on $(hostname)..."

shapeit4 \
    --input ${TEST_SAMPLEDIR}/unphased.vcf.gz \
    --map ${TEST_SAMPLEDIR}/chr20.b37.gmap.gz \
    --region 20 \
    --output ${TEST_WORKDIR}/phased.vcf.gz \
    --thread ${SLURM_CPUS_PER_TASK:-1}

# Test with BCF files...
shapeit4 \
    --input ${TEST_SAMPLEDIR}/unphased.bcf \
    --map ${TEST_SAMPLEDIR}/chr20.b37.gmap.gz \
    --region 20 \
    --output ${TEST_WORKDIR}/phased.bcf \
    --thread ${SLURM_CPUS_PER_TASK:-1}

# There should be some files in the work directory
echo "There should be some results listed below:"
find ${TEST_WORKDIR} -type f ! -empty -ls

echo "Test complete at $(date)."
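The `find` step in the script above only lists non-empty files, so a missing output is easy to overlook. A stricter check could look like the sketch below. `check_output` is a hypothetical helper name, the file names mirror the `--output` paths in the script, and the `./output` default is only a placeholder so the snippet runs on its own.

```shell
# Hypothetical helper: succeed only if the given file exists and is non-empty.
check_output() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING OR EMPTY: $1" >&2
        return 1
    fi
}

# TEST_WORKDIR mirrors the variable in the job script; the default is a placeholder.
TEST_WORKDIR=${TEST_WORKDIR:-./output}

status=0
for f in phased.vcf.gz phased.bcf; do
    check_output "${TEST_WORKDIR}/${f}" || status=1
done
echo "Output check finished with status ${status}"
```

Ending the job script with `exit ${status}` would make the job exit with a non-zero code when an output is missing, which Slurm reports as a failed job.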

Citation

If you publish research that uses SHAPEIT4, you have to cite it as follows:

Olivier Delaneau, Jean-Francois Zagury, Matthew R Robinson, Jonathan Marchini, Emmanouil Dermitzakis. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 2019.

Categories

biology, sequencing