impg

Description

impg website

IMPG (Implicit Pangenome Graph) projects sequence ranges through many-way (e.g. all-vs-all) pairwise alignments built by tools such as wfmash and minimap2. At its core, impg lifts over ranges from a target sequence into the other genomes described in the alignments. In effect, it collects the homologous loci from every genome mapped onto a chosen target region, which is particularly useful for comparing a specific genomic region across individuals, strains, or species in a pangenomic or comparative genomic setting. Output is in BED format, so the resulting ranges can be used directly to extract FASTA sequences for downstream multiple sequence alignment (e.g., mafft) or pangenome graph building (e.g., pggb or minigraph-cactus). impg uses coitrees (implicit interval trees) for efficient range lookup over the input alignments and converts CIGAR strings to a compact delta encoding. Together, these allow fast and memory-efficient projection of sequence ranges through alignments.
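To make the idea of lifting a range through an alignment concrete, here is a minimal Python sketch. It is not impg's implementation: it ignores interval trees, delta encoding, and reverse-strand alignments, and the function names `project_range` and `bed_line` are hypothetical. It assumes a forward-strand alignment whose CIGAR uses only the `M`, `=`, `X`, `I`, and `D` operations, with `I` consuming query bases and `D` consuming target bases.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([M=XID])")


def project_range(cigar, target_start, query_start, start, end):
    """Project the target interval [start, end) onto the query sequence.

    target_start and query_start are the 0-based positions where the
    alignment begins on each sequence (forward strand assumed).
    Returns (query_lo, query_hi), or None if no aligned base overlaps.
    """
    t, q = target_start, query_start
    q_lo = q_hi = None
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":  # aligned block: advances both sequences
            lo, hi = max(t, start), min(t + n, end)
            if lo < hi:  # block overlaps the requested target range
                if q_lo is None:
                    q_lo = q + (lo - t)
                q_hi = q + (hi - t)
            t += n
            q += n
        elif op == "D":  # bases present only in the target
            t += n
        elif op == "I":  # bases present only in the query
            q += n
        if t >= end:  # past the requested range, nothing more to find
            break
    return None if q_lo is None else (q_lo, q_hi)


def bed_line(name, interval):
    """Format a projected interval as a three-column BED record."""
    return f"{name}\t{interval[0]}\t{interval[1]}"


if __name__ == "__main__":
    hit = project_range("10=2D5=3I10=", 100, 200, 105, 120)
    print(bed_line("sampleB#chr1", hit))
```

The sequence name `sampleB#chr1` and the coordinates are made-up illustration values. A real tool would repeat this projection for every alignment overlapping the target range, which is where an interval index pays off.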

Environment Modules

Run module spider impg to find out what environment modules are available for this application.

Environment Variables

  • HPC_IMPG_DIR - installation directory
  • HPC_IMPG_BIN - executable directory
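
As a small illustration of how a script might use these variables, the sketch below builds the path to the executable from HPC_IMPG_BIN. It assumes the binary inside that directory is named `impg`, which this page does not state, and the helper name `impg_executable` is hypothetical.

```python
import os
from pathlib import Path


def impg_executable(env=os.environ):
    """Return the expected path of the impg binary.

    Assumption: the executable in HPC_IMPG_BIN is called "impg".
    """
    bin_dir = env.get("HPC_IMPG_BIN")
    if bin_dir is None:
        # The variable is only defined once an impg environment module is loaded.
        raise RuntimeError("HPC_IMPG_BIN is not set; load an impg module first")
    return Path(bin_dir) / "impg"
```

Passing the environment as an argument keeps the helper easy to test without changing the real process environment.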

Categories

biology, alignment