khmer

Description

khmer website

khmer is a library and suite of command line tools for working with DNA sequence. It is primarily aimed at short-read sequencing data such as that produced by the Illumina platform.

Environment Modules

Run module spider khmer to find out what environment modules are available for this application.

Environment Variables

  • HPC_KHMER_DIR
  • HPC_KHMER_BIN
  • HPC_KHMER_LIB
  • HPC_KHMER_SANDBOX
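
As a quick check that the module is loaded, the variables above can be read from Python. This is a minimal sketch: the helper name khmer_paths is hypothetical, and the variables are assumed to be set only after the khmer environment module has been loaded in the current shell.

```python
import os

# Variable names taken from the list above; they are assumed to be
# defined only after the khmer environment module has been loaded.
KHMER_VARS = ("HPC_KHMER_DIR", "HPC_KHMER_BIN", "HPC_KHMER_LIB", "HPC_KHMER_SANDBOX")


def khmer_paths(environ=os.environ):
    """Map each HPC_KHMER_* variable to its value, or None if it is unset."""
    return {name: environ.get(name) for name in KHMER_VARS}


for name, value in khmer_paths().items():
    shown = value if value is not None else "(not set; is the module loaded?)"
    print(f"{name} = {shown}")
```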

Additional Usage Information

Use import khmer in a script or in an interactive Python session to begin using khmer.
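
A guarded import can confirm that the package is visible to the active Python interpreter before a longer job starts. The sketch below assumes nothing about khmer's API beyond the package name; load_khmer is a hypothetical helper, and the version attribute is read defensively because it may not be present in every release.

```python
import importlib
import importlib.util


def load_khmer():
    """Return the imported khmer module, or None if it cannot be found."""
    if importlib.util.find_spec("khmer") is None:
        return None
    return importlib.import_module("khmer")


khmer = load_khmer()
if khmer is None:
    print("khmer not found; load the environment module first")
else:
    # __version__ is assumed, not guaranteed, so fall back to "unknown".
    print("khmer version:", getattr(khmer, "__version__", "unknown"))
```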

Available Scripts:

- abundance-dist.py  
- count-median.py  
- do-partition.sh  
- filter-abund.py  
- find-knots.py  
- load-into-counting.py  
- merge-partitions.py  
- normalize-by-median.py  
- partition-graph.py  
- annotate-partitions.py  
- count-overlap.py  
- extract-partitions.py  
- filter-stoptags.py  
- load-graph.py  
- make-initial-stoptags.py  
- normalize-by-kadian.py  
- normalize-by-min.py
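
To see which of these scripts are reachable from the current shell, the search path can be checked from Python. This is a sketch under the assumption that loading the module adds the script directory to PATH; find_scripts is a hypothetical helper and only a few script names from the list are checked.

```python
import shutil

# A few script names from the list above.
SCRIPTS = ["load-into-counting.py", "abundance-dist.py", "normalize-by-median.py"]


def find_scripts(names, path=None):
    """Map each script name to its full path on the search path, or None if absent."""
    return {name: shutil.which(name, path=path) for name in names}


for name, location in find_scripts(SCRIPTS).items():
    print(f"{name}: {location or 'not on PATH'}")
```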

Citation

If you use the khmer software, you must cite:

Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190

If you use any of khmer's published scientific methods, you should also cite the relevant paper(s) listed below:

  • Graph partitioning and/or compressible graph representation: The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in: Scaling metagenome sequence assembly with probabilistic de Bruijn graphs. Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT. Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7. doi: 10.1073/pnas.1121464109. PMID: 22847406

  • Digital normalization: The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm described in: A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data. Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH. arXiv:1203.4802 [q-bio.GN]. http://arxiv.org/abs/1203.4802

  • K-mer counting: The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in: These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure. Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT. arXiv:1309.2975 [q-bio.GN]. http://arxiv.org/abs/1309.2975
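
For readers unfamiliar with the methods named above, the idea behind digital normalization can be sketched in a few lines of plain Python. This is a conceptual illustration only, not khmer's implementation: it uses an exact in-memory counter where khmer uses a probabilistic counting structure, it ignores reverse complements, and the function names and the values of k and cutoff are hypothetical choices made for the example.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def normalize_by_median(reads, k=4, cutoff=2):
    """Keep a read only if the median count of its k-mers,
    among the reads kept so far, is below cutoff."""
    counts = Counter()  # exact counts; a simplification for illustration
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read is shorter than k
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
print(normalize_by_median(reads))
# → ['ACGTACGT', 'ACGTACGT', 'TTTTGGGG']
```

In this toy input, the repeated read is discarded once its k-mers are already well covered, while the novel read is kept. For real data, use the normalize-by-median.py script.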

Categories

biology, ngs