khmer

Description

khmer is a library and suite of command-line tools for working with DNA sequences. It is aimed primarily at short-read sequencing data such as that produced by the Illumina platform.

Environment Modules

Run `module spider khmer` to list the environment modules available for this application.

Environment Variables

- HPC_KHMER_DIR
- HPC_KHMER_BIN
- HPC_KHMER_LIB
- HPC_KHMER_SANDBOX
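
The variables above can be read from Python to locate a khmer script without hard-coding an installation path. The following is a minimal sketch, not part of khmer: it assumes that HPC_KHMER_BIN and HPC_KHMER_SANDBOX each hold a directory path once the module is loaded (the names suggest this, but the page does not state it), and `find_khmer_script` is a hypothetical helper introduced only for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_khmer_script(name: str, env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the first file called `name` in the khmer script directories, or None."""
    # Assumption: these variables hold directory paths after the module is loaded.
    for var in ("HPC_KHMER_BIN", "HPC_KHMER_SANDBOX"):
        directory = env.get(var)
        if not directory:
            continue  # variable unset, e.g. the module has not been loaded
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_khmer_script("normalize-by-median.py")` after loading the module would return the script's path if it exists in either directory, and `None` otherwise. Passing the environment as a parameter keeps the helper easy to test with a plain dictionary.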

Additional Usage Information

Add `import khmer` to a script, or run it in an interactive Python session, to start using the khmer library.

Available Scripts:

- abundance-dist.py  
- count-median.py  
- do-partition.sh  
- filter-abund.py  
- find-knots.py  
- load-into-counting.py  
- merge-partitions.py  
- normalize-by-median.py  
- partition-graph.py  
- annotate-partitions.py  
- count-overlap.py  
- extract-partitions.py  
- filter-stoptags.py  
- load-graph.py  
- make-initial-stoptags.py  
- normalize-by-kadian.py  
- normalize-by-min.py

Citation

If you use the khmer software, you must cite:

Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190

If you use any of khmer's published scientific methods, you should also cite the relevant paper(s) listed below:

- Graph partitioning and/or compressible graph representation: The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in: Scaling metagenome sequence assembly with probabilistic de Bruijn graphs. Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT. Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7. doi: 10.1073/pnas.1121464109. PMID: 22847406.
- Digital normalization: The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm described in: A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data. Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH. arXiv:1203.4802 [q-bio.GN]. http://arxiv.org/abs/1203.4802
- K-mer counting: The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in: These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure. Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT. arXiv:1309.2975 [q-bio.GN]. http://arxiv.org/abs/1309.2975
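
To give a feel for the digital normalization and probabilistic k-mer counting methods cited above, here is a small pure-Python sketch. It is an illustration only and does not use or reproduce khmer's implementation or API: the `CountMinSketch` class, the `kmers` and `normalize_by_median` functions, and the default values for `k`, `cutoff`, `width`, and `depth` are all assumptions chosen for readability. The idea shown is that a read is kept only while the median estimated count of its k-mers is below a cutoff, and that counts are stored in a fixed-size structure that may overestimate but never underestimates.

```python
import hashlib
from statistics import median
from typing import Iterable, List


class CountMinSketch:
    """Approximate counter: may overestimate a count, never underestimates it."""

    def __init__(self, width: int = 10007, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item: str) -> List[int]:
        # One hash per table, derived by salting BLAKE2b with the table index.
        slots = []
        for i in range(self.depth):
            digest = hashlib.blake2b(item.encode(), digest_size=8, salt=bytes([i])).digest()
            slots.append(int.from_bytes(digest, "big") % self.width)
        return slots

    def add(self, item: str) -> None:
        for table, slot in zip(self.tables, self._slots(item)):
            table[slot] += 1

    def count(self, item: str) -> int:
        # The smallest cell is the one least inflated by hash collisions.
        return min(table[slot] for table, slot in zip(self.tables, self._slots(item)))


def kmers(read: str, k: int) -> List[str]:
    """Return every overlapping substring of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_by_median(reads: Iterable[str], k: int = 4, cutoff: int = 2) -> List[str]:
    """Keep a read only if the median count of its k-mers is below `cutoff`."""
    sketch = CountMinSketch()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k
        if median(sketch.count(km) for km in read_kmers) < cutoff:
            kept.append(read)
            # Only retained reads contribute to the counts.
            for km in read_kmers:
                sketch.add(km)
    return kept
```

Feeding this function many copies of the same read followed by one unrelated read keeps only the first few copies plus the unrelated read, which is the coverage-flattening effect that digital normalization aims for. Real data sets call for much larger tables and k values than the toy defaults used here.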

Categories

biology, ngs