khmer¶
Description¶
khmer is a library and suite of command-line tools for working with DNA sequences. It is primarily aimed at short-read sequencing data such as that produced by the Illumina platform.
Environment Modules¶
Run module spider khmer
to find out what environment modules are available for this application.
Environment Variables¶
- HPC_KHMER_DIR
- HPC_KHMER_BIN
- HPC_KHMER_LIB
- HPC_KHMER_SANDBOX
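Once the module is loaded, these variables can be read from a script. The following is a minimal sketch, assuming each variable holds a directory path; the helper name `khmer_paths` is hypothetical and is not part of khmer.

```python
import os

# Variable names as listed above. That each one holds a directory path
# is an assumption, not something this page states.
KHMER_VARS = (
    "HPC_KHMER_DIR",
    "HPC_KHMER_BIN",
    "HPC_KHMER_LIB",
    "HPC_KHMER_SANDBOX",
)


def khmer_paths(environ=None):
    """Return {variable name: value or None} for the khmer module variables."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in KHMER_VARS}


if __name__ == "__main__":
    for name, value in khmer_paths().items():
        shown = value if value is not None else "(not set; is the module loaded?)"
        print(f"{name} = {shown}")
```

A value of `None` most likely means the khmer module has not been loaded in the current shell.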
Additional Usage Information¶
Use import khmer
in your script or in an interactive Python session to begin using khmer.
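A quick way to confirm that the import works is sketched below. It assumes nothing about the khmer API beyond the module name; the `load_khmer` helper is hypothetical, and the `__version__` attribute is treated as optional in case the installed release does not define it.

```python
import importlib


def load_khmer():
    """Return the khmer module, or None if it cannot be imported."""
    try:
        return importlib.import_module("khmer")
    except ImportError:
        return None


if __name__ == "__main__":
    khmer = load_khmer()
    if khmer is None:
        print("khmer is not importable; load the khmer environment module first.")
    else:
        # __version__ is assumed, not guaranteed; fall back if it is missing.
        print("khmer version:", getattr(khmer, "__version__", "unknown"))
```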
Available Scripts:
- abundance-dist.py
- count-median.py
- do-partition.sh
- filter-abund.py
- find-knots.py
- load-into-counting.py
- merge-partitions.py
- normalize-by-median.py
- partition-graph.py
- annotate-partitions.py
- count-overlap.py
- extract-partitions.py
- filter-stoptags.py
- load-graph.py
- make-initial-stoptags.py
- normalize-by-kadian.py
- normalize-by-min.py
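To see which of the scripts listed above are reachable from the current shell, something like the following sketch may help. It assumes the loaded module places the scripts on `PATH`, which this page does not state explicitly; `find_scripts` is a hypothetical helper, not part of khmer.

```python
import shutil

# Script names copied from the list above.
SCRIPTS = (
    "abundance-dist.py",
    "count-median.py",
    "do-partition.sh",
    "filter-abund.py",
    "find-knots.py",
    "load-into-counting.py",
    "merge-partitions.py",
    "normalize-by-median.py",
    "partition-graph.py",
    "annotate-partitions.py",
    "count-overlap.py",
    "extract-partitions.py",
    "filter-stoptags.py",
    "load-graph.py",
    "make-initial-stoptags.py",
    "normalize-by-kadian.py",
    "normalize-by-min.py",
)


def find_scripts(names=SCRIPTS, path=None):
    """Map each script name to its full path, or None if it is not found.

    path=None searches the PATH of the current environment.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_scripts().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

Scripts reported as not found may live in a separate directory, for example the one named by HPC_KHMER_SANDBOX; passing that directory as `path` repeats the search there.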
Citation¶
If you use the khmer software, you must cite:
Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190
If you use any of khmer's published scientific methods, you should also cite the relevant paper(s) listed below:
- Graph partitioning and/or compressible graph representation: The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in: Scaling metagenome sequence assembly with probabilistic de Bruijn graphs. Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT. Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7. doi: 10.1073/pnas.1121464109. PMID: 22847406
- Digital normalization: The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm, described in: A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data. Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH. arXiv:1203.4802 [q-bio.GN] http://arxiv.org/abs/1203.4802
- K-mer counting: The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in: These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure. Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT. arXiv:1309.2975 [q-bio.GN] http://arxiv.org/abs/1309.2975
Categories¶
biology, ngs