ksnp
Description
kSNP identifies the pan-genome SNPs in a set of genome sequences and estimates phylogenetic trees based on those SNPs. SNP discovery is based on k-mer analysis and requires neither a multiple sequence alignment nor the selection of a reference genome, so kSNP can take hundreds of microbial genomes as input. A SNP locus is defined by an oligo of length k surrounding a central SNP allele. kSNP can analyze both complete (finished) genomes and unfinished genomes, whether as assembled contigs or as raw, unassembled reads. Finished and unfinished genomes can be analyzed together, and kSNP can automatically download GenBank files of the finished genomes and incorporate the information in those files into the SNP annotation.
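The definition of a SNP locus above can be illustrated with a minimal Python sketch. This is not kSNP's implementation: the function name `central_snp_loci` and the toy genomes are hypothetical, and the sketch deliberately ignores reverse-complement strands, ambiguous bases, and any filtering. It only shows the core idea of grouping k-mers by their flanking sequence and reporting loci whose central base differs between genomes.

```python
from collections import defaultdict


def central_snp_loci(genomes, k):
    """Return loci (flanks joined by '.') whose central base varies.

    genomes: dict mapping a genome name to its sequence string.
    k: odd k-mer length, so that each k-mer has a single central base.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    half = k // 2

    # (left flank, right flank) -> central allele -> set of genome names
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            left, allele, right = kmer[:half], kmer[half], kmer[half + 1:]
            loci[(left, right)][allele].add(name)

    # Keep only loci where more than one central allele was observed.
    return {
        left + "." + right: {a: sorted(names) for a, names in alleles.items()}
        for (left, right), alleles in loci.items()
        if len(alleles) > 1
    }


if __name__ == "__main__":
    # Two toy sequences that differ at a single position.
    toy = {"g1": "ACGTACG", "g2": "ACGAACG"}
    print(central_snp_loci(toy, k=5))
```

Because loci are keyed by the flanking sequence alone, no alignment or reference genome is needed to compare genomes, which is the property the description highlights.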
Environment Modules
Run `module spider ksnp` to find out what environment modules are available for this application.
Environment Variables
- HPC_KSNP_DIR - installation directory
- HPC_KSNP_BIN - executable directory
- HPC_KSNP_DOC - documentation directory
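As a hedged illustration, a script could check these variables before relying on them. The sketch below assumes that the three variables listed above are set once a ksnp module is loaded; the helper name `ksnp_paths` is hypothetical and not part of kSNP or the module system.

```python
import os

# Variable names taken from the list above.
KSNP_VARS = ("HPC_KSNP_DIR", "HPC_KSNP_BIN", "HPC_KSNP_DOC")


def ksnp_paths(env=None):
    """Return the kSNP paths from the environment, or raise if any is unset.

    env: optional mapping to read from (defaults to os.environ), which
    makes the helper easy to exercise without loading a module.
    """
    env = os.environ if env is None else env
    missing = [name for name in KSNP_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "ksnp module does not appear to be loaded; missing: "
            + ", ".join(missing)
        )
    return {name: env[name] for name in KSNP_VARS}
```

Calling `ksnp_paths()` in a session where no ksnp module has been loaded raises a `RuntimeError` that names the unset variables, rather than failing later with a less obvious path error.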
Citation
If you publish research that uses kSNP, cite it as follows:
Gardner, S.N., T. Slezak, and B.G. Hall. 2015. kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genomes. Bioinformatics 31: 2877-2878 doi: 10.1093/bioinformatics/btv271.
Gardner, S.N. and Hall, B.G. 2013. When whole-genome alignments just won't work: kSNP v2 software for alignment-free SNP discovery and phylogenetics of hundreds of microbial genomes. PLoS ONE, 8(12): e81760. doi: 10.1371/journal.pone.0081760
Gardner, S.N. and Slezak, T.R. 2010. Scalable SNP analyses of 100+ bacterial or viral genomes. Journal of Forensic Research, 1:107.
Categories
biology, phylogenetics, variant_calling