ea-utils¶
Description¶
ea-utils is a set of command-line tools for processing biological sequencing data, including barcode demultiplexing and adapter trimming. The tools were written primarily to support an Illumina-based pipeline, but should work with any FASTQ files.
Environment Modules¶
Run module spider eautils
to find out what environment modules are available for this application.
Environment Variables¶
- HPC_EAUTILS_DIR - installation directory
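The following is a minimal sketch of loading the module and checking the installation directory. It assumes a default version can be loaded as `eautils` (the name used with `module spider` above); the `eautils_dir` variable is a hypothetical name used only for illustration.

```shell
# Load the module if the environment-modules command exists in this shell.
# Assumption: a default version is available under the name "eautils".
if type module >/dev/null 2>&1; then
    module load eautils || echo "could not load eautils; check 'module spider eautils'"
fi

# HPC_EAUTILS_DIR is set by the module; show a placeholder when it is unset.
eautils_dir="${HPC_EAUTILS_DIR:-<not set: module not loaded>}"
echo "ea-utils installation directory: ${eautils_dir}"
```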
Additional Usage Information¶
Command | Action |
---|---|
fastq-mcf | Scans a sequence file for adapters, and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also does skewing detection and quality filtering. |
fastq-multx | Demultiplexes a FASTQ file. Capable of auto-determining barcode IDs based on a master set of fields. Keeps multiple reads in sync during demultiplexing. Can also verify that the reads are in sync, and fail if they are not. |
fastq-join | Similar to audy's stitch program, but written in C, more efficient, and with support for some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools. |
varcall | Takes a pileup and calculates variants in a more easily parameterized manner than some other tools. |
sam-stats | Basic sam/bam stats. Like other tools, but produces what I want to look at, in a format suitable for passing to other programs. |
fastq-stats | Basic fastq stats. Counts duplicates. Option for per-cycle stats, or not (irrelevant for many sequencers). |
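As a rough illustration of how the commands in the table are typically invoked, here is a sketch for a paired-end run. All file names are hypothetical, and the `run_or_show` helper is not part of ea-utils; it only prints the command when the tool is not on the PATH. The argument order and the `-o` and `-B` options follow the tools' usage messages as best understood here, so run each tool without arguments to confirm the options for the installed version.

```shell
# Hypothetical input files; replace with your own.
adapters="adapters.fa"
barcodes="barcodes.txt"
r1="sample_R1.fastq"
r2="sample_R2.fastq"

# Illustrative helper (not part of ea-utils): run the command if the tool
# is on PATH, otherwise just print what would be run.
run_or_show() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Adapter clipping and quality filtering of a read pair,
# with one -o output file per input read file.
run_or_show fastq-mcf -o trimmed_R1.fastq -o trimmed_R2.fastq "$adapters" "$r1" "$r2"

# Demultiplexing with a barcode file; "%" in each output name is
# replaced by the barcode ID.
run_or_show fastq-multx -B "$barcodes" "$r1" "$r2" -o demux_R1.%.fastq -o demux_R2.%.fastq

# Joining overlapping read pairs; "%" in the output name is replaced
# by the output type (joined and unjoined reads).
run_or_show fastq-join trimmed_R1.fastq trimmed_R2.fastq -o joined.%.fastq
```

If the module is not loaded, the script prints the three commands instead of running them, which makes it a safe way to check the command lines before submitting a job.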
Citation¶
If you publish research that uses ea-utils, cite it as follows:
Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; https://expressionanalysis.github.io/ea-utils/
Erik Aronesty (2013). TOBioiJ : "Comparison of Sequencing Utility Programs", http://doi.org/10.2174/1875036201307010001
Categories¶
biology, genomics, ngs