
ea-utils

Description

eautils website

EAUtils is a set of command-line tools for processing biological sequencing data, including barcode demultiplexing and adapter trimming. The tools were written primarily to support an Illumina-based pipeline, but should work with any FASTQ files.

Environment Modules

Run module spider eautils to find out what environment modules are available for this application.
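A typical session might look like the sketch below. The module name eautils comes from the command above; that a bare module load selects a usable default version is an assumption about the site configuration, and the status variable is only there for illustration.

```shell
# Only meaningful on a system that provides the `module` command.
if command -v module >/dev/null 2>&1; then
    # List the eautils modules and versions that are available.
    module spider eautils
    # Load the default version (assumes the site defines one).
    if module load eautils; then
        status="eautils loaded"
    else
        status="module load failed"
    fi
else
    status="module command not available"
fi
echo "$status"
```

If several versions are listed, load a specific one by giving the full name that module spider reports.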

Environment Variables

  • HPC_EAUTILS_DIR - installation directory
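As a small sketch of how this variable can be used, the snippet below reports the installation directory once the module is loaded. The msg variable is hypothetical, and the layout of the directory itself is not described here, so the snippet only prints the path.

```shell
# HPC_EAUTILS_DIR is expected to be set after the eautils module is loaded.
if [ -n "${HPC_EAUTILS_DIR:-}" ]; then
    msg="eautils is installed in: $HPC_EAUTILS_DIR"
else
    msg="HPC_EAUTILS_DIR is not set; load the eautils module first"
fi
echo "$msg"
```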

Additional Usage Information

Commands and what they do:

  • fastq-mcf - Scans a sequence file for adapters and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also performs skewing detection and quality filtering.
  • fastq-multx - Demultiplexes a FASTQ file. Can auto-determine barcode IDs based on a master set of fields. Keeps multiple reads in sync during demultiplexing, and can verify that the reads are in sync and fail if they are not.
  • fastq-join - Similar to audy's stitch program, but written in C, more efficient, and with support for some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.
  • varcall - Takes a pileup and calculates variants in a more easily parameterized manner than some other tools.
  • sam-stats - Basic SAM/BAM stats. Like other tools, but its output is in a format suitable for passing to other programs.
  • fastq-stats - Basic FASTQ stats. Counts duplicates. Per-cycle stats are optional (they are irrelevant for many sequencers).
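To illustrate how two of these tools might be chained, the sketch below trims adapters with fastq-mcf and then summarizes the result with fastq-stats. The file names adapters.fa, reads.fq and reads.trimmed.fq are hypothetical placeholders, and the argument order is an assumption based on the tools' usage messages; check each tool's own usage output before relying on it.

```shell
# Hypothetical input and output names; replace with your own files.
adapters="adapters.fa"
reads="reads.fq"
trimmed="reads.trimmed.fq"

if command -v fastq-mcf >/dev/null 2>&1 && [ -f "$adapters" ] && [ -f "$reads" ]; then
    # Clip adapters found in $reads; -o names the output file.
    fastq-mcf -o "$trimmed" "$adapters" "$reads"
    # Print basic statistics for the trimmed reads.
    fastq-stats "$trimmed"
else
    echo "fastq-mcf or the input files were not found; nothing was run" >&2
fi
```

The guard keeps the snippet harmless when the module is not loaded or the placeholder files do not exist.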

Citation

If you publish research that uses eautils, cite it as follows:

Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; https://expressionanalysis.github.io/ea-utils/

Erik Aronesty (2013). TOBioiJ : "Comparison of Sequencing Utility Programs", http://doi.org/10.2174/1875036201307010001

Categories

biology, genomics, ngs