BEDTools¶
Description¶
The BEDTools utilities address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely used file formats: BED, GFF/GTF, VCF, and SAM/BAM. With BEDTools, one can build sophisticated pipelines that answer complicated research questions by "streaming" several BEDTools commands together. The following are examples of common tasks that one can address with BEDTools.
1. Intersecting two BED files in search of overlapping features.
2. Culling/refining/computing coverage for BAM alignments based on genome features.
3. Merging overlapping features.
4. Screening for paired-end (PE) overlaps between PE sequences and existing genomic features.
5. Calculating the depth and breadth of sequence coverage across defined "windows" in a genome.
6. Screening for overlaps between "split" alignments and genomic features.
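To make the first and third tasks concrete, here is a minimal pure-Python sketch of what "intersecting" and "merging" mean for BED-style intervals. It is an illustration only, not how BEDTools is implemented. The function names `intersect` and `merge` and the sample coordinates are hypothetical. The sketch assumes BED's 0-based, half-open coordinates, and it assumes that merging also joins features that merely touch (book-ended features).

```python
# Features are (chrom, start, end) tuples using BED's 0-based, half-open coordinates.

def intersect(a, b):
    """Return the overlapping portion of every feature in `a` with every feature in `b`."""
    hits = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # half-open intervals overlap only if at least one base is shared
                hits.append((chrom_a, start, end))
    return hits


def merge(features):
    """Collapse overlapping or touching features on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(features):  # sort by chromosome, then start
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical sample data
a = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
b = [("chr1", 180, 250)]

print(intersect(a, b))  # [('chr1', 180, 200), ('chr1', 180, 250)]
print(merge(a))         # [('chr1', 100, 300), ('chr2', 10, 20)]
```

The nested loop is quadratic and only suitable for toy inputs; it is meant to show the semantics, not to replace the tools.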
Because all of the BEDTools accept input from standard input (stdin), one can stream (pipe) several commands together to carry out more complicated analyses. The tools also allow fine control over how output is reported. In addition to standard BED, they support sequence alignments in BAM (http://samtools.sourceforge.net/) format, features in VCF and GFF, and "blocked" BED format. The tools are quite fast and typically finish in a matter of a few seconds, even for large datasets.
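As a hedged sketch of the piping described above, the following Python helper builds a three-stage pipeline in which each later stage reads from stdin. The helper names `build_pipeline` and `run_pipeline` and the file names `a.bed` and `b.bed` are hypothetical. The sketch assumes a `bedtools` executable is on the PATH (for example after loading the module) and it only runs the command if one is found.

```python
import shlex
import shutil
import subprocess


def build_pipeline(a_bed, b_bed):
    """Build a shell pipeline: intersect two files, sort the hits, then merge them."""
    stages = [
        ["bedtools", "intersect", "-a", a_bed, "-b", b_bed],
        ["bedtools", "sort", "-i", "stdin"],   # reads the previous stage's output
        ["bedtools", "merge", "-i", "stdin"],  # merge expects sorted input
    ]
    # shlex.join quotes arguments so paths containing spaces stay intact
    return " | ".join(shlex.join(stage) for stage in stages)


def run_pipeline(a_bed, b_bed):
    """Run the pipeline and return its stdout; raises if bedtools is unavailable."""
    if shutil.which("bedtools") is None:
        raise RuntimeError("bedtools not found on PATH; load the module first")
    result = subprocess.run(
        build_pipeline(a_bed, b_bed),
        shell=True, check=True, capture_output=True, text=True,
    )
    return result.stdout


print(build_pipeline("a.bed", "b.bed"))
```

Note that with `shell=True` and no `pipefail` option, `check=True` only detects a failure in the last stage of the pipe.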
Environment Modules¶
Run `module spider bedtools` to find out what environment modules are available for this application.
Environment Variables¶
- HPC_BEDTOOLS_DIR - installation directory
- HPC_BEDTOOLS_BIN - executable directory
- HPC_BEDTOOLS_DATA - data directory
- HPC_BEDTOOLS_GENOMES - genomes directory
- HPC_BEDTOOLS_SCRIPTS - scripts directory
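A script can read these variables rather than hard-coding installation paths. The following is a minimal sketch: the helper names `bedtools_paths` and `bedtools_executable` are hypothetical, and it is an assumption that the executable inside HPC_BEDTOOLS_BIN is named `bedtools`. Variables that are not set (for example because the module is not loaded) come back as `None`.

```python
import os

# Suffixes of the HPC_BEDTOOLS_* variables listed above
SUFFIXES = ("DIR", "BIN", "DATA", "GENOMES", "SCRIPTS")


def bedtools_paths(env=None):
    """Map each variable suffix (lowercased) to its value, or None if unset."""
    env = os.environ if env is None else env
    return {suffix.lower(): env.get(f"HPC_BEDTOOLS_{suffix}") for suffix in SUFFIXES}


def bedtools_executable(env=None):
    """Return the assumed path to the bedtools binary, or None if HPC_BEDTOOLS_BIN is unset."""
    bin_dir = bedtools_paths(env)["bin"]
    if bin_dir is None:
        return None
    # Assumption: the executable is named "bedtools"
    return os.path.join(bin_dir, "bedtools")


print(bedtools_paths())
```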
Categories¶
biology, ngs