ncbi-fcs¶
Description¶
The NCBI Foreign Contamination Screen (FCS) is a tool suite for identifying and removing contaminant sequences in genome assemblies. Contaminants are sequences in a dataset that do not originate from the biological source organism; they can arise from a variety of environmental and laboratory sources. FCS helps you remove contaminants from genomes before submission to GenBank. FCS includes FCS-adaptor and FCS-GX.
Environment Modules¶
Run module spider ncbi-fcs
to find out what environment modules are available for this application.
Environment Variables¶
- HPC_NCBIFCS_DIR - installation directory
- HPC_NCBIFCS_BIN - executables directory
- GXDB_LOC - database directory
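A job can fail late if one of these variables is unexpectedly empty, so it may help to check them right after loading the module. The following is a minimal sketch, not part of ncbi-fcs itself: the function name check_fcs_env is hypothetical, and it assumes the module exports HPC_NCBIFCS_DIR, HPC_NCBIFCS_BIN, and GXDB_LOC as described on this page.

```shell
# Hypothetical helper (not shipped with ncbi-fcs): report which of the
# variables described on this page are missing from the environment.
check_fcs_env() {
    fcs_missing=0
    for fcs_name in HPC_NCBIFCS_DIR HPC_NCBIFCS_BIN GXDB_LOC; do
        # printenv prints nothing if the variable is not exported
        if [ -z "$(printenv "$fcs_name")" ]; then
            echo "ncbi-fcs variable not set: $fcs_name" >&2
            fcs_missing=1
        fi
    done
    return "$fcs_missing"
}
```

In a job script this could be called as `check_fcs_env || exit 1` immediately after the module load line, so the job stops before any large copy begins.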
Additional Usage Information¶
Note
- Run unset PYTHONHOME after loading the ncbi-fcs module.
- Request at least 512 GiB of memory to hold the database and accessory files.
- Copy the full "gxdb" database to the Slurm $TMPDIR and set the $GXDB_LOC environment variable to point to $TMPDIR.
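Because the database is large, the copy step in the last note is where a job is most likely to run out of space. As a rough sketch under stated assumptions, the copy could be wrapped in a function that first compares the database size with the free space at the destination. The name stage_gxdb and its two arguments are hypothetical; only the gxdb directory name comes from this page.

```shell
# Hypothetical helper: copy SRC/gxdb into DEST after checking free space.
# Usage: stage_gxdb SRC DEST
stage_gxdb() {
    gx_src=$1
    gx_dest=$2
    if [ ! -d "$gx_src/gxdb" ] || [ ! -d "$gx_dest" ]; then
        echo "expected directories $gx_src/gxdb and $gx_dest" >&2
        return 1
    fi
    # Database size and free space at the destination, both in KiB
    gx_need=$(du -sk "$gx_src/gxdb" | awk '{print $1}')
    gx_free=$(df -Pk "$gx_dest" | awk 'NR==2 {print $4}')
    if [ "$gx_free" -lt "$gx_need" ]; then
        echo "need $gx_need KiB in $gx_dest, only $gx_free KiB free" >&2
        return 1
    fi
    cp -r "$gx_src/gxdb" "$gx_dest/gxdb"
}
```

A job script might then run `stage_gxdb "$GXDB_LOC" "$TMPDIR" && export GXDB_LOC=$TMPDIR`, so that GXDB_LOC is only redirected when the copy succeeded.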
Example Slurm job script:¶
#!/bin/bash
#SBATCH --job-name test
#SBATCH -o fcs.%j.out
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=520gb
#SBATCH -t 12:00:00
#SBATCH --mail-user=gatorlink@ufl.edu
#SBATCH --mail-type=ALL
pwd; hostname; date
module purge
module load ncbi-fcs/0.5.4
unset PYTHONHOME
# set environment variable GX_NUM_CORES to use the requested number of CPUs
export GX_NUM_CORES=$SLURM_CPUS_PER_TASK
# copy the database to the temporary directory created for the job
cp -r "$GXDB_LOC/gxdb" "$TMPDIR/gxdb"
# reset environment variable GXDB_LOC to point to the above location
export GXDB_LOC=$TMPDIR
cd "$SLURM_SUBMIT_DIR"
# create output directory and screen genome
# (replace taxid with the NCBI taxonomy ID of the source organism)
mkdir -p outdir
fcs.py screen genome --fasta genome.fa --tax-id taxid --gx-db "$GXDB_LOC/gxdb" --out-dir outdir
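
The screen command above takes a FASTA file and a taxonomy ID, and a job that waits in the queue only to fail on a mistyped argument is costly. The sketch below shows one possible pre-flight check. The function name check_screen_inputs is hypothetical and not part of fcs.py; it assumes only that the FASTA file should exist and that an NCBI taxonomy ID is a positive integer.

```shell
# Hypothetical helper: check screen inputs before launching fcs.py.
# Usage: check_screen_inputs FASTA TAXID
check_screen_inputs() {
    in_fasta=$1
    in_taxid=$2
    # The FASTA file must exist and be non-empty
    if [ ! -s "$in_fasta" ]; then
        echo "FASTA file missing or empty: $in_fasta" >&2
        return 1
    fi
    # NCBI taxonomy IDs are positive integers, e.g. 9606 for Homo sapiens
    case $in_taxid in
        ''|*[!0-9]*)
            echo "tax-id must be a numeric NCBI taxonomy ID: $in_taxid" >&2
            return 1
            ;;
    esac
    return 0
}
```

Calling `check_screen_inputs genome.fa 9606 || exit 1` near the top of the script, before the database copy, would catch a leftover placeholder such as taxid early.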
Categories¶
genomics