ncbi-fcs

Description

ncbi-fcs website

The NCBI Foreign Contamination Screen (FCS) is a tool suite for identifying and removing contaminant sequences in genome assemblies. Contaminants are sequences in a dataset that do not originate from the biological source organism; they can arise from a variety of environmental and laboratory sources. FCS helps you remove contaminants from genomes before submission to GenBank. FCS includes FCS-adaptor and FCS-GX.

Environment Modules

Run module spider ncbi-fcs to find out what environment modules are available for this application.

Environment Variables

  • HPC_NCBIFCS_DIR - installation directory
  • HPC_NCBIFCS_BIN - executables directory
  • GXDB_LOC - database directory

Additional Usage Information

Note

  • Run unset PYTHONHOME after loading the ncbi-fcs module.

  • Request at least 512 GiB of memory to hold the database and accessory files.

  • Copy the full "gxdb" database to the Slurm $TMPDIR and set the $GXDB_LOC environment variable to point to $TMPDIR.

Example Slurm job script:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH -o fcs.%j.out
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=520gb
#SBATCH -t 12:00:00
#SBATCH --mail-user=gatorlink@ufl.edu
#SBATCH --mail-type=ALL

pwd; hostname; date

module purge
module load ncbi-fcs/0.5.4

unset PYTHONHOME

# set environment variable GX_NUM_CORES to use the requested number of CPUs
export GX_NUM_CORES=$SLURM_CPUS_PER_TASK

# copy database to temp dir created for the job
cp -r "$GXDB_LOC/gxdb" "$TMPDIR/gxdb"

# reset environment variable GXDB_LOC to point to the above location
export GXDB_LOC=$TMPDIR

cd "$SLURM_SUBMIT_DIR"

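# create output directory and screen genome
# (replace genome.fa with your assembly FASTA file and taxid with the NCBI taxonomy ID of the source organism)
mkdir -p outdir
fcs.py screen genome --fasta genome.fa --tax-id taxid --gx-db "$GXDB_LOC/gxdb" --out-dir outdir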
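
Because the database is large, the copy step in the script above could fail partway if $TMPDIR is short on space. The following is a minimal sketch of a guard for that step, not part of ncbi-fcs itself: has_room is a hypothetical helper name, and it assumes the standard du and df utilities are available on the compute node.

```shell
#!/bin/bash
# Hypothetical helper (not provided by ncbi-fcs or the module):
# succeeds only if the filesystem holding DEST has at least as much
# free space as the directory SRC currently occupies.
has_room() {
    local src=$1 dest=$2
    local need_kb free_kb

    # du -sk: total size of SRC in KiB (first field of the output)
    need_kb=$(du -sk "$src" | awk '{print $1}')

    # df -Pk: POSIX single-line format; field 4 of line 2 is available KiB
    free_kb=$(df -Pk "$dest" | awk 'NR==2 {print $4}')

    # Fail if either value could not be determined (e.g. SRC does not exist)
    [ -n "$need_kb" ] && [ -n "$free_kb" ] && [ "$free_kb" -ge "$need_kb" ]
}
```

In a job script, the function would be defined before the copy step and used to make the copy conditional, for example by running has_room "$GXDB_LOC/gxdb" "$TMPDIR" and exiting with an error message if it fails, so the job stops early instead of screening against an incomplete database.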

Categories

genomics