uproc

Description

uproc website

With rapidly increasing volumes of biological sequence data, the functional analysis of new sequences in terms of similarities to known protein families challenges classical bioinformatics. The ultrafast protein classification (UProC) toolbox implements a novel algorithm ("Mosaic Matching") for large-scale sequence analysis and is available as an open source C library. UProC is up to three orders of magnitude faster than profile-based methods and achieved up to 80% higher sensitivity on unassembled short reads (100 bp) from simulated metagenomes. Because UProC does not depend on a multiple alignment of family-specific sequences, it can, in principle, also detect KEGG Orthologs in addition to classifying protein domains according to the Pfam database. A precompiled database for KEGG Ortholog classification is provided, but its classification performance has not yet been evaluated.

Environment Modules

Run module spider uproc to find out what environment modules are available for this application.

Environment Variables

  • HPC_UPROC_DIR - installation directory
  • HPC_UPROC_BIN - executable directory
  • HPC_UPROC_LIB - library directory
  • HPC_UPROC_INC - includes directory
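
The variables listed above can be checked from a script before a job relies on them. The following is a minimal sketch in Python, assuming the variables are only defined once the uproc module has been loaded; the helper name uproc_paths is hypothetical and not part of UProC or the module system.

```python
import os

# Variable names and descriptions as listed on this page. They are assumed
# to be defined only after the uproc environment module has been loaded.
UPROC_VARS = {
    "HPC_UPROC_DIR": "installation directory",
    "HPC_UPROC_BIN": "executable directory",
    "HPC_UPROC_LIB": "library directory",
    "HPC_UPROC_INC": "includes directory",
}


def uproc_paths(environ=None):
    """Return a dict mapping each documented variable to its value, or None if unset.

    `environ` defaults to the current process environment; a plain dict can
    be passed instead, which is convenient for testing.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in UPROC_VARS}


if __name__ == "__main__":
    for name, path in uproc_paths().items():
        status = path if path is not None else "not set (is the module loaded?)"
        print(f"{name} ({UPROC_VARS[name]}): {status}")
```

If every variable is reported as not set, the module has probably not been loaded in the current shell session.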

Categories

biology, genomics