textract

Description

textract website

Textract provides a single interface for extracting embedded content from any type of file, without any irrelevant markup, for further textual analysis and visualization.
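
As a rough illustration of the single interface described above, the sketch below wraps textract's Python entry point, textract.process, which returns the extracted text as bytes. The helper name extract_text and the file name report.pdf are placeholders chosen for this example, and it assumes the textract Python package is importable in the loaded environment.

```python
def extract_text(path, encoding="utf-8"):
    """Return the text content of the file at `path` as a str.

    `extract_text` is a hypothetical helper for this example, not part of textract.
    """
    # Imported inside the function so an ImportError appears only when
    # extraction is attempted, e.g. if the textract module is not loaded.
    import textract

    raw = textract.process(path)  # textract.process returns bytes
    return raw.decode(encoding)


# Example usage (report.pdf is a placeholder file name):
# text = extract_text("report.pdf")
# print(text[:200])
```

The same call is used regardless of the input format; textract picks a parser based on the file extension.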

Environment Modules

Run "module spider textract" to list the environment modules available for this application.

Environment Variables

  • HPC_TEXTRACT_DIR - installation directory
  • HPC_TEXTRACT_BIN - executable directory
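
A minimal sketch of reading these variables from Python, assuming they are set once a textract module is loaded. The function name textract_paths and the dictionary keys are illustrative only; a value of None means the variable is not set in the current environment.

```python
import os


def textract_paths(env=None):
    """Look up the textract locations from the environment.

    `textract_paths` is a hypothetical helper for this example.
    Pass a dict as `env` to inspect something other than os.environ.
    """
    env = os.environ if env is None else env
    return {
        "install_dir": env.get("HPC_TEXTRACT_DIR"),  # installation directory
        "bin_dir": env.get("HPC_TEXTRACT_BIN"),      # executable directory
    }


if __name__ == "__main__":
    for name, value in textract_paths().items():
        print(f"{name}: {value or 'not set (is the module loaded?)'}")
```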

Categories

data_science, file_management