clean

Description

clean website

CLEAN (Contrastive Learning enabled Enzyme ANnotation) is a machine learning algorithm that assigns Enzyme Commission (EC) numbers with better accuracy, reliability, and sensitivity than all existing computational tools.

Environment Modules

Run module spider clean to find out what environment modules are available for this application.
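
As a minimal sketch of a typical session, the function below lists the available versions and then loads the default one. It assumes the module is named clean, as in the spider command above. load_clean is a hypothetical wrapper name used only for illustration, and the guard simply reports when the module command is unavailable (for example, on a machine outside the cluster).

```shell
# load_clean is a hypothetical helper name, not something the module provides.
load_clean() {
    # The module command comes from the cluster's environment modules system.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
    module spider clean   # list the available versions
    module load clean     # load the default version
}
```

Calling load_clean in a login shell or at the top of a job script should leave the environment ready for the steps described under Additional Usage Information.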

Environment Variables

  • HPC_CLEAN_DIR - installation directory
  • HPC_CLEAN_BIN - executable directory
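
A quick way to confirm that the module has been loaded is to check that both variables are set. The following is a sketch only; check_clean_env is a hypothetical helper name and is not provided by the module.

```shell
# check_clean_env is a hypothetical helper name used for illustration.
# It prints each variable that is set and returns non-zero if any is missing.
check_clean_env() {
    missing=0
    for var in HPC_CLEAN_DIR HPC_CLEAN_BIN; do
        # Look up the value of the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -n "$value" ]; then
            echo "$var=$value"
        else
            echo "$var is not set; load the module first" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_clean_env after loading the module should print both paths. A message on standard error suggests the module is not loaded in the current shell.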

Additional Usage Information

Loading the CLEAN module provides the dependencies needed to run CLEAN_infer_fasta.py. Users still need to clone the CLEAN repo from:

https://github.com/tttianhao/CLEAN

to their work directory. Once cloned, run the following command inside the CLEAN repo to clone the ESM repository and create the data/esm_data directory:

git clone https://github.com/facebookresearch/esm.git; mkdir data/esm_data

Next, run the inference script for the first time:

python CLEAN_infer_fasta.py --fasta_data price

To work with FASTA files, download the provided files from the repo and move them to data/pretrained.

Citation

Tianhao Yu, Haiyang Cui, Jianan Canal Li, Yunan Luo, Guangde Jiang, and Huimin Zhao. Enzyme function prediction using contrastive learning. Science 379(6639):1358-1363, 2023. doi:10.1126/science.adf2465. https://www.science.org/doi/abs/10.1126/science.adf2465

Categories

protein, machine_learning