MCL
Description
MCL, short for the Markov Cluster Algorithm, is a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs. The algorithm was invented by Stijn van Dongen at the Centre for Mathematics and Computer Science (also known as CWI) in the Netherlands.
The PhD thesis Graph clustering by flow simulation is centered around this algorithm. Its main topics are the mathematical theory behind the algorithm, its position in cluster analysis and graph clustering, issues concerning scalability, implementation, and benchmarking, and performance criteria for graph clustering in general. The work for this thesis was carried out under the supervision of Jan van Eijck and Michiel Hazewinkel. The thesis, technical reports, and preprints are available on the MCL project website, which also offers a flow pictorial and an animation of the MCL process for quickly getting an idea of how MCL operates.
The basic interface to the algorithm is very simple: only one option, the -I flag (inflation), is needed to get to the heart of it. For large graphs you should also be aware of the -scheme flag, which regulates resource usage. The default approach is to vary the argument to -I over some interval, doing one mcl run for each value, and then to analyze the resulting clusterings with the other programs that come with MCL (see the mcl manuals).
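The sweep over -I values described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not a recommended workflow: the input file graph.abc (label/ABC format, read with --abc), the inflation values, the output naming pattern, and the helper names sweep_inflation and MCL_CMD are all hypothetical and chosen only for illustration.

```shell
# Run mcl once per inflation value.
# Usage: sweep_inflation INPUT_FILE INFLATION [INFLATION ...]
# MCL_CMD is a hypothetical override for the mcl executable;
# setting MCL_CMD="echo mcl" previews the commands without running them.
sweep_inflation() {
    input=$1
    shift
    for inflation in "$@"; do
        # Turn an inflation value such as 1.4 into a file-name suffix such as 14
        suffix=$(printf '%s' "$inflation" | tr -d '.')
        # MCL_CMD is intentionally unquoted so "echo mcl" splits into two words
        ${MCL_CMD:-mcl} "$input" --abc -I "$inflation" \
            -o "out.$(basename "$input").I${suffix}"
    done
}
```

With the mcl module loaded, a call such as `sweep_inflation graph.abc 1.4 2.0 4.0` would produce one clustering file per inflation value, which can then be compared with the companion programs mentioned above.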
Environment Modules
Run module spider mcl to find out what environment modules are available for this application.
Environment Variables
- HPC_MCL_DIR - installation directory
- HPC_MCL_BIN - executable directory
- HPC_MCL_CONF - configuration file directory
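As a quick check that the variables listed above are defined in the current shell, something like the following sketch may help. It assumes the variables are exported once an mcl module has been loaded (for example with `module load mcl`, where the exact module name and version should be taken from the `module spider mcl` output); report_mcl_env is a hypothetical helper name.

```shell
# Print each HPC_MCL_* variable, or a hint if it is not set in the environment.
report_mcl_env() {
    for name in HPC_MCL_DIR HPC_MCL_BIN HPC_MCL_CONF; do
        # printenv exits non-zero when the variable is not exported
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set (is the mcl module loaded?)\n' "$name"
        fi
    done
}

report_mcl_env
```

If all three variables are reported as unset, the module has most likely not been loaded in the current session.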
Categories
biology, genomics