
OrthoHMM uses highly sensitive and specific Hidden Markov Models for orthology inference.

If you find OrthoHMM useful, please cite "OrthoHMM: Improved Inference of Ortholog Groups using Hidden Markov Models." Steenwyk et al. 2024, bioRxiv. doi: 10.1101/2024.12.07.627370.


Performance

As of v0.2.0, OrthoHMM ships a built-in profile HMM + k-mer prefilter search engine that replaces the phmmer subprocess. It scales to 100 bacterial proteomes (~352K total proteins) on a single 32-core node:

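=========  =========  ========  ===========
proteomes  wall time  peak RAM  orthogroups
=========  =========  ========  ===========
5          13 s       0.29 GB   1,196
20         4 min      0.44 GB   8,680
60         28 min     1.65 GB   19,029
100        77 min     4.67 GB   27,328
=========  =========  ========  ===========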

Numbers from the bacterial scaling benchmark (RefSeq, 32 threads). The legacy phmmer path is still available via --search_mode phmmer but is no longer the default.
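
One rough way to read the benchmark is to fit a log-log slope of wall time against proteome count: a slope near 1 would indicate roughly linear growth, a slope near 2 roughly quadratic growth. The sketch below does this with awk on the four benchmark rows, with minutes converted to seconds (an approximation, since the reported times are rounded). The function name scaling_exponent is illustrative and is not part of OrthoHMM.

```shell
# scaling_exponent: read "proteomes seconds" pairs on stdin and print the
# least-squares slope of log(seconds) against log(proteomes).
scaling_exponent() {
    awk '
    {
        x = log($1); y = log($2)
        n++; sx += x; sy += y; sxx += x * x; sxy += x * y
    }
    END { printf "%.2f\n", (n * sxy - sx * sy) / (n * sxx - sx * sx) }
    '
}

# Benchmark rows: proteome count and wall time in seconds
# (13 s, 4 min, 28 min, 77 min).
printf '%s\n' "5 13" "20 240" "60 1680" "100 4620" | scaling_exponent
```

The fitted value is only a summary of four data points from one benchmark; it should not be extrapolated to other datasets or hardware.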


Quick Start

1. Install external dependencies

OrthoHMM has one required external binary — mcl — used for the default clustering step. Install via your package manager (apt install mcl, brew install mcl, conda install -c bioconda mcl) or from source.

HMMER is optional and only required if you opt into the legacy --search_mode phmmer pipeline. If you’d rather avoid the mcl external dependency, --clustering leiden --cpm_resolution auto uses pure-Python igraph/leidenalg and is competitive on most inputs.


2. Install OrthoHMM

# install
pip install orthohmm
# run
orthohmm <path_to_directory_of_FASTA_files>

Below are more detailed instructions, including alternative installation methods.


1) Installation

If you are having trouble installing OrthoHMM, please contact the lead developer, Jacob L. Steenwyk, via |contactSteenwyk|_ or |blueskySteenwyk|_ to get help.

1. Install external dependencies

OrthoHMM has one required external binary — mcl — used for the default clustering step. Install via your package manager (apt install mcl, brew install mcl, conda install -c bioconda mcl) or from source.

HMMER is optional and only required if you opt into the legacy --search_mode phmmer pipeline. If you’d rather avoid the mcl external dependency, --clustering leiden --cpm_resolution auto uses pure-Python igraph/leidenalg and is competitive on most inputs.
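
Before running OrthoHMM, it can help to confirm which clustering path is usable on your machine. The following is a minimal sketch, assuming a POSIX shell: it checks whether mcl is on PATH and, if not, prints the Leiden flags mentioned above. The helper names have and pick_clustering_flags are hypothetical and are not part of OrthoHMM.

```shell
# have NAME: succeed if NAME resolves to a command on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# pick_clustering_flags [BINARY]: print extra orthohmm flags for clustering.
# Prints an empty line when BINARY (default: mcl) is available, so the
# default clustering step is used; otherwise prints the Leiden flags.
pick_clustering_flags() {
    if have "${1:-mcl}"; then
        printf '%s\n' ""
    else
        printf '%s\n' "--clustering leiden --cpm_resolution auto"
    fi
}

if have mcl; then
    echo "mcl found: default clustering is available"
else
    echo "mcl not found: install it, or run with: $(pick_clustering_flags)"
fi
```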

2a. Install OrthoHMM from pip

To install using pip, we recommend building a virtual environment to avoid software dependency issues. To do so, execute the following commands:

# create virtual environment
python -m venv venv
# activate virtual environment
source venv/bin/activate
# install orthohmm
pip install orthohmm

Note that the virtual environment must be activated to use orthohmm.
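
If the orthohmm command is not found in a new terminal session, the virtual environment may simply not be active. A small check, assuming a POSIX shell and a virtual environment created with python -m venv (whose activate script exports the VIRTUAL_ENV variable), might look like this. The function name venv_active is illustrative.

```shell
# venv_active: succeed if a virtual environment is activated in this shell.
# The activate script created by `python -m venv` exports VIRTUAL_ENV.
venv_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_active; then
    echo "virtual environment active: $VIRTUAL_ENV"
else
    echo "no virtual environment active; run: source venv/bin/activate"
fi
```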


Install from source

Similarly, to install from source, we strongly recommend using a virtual environment. To do so, use the following commands:

# download
git clone https://github.com/JLSteenwyk/orthohmm.git
cd orthohmm/
# create virtual environment
python -m venv venv
# activate virtual environment
source venv/bin/activate
# install
make install

To deactivate your virtual environment, use the following command:

# deactivate virtual environment
deactivate

Note that the virtual environment must be activated to use orthohmm.


2b. Install OrthoHMM from source

To install from source, use the following commands:

git clone https://github.com/JLSteenwyk/orthohmm.git
cd orthohmm/
make install

If you run into permission errors when executing make install, create a virtual environment for your installation:

git clone https://github.com/JLSteenwyk/orthohmm.git
cd orthohmm/
python -m venv venv
source venv/bin/activate
make install

Note that the virtual environment must be activated to use orthohmm.


2) Usage

To use OrthoHMM in its simplest form, execute the following command:

orthohmm <path_to_directory_of_FASTA_files>
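
Because the only argument is a directory, a quick sanity check of its contents can catch a wrong path before a long run. The sketch below counts files with a few common FASTA extensions. The extension list (.fa, .faa, .fasta) is an assumption made for illustration, not a statement of what OrthoHMM accepts, and count_fasta is a hypothetical helper name.

```shell
# count_fasta DIR: print how many regular files in DIR end in one of the
# assumed FASTA extensions (.fa, .faa, .fasta). Adjust the list as needed.
count_fasta() {
    n=0
    for f in "$1"/*.fa "$1"/*.faa "$1"/*.fasta; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    printf '%s\n' "$n"
}

# Example (not run here): only start OrthoHMM if more than one file is found.
#   [ "$(count_fasta proteomes)" -gt 1 ] && orthohmm proteomes
```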