PhyKIT is a toolkit for the UNIX shell environment with numerous functions that process multiple sequence alignments and phylogenies for broad applications.
If you find PhyKIT useful, please cite: "PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data." Bioinformatics. doi: 10.1093/bioinformatics/btab096.
Quick Start
The following two commands are the quickest way to install and run PhyKIT.
# install
pip install phykit
# run
phykit -h
1) Installation
When installing with pip, we strongly recommend creating a virtual environment to avoid software dependency issues. To do so, execute the following commands:
# create virtual environment
python -m venv venv
# activate virtual environment
source venv/bin/activate
# install phykit
pip install phykit
Note that the virtual environment must be active whenever you use phykit.
When you are done using PhyKIT, you can deactivate the virtual environment with the following command:
# deactivate virtual environment
deactivate
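Because phykit is only available while the environment is active, a script can check for an active environment before calling it. The following is a minimal sketch, assuming a POSIX shell and an environment created with `python -m venv`, whose activate script exports the `VIRTUAL_ENV` variable. The helper name `venv_is_active` is hypothetical and not part of PhyKIT.

```shell
# Hypothetical helper (not part of PhyKIT): succeed only if a Python
# virtual environment is active. Relies on the VIRTUAL_ENV variable
# exported by the venv activate script.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_is_active; then
    echo "virtual environment active: ${VIRTUAL_ENV}"
else
    echo "no virtual environment active; run: source venv/bin/activate"
fi
```

The check only confirms that some virtual environment is active, not that PhyKIT is installed in it.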
Similarly, when installing from source, we strongly recommend using a virtual environment. To do so, use the following commands:
# download
git clone https://github.com/JLSteenwyk/PhyKIT.git
cd PhyKIT/
# create virtual environment
python -m venv venv
# activate virtual environment
source venv/bin/activate
# install
make install
To deactivate your virtual environment, use the following command:
# deactivate virtual environment
deactivate
Note that the virtual environment must be active whenever you use phykit.
To install via Anaconda, execute the following command:
conda install bioconda::phykit
For more information, see https://anaconda.org/bioconda/phykit
2) Usage
Get the help message from PhyKIT:
phykit -h
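If `phykit -h` fails with a "command not found" error, a likely cause is that the environment where PhyKIT was installed is not active. The sketch below, assuming a POSIX shell, checks that the command is on `PATH` before asking for the help message. The helper name `have_cmd` is hypothetical and only illustrates the check.

```shell
# Hypothetical helper (not part of PhyKIT): succeed if the named
# command can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd phykit; then
    # Print the PhyKIT help message
    phykit -h
else
    echo "phykit not found on PATH; activate the environment where it was installed" >&2
fi
```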
- About
- Usage
- General usage
- Functions by analytical category
- Alignment-based functions
- Alignment entropy
- Alignment length
- Alignment length no gaps
- Alignment outlier taxa
- Alignment recoding
- Column score
- Composition per taxon
- Compositional bias per site
- Create concatenation matrix
- Evolutionary Rate per Site
- Faidx
- Guanine-cytosine (GC) content
- Mask alignment
- Occupancy per taxon
- Pairwise identity
- Parsimony informative sites
- Plot alignment QC
- Protein-to-nucleotide alignment
- Relative composition variability
- Relative composition variability, taxon
- Rename FASTA entries
- Sum-of-pairs score
- Variable sites
- Tree-based functions
- Ancestral state reconstruction
- Concordance-aware ancestral state reconstruction
- Bipartition support statistics
- Branch length multiplier
- Collapse bipartitions
- Consensus network
- Consensus tree
- Continuous trait evolution model comparison (fitContinuous)
- Continuous trait mapping (contMap)
- Cophylogenetic plot (tanglegram)
- Covarying evolutionary rates
- Degree of violation of the molecular clock
- Density map
- Evolutionary tempo mapping
- Discordance asymmetry
- Evolutionary rate
- Hidden paralogy check
- Internal branch statistics
- Internode labeler
- Last common ancestor subtree
- Lineage-through-time plot and gamma statistic
- Long branch score
- Monophyly check
- Multi-regime OU models (OUwie)
- Nearest neighbor interchange
- Network signal
- OU shift detection (l1ou)
- Patristic distances
- Phenogram (traitgram)
- Phylogenetic GLM
- Phylogenetic Ordination
- Phylogenetic regression (PGLS)
- Phylogenetic signal
- Phylomorphospace
- Polytomy testing
- Print tree
- Prune tree
- Quartet network
- Rate heterogeneity test (multi-rate Brownian motion)
- Rename tree tips
- Robinson-Foulds distance
- Root tree
- Spurious homolog identification
- Stochastic character mapping (SIMMAP)
- Terminal branch statistics
- Threshold model
- Tip labels
- Tip-to-tip distance
- Tip-to-tip node distance
- Total tree length
- Treeness
- Alignment- and tree-based functions
- Tutorials
- 1. Summarizing information content
- 2. Evaluating gene-gene covariation
- 3. Identifying signatures of rapid radiations
- 4. Evaluating the accuracy of a multiple sequence alignment
- 5. Mapping the evolutionary history of discrete traits
- 6. Testing for phylogenetic signal in continuous traits
- 7. Phylogenetic ordination for multivariate trait analysis
- 8. Visualizing trait evolution with phylomorphospace
- 9. Phylogenetic regression (PGLS)
- 10. Phylogenetic GLM for binary and count data
- 11. Reconstructing ancestral trait values and mapping them onto a phylogeny
- Step 0: Prepare data
- Step 1: Run fast ancestral reconstruction with confidence intervals
- Step 2: Use the VCV-based ML method
- Step 3: Generate a contMap plot
- Step 4: Use a multi-trait file
- Step 5: Export results as JSON
- Step 6: Reconstruct discrete traits
- Step 7: Choose a discrete model
- Step 8: Plot discrete ancestral states
- Summary
- 12. Testing for rate heterogeneity across phylogenetic regimes
- 13. Visualization commands
- 14. Comparing continuous trait evolution models
- 15. Multi-regime OU models (OUwie)
- 16. Automatic detection of adaptive shifts on a phylogeny
- 17. Visualizing conflicting phylogenetic signal with splits networks
- 18. End-to-end comparative methods workflow
- 19. Gene tree discordance analysis pipeline
- Change log
- Other software
- FAQ