A fast solution for clustering genetic sequences

The fast BAPS algorithm rapidly identifies genomic clusters


Fastbaps is a fast solution to the genetic clustering problem. It rapidly identifies an approximate fit to a Dirichlet process mixture (DPM) model for clustering multilocus genotype data. This efficient model-based approach can cluster datasets 10–100 times larger than existing model-based clustering methods can handle, including a dataset of over 110,000 HIV-1 pol gene sequences.
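
To give a feel for what "fitting" a model-based clustering means, here is a minimal Python sketch of the kind of score such methods compare across candidate partitions. It is not the fastbaps implementation or its API. It assumes independent sites, a symmetric Dirichlet prior with a hypothetical pseudocount `alpha`, and sequences made up only of the characters in `alphabet`. The function names are illustrative, and the DPM prior over partitions is left out.

```python
import math
from collections import Counter


def log_marginal_cluster(seqs, alphabet="ACGT", alpha=1.0):
    """Log marginal likelihood of one cluster of aligned sequences.

    Each site is scored with a Dirichlet-multinomial model (symmetric
    prior, pseudocount `alpha`), and sites are treated as independent.
    Assumes every character in `seqs` belongs to `alphabet`.
    """
    n = len(seqs)
    k = len(alphabet)
    total = 0.0
    for site in zip(*seqs):  # iterate over alignment columns
        counts = Counter(site)
        total += math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
        for allele in alphabet:
            total += math.lgamma(alpha + counts[allele]) - math.lgamma(alpha)
    return total


def log_marginal_partition(seqs, labels, alphabet="ACGT", alpha=1.0):
    """Sum the per-cluster scores for a given assignment of labels."""
    clusters = {}
    for seq, label in zip(seqs, labels):
        clusters.setdefault(label, []).append(seq)
    return sum(
        log_marginal_cluster(members, alphabet, alpha)
        for members in clusters.values()
    )


# Toy alignment: two similar pairs of sequences.
toy = ["AAAA", "AAAT", "CCGG", "CCGG"]
good = log_marginal_partition(toy, [0, 0, 1, 1])  # groups similar sequences
bad = log_marginal_partition(toy, [0, 1, 0, 1])   # mixes dissimilar sequences
print(good > bad)
```

A partition that groups similar sequences scores higher than one that mixes them. A clustering method built on this kind of model searches for a high-scoring partition. Scoring every possible partition is infeasible for large datasets, which is why a fast approximate fit matters.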
