Library of efficient algorithms for phylogenetic analysis

12/23/2020 ∙ by Luana Silva, et al.

Evolutionary relationships between species are usually inferred through phylogenetic analysis, which produces phylogenetic trees computed from allelic profiles. These profiles are built by sequencing specific regions and abstracting them to categorical indexes. With growing exchanges of people and merchandise, epidemics have become increasingly important, and combining information from country-specific datasets can now reveal previously unknown spreading patterns. The phylogenetic analysis workflow is composed mainly of four consecutive steps: distance calculation, distance correction, inference algorithm, and local optimization. Many phylogenetic tools exist, but most implement only some of these steps and serve a single purpose, such as one type of algorithm. They are also often hard to use and integrate, since each has its own API. This project resulted in a library that implements all four steps of the workflow and is highly performant, extensible, reusable, and portable, while providing common APIs and documentation. It also differs from other tools in that it can stop and resume the workflow whenever the user desires, and it was built to be continuously extended rather than to serve a single purpose. The time benchmarks conducted on this library show that its implementations of the algorithms conform to their theoretical time complexity. The memory benchmarks show that the implementations of the NJ algorithms follow a linear memory complexity, while the implementations of the MST and GCP algorithms follow a quadratic memory complexity.
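To make the workflow concrete, the following is a minimal sketch of two of the four steps: distance calculation (here assumed to be the Hamming distance between allelic profiles) and an inference algorithm (here Prim's minimum spanning tree). The function names `hamming_distance`, `distance_matrix`, and `prim_mst` are hypothetical and are not the library's API; distance correction and local optimization are omitted. The full n × n distance matrix kept in this sketch is one way an MST-based approach can end up with quadratic memory use.

```python
from typing import List, Sequence, Tuple

# An allelic profile: one categorical index per locus (illustrative type alias).
Profile = Sequence[int]


def hamming_distance(a: Profile, b: Profile) -> int:
    """Count the loci at which two allelic profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(profiles: Sequence[Profile]) -> List[List[int]]:
    """Distance calculation step: build the symmetric n x n matrix."""
    n = len(profiles)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hamming_distance(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def prim_mst(matrix: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Inference step (assumed here: Prim's MST on a dense matrix).

    Returns the tree as a list of (parent, child, weight) edges.
    """
    n = len(matrix)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known edge into each node
    parent = [-1] * n
    best[0] = 0                 # start the tree at profile 0
    edges: List[Tuple[int, int, int]] = []
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, matrix[parent[u]][u]))
        # Relax the edges from u to every node still outside the tree.
        for v in range(n):
            if not in_tree[v] and matrix[u][v] < best[v]:
                best[v] = matrix[u][v]
                parent[v] = u
    return edges


# Made-up profiles, for illustration only.
profiles = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 7, 5], [9, 9, 9, 9]]
matrix = distance_matrix(profiles)
tree = prim_mst(matrix)
for parent_idx, child_idx, weight in tree:
    print(f"{parent_idx} -> {child_idx} (distance {weight})")
```

Keeping the two steps as separate functions with plain data passed between them mirrors the idea of a workflow that can be stopped and resumed: the distance matrix could be stored after the first step and the inference run later.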


