EvoZip: Efficient Compression of Large Collections of Evolutionary Trees

10/17/2019
by   Balanand Jha, et al.

Phylogenetic trees represent evolutionary relationships among sets of organisms. Popular phylogenetic reconstruction approaches typically yield hundreds to thousands of trees on a common leafset. Storing and sharing such large collections of trees requires a considerable amount of space and bandwidth. Furthermore, the huge size of phylogenetic tree databases can make search and retrieval operations time-consuming. Phylogenetic compression techniques are specialized compression techniques that exploit redundant topological information to achieve better compression of phylogenetic trees. Here, we present EvoZip, a new approach for phylogenetic tree compression. On average, EvoZip achieves 71.6 [...] and 60.47 [...] algorithm for phylogenetic tree compression. While EvoZip is based on TreeZip, it improves on TreeZip through (a) an improved bipartition and support list encoding scheme, (b) use of the Deflate compression algorithm, and (c) use of an efficient tree reconstruction algorithm. EvoZip is freely available online for use by the scientific community.
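To make the idea of exploiting redundant topological information concrete, the following is a minimal Python sketch, not EvoZip's or TreeZip's actual encoding. It assumes trees are given as nested tuples of leaf names on a common leafset, stores each distinct non-trivial bipartition once as a bitmask together with the list of trees that contain it (a simple stand-in for a support list), and compresses the result with Deflate through Python's standard `zlib` module. The function names (`build_support_table`, `compress_collection`, `decompress_collection`) and the JSON layout are hypothetical, chosen only for illustration. Branch lengths and reconstruction of the trees from their bipartitions are left out.

```python
import json
import zlib


def _collect_clusters(tree, index, out):
    """Return the leaf bitmask below `tree`, adding every internal cluster to `out`."""
    if isinstance(tree, str):
        return 1 << index[tree]
    mask = 0
    for child in tree:
        mask |= _collect_clusters(child, index, out)
    out.add(mask)
    return mask


def bipartitions(tree, leaves):
    """Non-trivial bipartitions of one tree, each stored as the side without leaf 0."""
    index = {name: i for i, name in enumerate(leaves)}
    full = (1 << len(leaves)) - 1
    found = set()
    _collect_clusters(tree, index, found)
    result = set()
    for mask in found:
        if mask & 1:  # normalize: keep the side that excludes the first leaf
            mask ^= full
        size = bin(mask).count("1")
        if 2 <= size <= len(leaves) - 2:  # skip trivial splits
            result.add(mask)
    return result


def build_support_table(trees, leaves):
    """Map each distinct bipartition to the indices of the trees containing it."""
    table = {}
    for tree_id, tree in enumerate(trees):
        for mask in bipartitions(tree, leaves):
            table.setdefault(mask, []).append(tree_id)
    return table


def compress_collection(trees, leaves):
    """Serialize the shared bipartition table (illustrative layout) and Deflate it."""
    table = build_support_table(trees, leaves)
    payload = {
        "leaves": list(leaves),
        "n_trees": len(trees),
        "splits": sorted([mask, ids] for mask, ids in table.items()),
    }
    return zlib.compress(json.dumps(payload).encode("utf-8"), 9)


def decompress_collection(blob):
    """Recover the leaf names and the bipartition set of every tree."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    per_tree = [set() for _ in range(payload["n_trees"])]
    for mask, ids in payload["splits"]:
        for tree_id in ids:
            per_tree[tree_id].add(mask)
    return payload["leaves"], per_tree
```

In this sketch a bipartition shared by many trees is written once, followed by a short list of tree indices, so the redundancy across the collection is removed before Deflate is applied. A real compressor would also need to handle branch lengths and rebuild each tree from its bipartition set.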

