On Random Tree Structures, Their Entropy, and Compression
Measuring the complexity of tree structures can be beneficial in areas that use tree data structures for storage, communication, and processing. This complexity measure can then be used to compress tree data structures to their information-theoretic limit. Moreover, mathematical modeling of trees and graphs suffers from a shortage of models for the random generation of trees. In this paper, a number of existing tree generation models, such as simply generated trees, are discussed, and their information content is analysed by means of information theory and Shannon's entropy. Subsequently, a new model for generating trees, motivated by how trees arise in practice, is introduced, and an upper bound on its entropy is calculated. This model selects a random tree from the possible spanning trees of a graph, a situation that frequently occurs in practice. Moving on to tree compression, approaches to universal compression of trees from the discussed models are presented. These approaches first transform a tree into a sequence of symbols and then apply a dictionary-based compression method. Conditions for the universality of these methods are then studied and analysed.
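To illustrate the compression pipeline described above, the following is a minimal sketch, assuming a balanced-parentheses traversal as the tree-to-sequence transform and zlib's LZ77-based compressor as a stand-in dictionary method; the paper's specific transform and dictionary scheme may differ, and the tree representation used here (a child-list dictionary with an explicit root) is purely illustrative.

```python
import zlib

def tree_to_sequence(children, root):
    """Encode a rooted tree as a balanced-parentheses symbol sequence:
    '(' on entering a node, ')' on leaving it (depth-first order)."""
    out = []
    stack = [(root, False)]
    while stack:
        node, closing = stack.pop()
        if closing:
            out.append(")")
            continue
        out.append("(")
        stack.append((node, True))
        # Push children in reverse so they are visited left to right.
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return "".join(out)

# A small rooted tree: node 0 is the root with children 1 and 2; node 1 has child 3.
children = {0: [1, 2], 1: [3]}
seq = tree_to_sequence(children, 0)              # "((())())"
compressed = zlib.compress(seq.encode("ascii"))  # dictionary-based stage
print(seq, len(seq), len(compressed))
```

The balanced-parentheses string uses 2n symbols for an n-node tree, so the interesting question, which the paper studies, is under what conditions the subsequent dictionary-based stage brings the total rate down to the entropy of the tree source.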