Bounds for Algorithmic Mutual Information and a Unifilar Order Estimator

by Łukasz Dębowski, et al.

Inspired by Hilberg's hypothesis, which states that mutual information between blocks of natural language grows like a power law, we seek links between the power-law growth rate of algorithmic mutual information and that of an estimator of the unifilar order, i.e., the number of hidden states of the generating stationary ergodic source in its minimal unifilar hidden Markov representation. We consider an order estimator that returns the smallest order for which the maximum likelihood is larger than a weakly penalized universal probability. This order estimator is intractable and follows the ideas of Merhav, Gutman, and Ziv (1989) and of Ziv and Merhav (1992), but in its exact form it seems to have been overlooked despite some nice theoretical properties. In particular, we can prove both strong consistency of this order estimator and an upper bound on algorithmic mutual information in terms of it. Using both results, we show that all (also uncomputable) sources of a finite unifilar order exhibit sub-power-law growth of algorithmic mutual information and of the unifilar order estimator. In contrast, we also exhibit an example of a unifilar process of countably infinite order, constructed from a deterministic pushdown automaton and an algorithmically random oracle, for which the two mentioned quantities grow as a power law with the same exponent. We also relate our results to natural language research.
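The decision rule behind the order estimator can be sketched as follows. The paper's estimator maximizes likelihood over unifilar hidden Markov models and is intractable; as an illustrative stand-in, the sketch below uses plain Markov-order maximum likelihoods and an LZ78-based universal code length, which preserves the shape of the rule: return the smallest order whose maximum likelihood exceeds the weakly penalized universal probability. All function names and the choice of penalty are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def markov_max_log_likelihood(seq, k):
    """Maximum log-likelihood (in nats) of seq under a Markov source of
    order k; empirical conditional frequencies are the ML estimates.
    (Stand-in for the paper's maximization over unifilar HMMs.)"""
    if k >= len(seq):
        return float("-inf")
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(k, len(seq)):
        ctx = seq[i - k:i]
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    return sum(c * math.log(c / ctx_counts[ctx])
               for (ctx, _), c in trans_counts.items())


def lz78_code_length(seq):
    """Crude universal code length (in nats): LZ78 incremental parsing,
    each new phrase coded as a pointer to an earlier phrase plus one
    fresh symbol. A stand-in for the universal probability."""
    dictionary = {(): 0}
    alphabet = max(len(set(seq)), 2)
    length = 0.0
    phrase = ()
    for s in seq:
        candidate = phrase + (s,)
        if candidate in dictionary:
            phrase = candidate
        else:
            length += math.log(len(dictionary)) + math.log(alphabet)
            dictionary[candidate] = len(dictionary)
            phrase = ()
    if phrase:  # cost of an unfinished final phrase
        length += math.log(len(dictionary)) + math.log(alphabet)
    return length


def estimate_order(seq, universal_code_length, penalty, max_order=8):
    """Smallest order k whose maximum likelihood exceeds the weakly
    penalized universal probability, i.e.
        -max_log_lik(k) <= universal_code_length(seq) + penalty(len(seq))."""
    target = universal_code_length(seq) + penalty(len(seq))
    for k in range(max_order + 1):
        if -markov_max_log_likelihood(seq, k) <= target:
            return k
    return max_order  # no order up to max_order met the criterion
```

For a deterministic period-2 sequence, the order-0 maximum likelihood is far below the universal code length, while order 1 fits perfectly, so the estimator returns 1; e.g. `estimate_order(tuple("ab" * 50), lz78_code_length, lambda n: math.log(n))` with a weak logarithmic penalty.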






