On a Class of Markov Order Estimators Based on PPM and Other Universal Codes
We investigate a class of estimators of the Markov order for stationary ergodic processes which are a slight modification of the construction by Merhav, Gutman, and Ziv (1989). Both kinds of estimators compare the estimate of the entropy rate given by a universal code with the empirical conditional entropy of a string and return the order for which the two quantities are approximately equal. However, our modification, which we call universal Markov orders, satisfies several attractive properties not established by Merhav, Gutman, and Ziv (1989) for their original definition. Firstly, the universal Markov orders are almost surely consistent, without any restrictions. Secondly, they are upper bounded asymptotically by the logarithm of the string length divided by the entropy rate. Thirdly, if we choose Prediction by Partial Matching (PPM) as the universal code, then the number of distinct substrings whose length equals the universal Markov order constitutes an upper bound for the block mutual information. Thus universal Markov orders can also be used indirectly for the quantification of long memory in an ergodic process.
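To illustrate the general principle of such estimators, below is a minimal Python sketch, not the paper's exact definition: it returns the smallest order at which the empirical conditional entropy of the string drops to the per-symbol length of a universal code. The function names, the slack parameter `eps`, and the generic `universal_code_length` callable (e.g., a PPM code length in bits) are assumptions introduced for illustration only.

```python
import math
from collections import Counter

def empirical_conditional_entropy(x, k):
    """Empirical conditional entropy (bits/symbol) of string x given k preceding symbols."""
    n = len(x)
    if n <= k:
        return 0.0
    ctx_counts = Counter(tuple(x[i:i + k]) for i in range(n - k))
    pair_counts = Counter((tuple(x[i:i + k]), x[i + k]) for i in range(n - k))
    h = 0.0
    for (ctx, _sym), c in pair_counts.items():
        h -= c * math.log2(c / ctx_counts[ctx])
    return h / (n - k)

def markov_order_estimate(x, universal_code_length, eps=0.0):
    """Smallest order k whose empirical conditional entropy does not exceed
    the per-symbol universal code length plus a slack eps (illustrative sketch)."""
    n = len(x)
    rate = universal_code_length(x) / n  # bits per symbol of the universal code
    k = 0
    while empirical_conditional_entropy(x, k) > rate + eps:
        k += 1
        if k >= n:  # safeguard: the order cannot usefully exceed the string length
            break
    return k
```

Since the empirical conditional entropy is non-increasing in the order, stopping at the first order where it falls below the universal code rate captures the "approximately equal" criterion described above; the actual universal Markov orders of the paper are defined more carefully.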