On Learning Markov Chains

10/28/2018
by   Yi Hao, et al.

The problem of estimating an unknown discrete distribution from its samples is a fundamental tenet of statistical learning. Over the past decade, it has attracted significant research effort and has been solved for a variety of divergence measures. Surprisingly, an equally important problem, estimating an unknown Markov chain from its samples, is still far from understood. We consider two problems related to the min-max risk (expected loss) of estimating an unknown k-state Markov chain from its n sequential samples: predicting the conditional distribution of the next sample with respect to the KL-divergence, and estimating the transition matrix with respect to a natural loss induced by KL or a more general f-divergence measure. For the first problem, we determine the min-max prediction risk to within a linear factor in the alphabet size, showing it is Ω(k log log n / n) and O(k^2 log log n / n). For the second, if the transition probabilities can be arbitrarily small, then only trivial uniform risk upper bounds can be derived. We therefore consider transition probabilities that are bounded away from zero, and resolve the problem for essentially all sufficiently smooth f-divergences, including KL-, L_2-, Chi-squared, Hellinger, and Alpha-divergences.
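To make the two problems concrete, here is a minimal Python sketch of the setup: draw n sequential samples from a k-state chain, estimate the transition matrix from transition counts, and measure the KL loss of the predicted next-sample distribution. The add-beta smoothing rule, the value beta = 0.5, and all function names are illustrative assumptions; the abstract does not say which estimator the authors analyze, so this should not be read as their method.

```python
import math
import random


def sample_path(matrix, n, rng):
    """Draw n sequential samples from the chain with the given transition matrix.

    The initial state is chosen uniformly at random (an assumption of this sketch).
    """
    k = len(matrix)
    state = rng.randrange(k)
    path = [state]
    for _ in range(n - 1):
        state = rng.choices(range(k), weights=matrix[state])[0]
        path.append(state)
    return path


def add_beta_transition_estimate(path, k, beta=0.5):
    """Estimate the k x k transition matrix with add-beta smoothing (hypothetical choice)."""
    counts = [[0] * k for _ in range(k)]
    for prev, nxt in zip(path, path[1:]):
        counts[prev][nxt] += 1
    estimate = []
    for row in counts:
        total = sum(row) + beta * k
        estimate.append([(c + beta) / total for c in row])
    return estimate


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_matrix = [[0.9, 0.1], [0.3, 0.7]]  # toy 2-state chain, illustrative only
    path = sample_path(true_matrix, n=1000, rng=rng)
    estimate = add_beta_transition_estimate(path, k=2)
    last = path[-1]
    # Prediction loss: KL between the true and estimated next-sample distributions.
    print(kl_divergence(true_matrix[last], estimate[last]))
```

Smoothing keeps every estimated probability strictly positive, so the KL loss stays finite even for transitions that never appear in the sample path.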


