Statistical and Computational Guarantees for the Baum-Welch Algorithm

12/27/2015
by Fanny Yang et al.

The Hidden Markov Model (HMM) is one of the mainstays of statistical modeling of discrete time series, with applications including speech recognition, computational biology, computer vision, and econometrics. Estimating an HMM from its observation process is often addressed via the Baum-Welch algorithm, which is known to be susceptible to local optima. In this paper, we first give a general characterization of the basin of attraction associated with any global optimum of the population likelihood. By exploiting this characterization, we provide non-asymptotic, finite-sample guarantees on the Baum-Welch updates, showing geometric convergence to a small ball, of radius on the order of the minimax rate, around a global optimum. As a concrete example, we prove a linear rate of convergence for a hidden Markov mixture of two isotropic Gaussians given a suitable mean separation and an initialization within a ball of large radius around (one of) the true parameters. To our knowledge, these are the first rigorous guarantees of local convergence to global optima for the Baum-Welch algorithm in a setting where the likelihood function is nonconvex. We complement our theoretical results with thorough numerical simulations studying the convergence of the Baum-Welch algorithm and illustrating the accuracy of our predictions.
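For readers unfamiliar with the updates the abstract analyzes, the following is a minimal NumPy sketch of one Baum-Welch (EM) iteration for a K-state HMM, simplified to 1-D Gaussian emissions with a fixed, shared standard deviation. The function name and these simplifications are illustrative assumptions on my part; the paper itself treats hidden Markov mixtures of d-dimensional isotropic Gaussians.

```python
import numpy as np

def baum_welch_step(obs, pi, A, mu, sigma):
    """One E-step (scaled forward-backward) plus M-step for a
    K-state HMM with 1-D Gaussian emissions.

    obs:   (T,) observation sequence
    pi:    (K,) initial state distribution
    A:     (K, K) transition matrix, A[i, j] = P(z_{t+1}=j | z_t=i)
    mu:    (K,) emission means
    sigma: shared emission standard deviation (held fixed here)
    """
    T, K = len(obs), len(pi)
    # Emission likelihoods B[t, k] = N(obs[t]; mu[k], sigma^2)
    B = np.exp(-0.5 * ((obs[:, None] - mu[None, :]) / sigma) ** 2)
    B /= np.sqrt(2 * np.pi) * sigma

    # Scaled forward pass: alpha[t] = P(z_t | obs[:t+1])
    alpha = np.zeros((T, K))
    c = np.zeros(T)  # per-step normalizers (avoid underflow)
    alpha[0] = pi * B[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]

    # Scaled backward pass, consistent with the same normalizers
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1]) / c[t + 1]

    # Posteriors: gamma[t, k] = P(z_t = k | obs); xi over state pairs
    gamma = alpha * beta
    xi = (alpha[:-1, :, None] * A[None]
          * (B[1:] * beta[1:])[:, None, :] / c[1:, None, None])

    # M-step: posterior-weighted maximum-likelihood updates
    pi_new = gamma[0]
    A_new = xi.sum(axis=0)
    A_new /= A_new.sum(axis=1, keepdims=True)
    mu_new = (gamma * obs[:, None]).sum(axis=0) / gamma.sum(axis=0)
    return pi_new, A_new, mu_new
```

In practice this step is iterated until the parameters stabilize; the paper's guarantee concerns precisely these iterates, showing that from a suitable initialization they contract geometrically toward a ball around the true parameters rather than merely toward an arbitrary stationary point.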


