Challenges with EM in application to weakly identifiable mixture models

02/01/2019
by   Raaz Dwivedi, et al.

We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on n i.i.d. samples are known to converge more slowly than the classical n^{-1/2} rate. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We first demonstrate via simulation studies a broad range of over-specified mixture models for which the EM algorithm converges very slowly, both in one and higher dimensions. We provide a complete analytical characterization of this behavior for fitting data generated from a multivariate standard normal distribution using a two-component Gaussian mixture with varying location and scale parameters. Our results reveal distinct regimes in the convergence behavior of EM as a function of the dimension d. In the multivariate setting (d ≥ 2), when the covariance matrix is constrained to be a multiple of the identity matrix, the EM algorithm converges in order (n/d)^{1/2} steps and returns estimates that are at a Euclidean distance of order (n/d)^{-1/4} and (nd)^{-1/2} from the true location and scale parameters, respectively. On the other hand, in the univariate setting (d = 1), the EM algorithm converges in order n^{3/4} steps and returns estimates that are at a Euclidean distance of order n^{-1/8} and n^{-1/4} from the true location and scale parameters, respectively. Establishing the slow rates in the univariate setting requires a novel localization argument with two stages, with each stage involving an epoch-based argument applied to a different surrogate EM operator at the population level. We also show multivariate (d ≥ 2) examples, involving more general covariance matrices, that exhibit the same slow rates as the univariate case.
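To make the setting concrete, the following is a minimal sketch (not the paper's code) of EM in the univariate case: data are drawn from the true model N(0, 1), and we fit the over-specified symmetric two-component mixture 0.5·N(θ, σ²) + 0.5·N(−θ, σ²) with shared scale. The parameterization and update form are standard EM for this mixture; they are an illustrative assumption, not a reproduction of the authors' experiments.

```python
import numpy as np

def em_step(theta, sigma2, x):
    """One EM step for the symmetric mixture 0.5*N(theta, sigma2) + 0.5*N(-theta, sigma2).

    Illustrative sketch only; this is textbook EM for this model,
    not the paper's implementation.
    """
    # E-step: posterior probability that each point belongs to the +theta component.
    # Mixing weights are equal, so only the squared-distance terms matter.
    log_p = -0.5 * (x - theta) ** 2 / sigma2
    log_m = -0.5 * (x + theta) ** 2 / sigma2
    w = 1.0 / (1.0 + np.exp(log_m - log_p))
    # M-step: weighted location update, then the shared-scale update.
    theta_new = np.mean((2.0 * w - 1.0) * x)
    sigma2_new = np.mean(x ** 2) - theta_new ** 2
    return theta_new, sigma2_new

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)   # samples from the true model N(0, 1)

theta, sigma2 = 0.5, 1.5          # deliberately mis-specified initialization
for _ in range(200):
    theta, sigma2 = em_step(theta, sigma2, x)
```

Because the true location is 0, the fitted mixture is over-specified and θ contracts toward 0 only sublinearly; tracking |θ| across iterations reproduces the slow-convergence behavior described above, in contrast to the geometric convergence EM enjoys for well-separated mixtures.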
