EM Converges for a Mixture of Many Linear Regressions
We study the convergence of the Expectation-Maximization (EM) algorithm for mixtures of linear regressions with an arbitrary number k of components. We show that as long as signal-to-noise ratio (SNR) is more than Õ(k^2), well-initialized EM converges to the true regression parameters. Previous results for k ≥ 3 have only established local convergence for the noiseless setting, i.e., where SNR is infinitely large. Our results establish a near optimal statistical error rate of Õ(σ√(k^2 d/n)) for (sample-splitting) finite-sample EM with k components, where d is dimension, n is the number of samples, and σ is the variance of noise. In particular, our results imply exact recovery as σ→ 0, in contrast to most previous local convergence results for EM, where the statistical error scaled with the norm of parameters. Standard moment-method approaches suffice to guarantee we are in the region where our local convergence guarantees apply.
READ FULL TEXT