Parameter Estimation in Gaussian Mixture Models with Malicious Noise, without Balanced Mixing Coefficients
We consider the problem of estimating means of two Gaussians in a 2-Gaussian mixture, which is not balanced and is corrupted by noise of an arbitrary distribution. We present a robust algorithm to estimate the parameters, together with upper bounds on the numbers of samples required for the estimate to be correct, where the bounds are parametrised by the dimension, ratio of the mixing coefficients, a measure of the separation of the two Gaussians, related to Mahalanobis distance, and a condition number of the covariance matrix. In theory, this is the first sample-complexity result for imbalanced mixtures corrupted by adversarial noise. In practice, our algorithm outperforms the vanilla Expectation-Maximisation (EM) algorithm in terms of estimation error.
READ FULL TEXT