An Adaptive EM Accelerator for Unsupervised Learning of Gaussian Mixture Models

09/26/2020
by Truong Nguyen, et al.

We propose an Anderson Acceleration (AA) scheme for the adaptive Expectation-Maximization (EM) algorithm for unsupervised learning of a finite mixture model from multivariate data (Figueiredo and Jain 2002). The proposed algorithm determines the optimal number of mixture components autonomously, and converges to the optimal solution much faster than its non-accelerated version. The success of the AA-based algorithm stems from several developments rather than a single breakthrough (without them, our tests demonstrate that AA fails catastrophically). First, we ensure the monotonicity of the likelihood function (a key feature of the standard EM algorithm) with a recently proposed monotonicity-control algorithm (Henderson and Varadhan 2019), enhanced by a novel monotonicity test with little overhead. Second, we propose nimble strategies for AA that strictly preserve the positivity of the Gaussian weights and the positive definiteness of the covariance matrices, and that exactly conserve the moments of the observed data set up to second order. Finally, we employ a K-means clustering algorithm using the gap statistic to avoid excessively overestimating the initial number of components, thereby maximizing performance. We demonstrate the accuracy and efficiency of the algorithm with several synthetic data sets that are mixtures of Gaussian distributions with a known number of components, as well as with data sets generated from particle-in-cell simulations. Our numerical results demonstrate speed-ups with respect to non-accelerated EM of up to 60X when the exact number of mixture components is known, and from a few times to more than an order of magnitude with component adaptivity.
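
To make the idea of accelerating EM with AA under a monotonicity safeguard concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a one-dimensional mixture with a fixed number of components, and it replaces the monotonicity control described above with a much simpler fallback: the AA extrapolation is accepted only if it is a valid parameter vector and does not lower the log-likelihood relative to the plain EM step. Component adaptivity, the gap-statistic K-means initialization, and the moment-conserving strategies are not modeled. All function names (`em_step`, `log_likelihood`, `is_valid`, `aa_em`) and the parameter packing are hypothetical choices made for illustration.

```python
import numpy as np

def _log_joint(theta, x, k):
    # log(w_j * N(x_i | mu_j, var_j)) for every point i and component j; shape (n, k).
    # theta packs [weights, means, variances] (an illustrative layout, not from the paper).
    w, mu, var = theta[:k], theta[k:2 * k], theta[2 * k:]
    return np.log(w) - 0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)

def log_likelihood(theta, x, k):
    lj = _log_joint(theta, x, k)
    m = lj.max(axis=1)
    return float(np.sum(m + np.log(np.exp(lj - m[:, None]).sum(axis=1))))

def em_step(theta, x, k):
    # One plain EM update: the fixed-point map G(theta) that AA accelerates.
    lj = _log_joint(theta, x, k)
    r = np.exp(lj - lj.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)            # E-step: responsibilities
    nk = r.sum(axis=0)                           # M-step: weights, means, variances
    mu = r.T @ x / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return np.concatenate([nk / x.size, mu, var])

def is_valid(theta, k):
    # Weights and variances must stay strictly positive and finite.
    return bool(np.all(np.isfinite(theta)) and np.all(theta[:k] > 0)
                and np.all(theta[2 * k:] > 0))

def aa_em(theta0, x, k, window=3, tol=1e-8, max_iter=500):
    theta = np.asarray(theta0, dtype=float)
    lls = [log_likelihood(theta, x, k)]
    g_hist, f_hist = [], []                      # EM outputs G(theta) and residuals G(theta) - theta
    for _ in range(max_iter):
        g = em_step(theta, x, k)
        f = g - theta
        if np.linalg.norm(f) < tol:
            break
        g_hist = (g_hist + [g])[-(window + 1):]
        f_hist = (f_hist + [f])[-(window + 1):]
        new_theta, new_ll = g, log_likelihood(g, x, k)   # default: plain EM step
        if len(f_hist) > 1:
            dF = np.diff(np.array(f_hist), axis=0).T     # residual differences, shape (n_params, m)
            dG = np.diff(np.array(g_hist), axis=0).T
            gamma = np.linalg.lstsq(dF, f, rcond=None)[0]
            # The candidate is an affine combination of past EM outputs, so the
            # weights still sum to one, but positivity is not guaranteed.
            cand = g - dG @ gamma
            if is_valid(cand, k):
                cand_ll = log_likelihood(cand, x, k)
                if cand_ll >= new_ll:            # simplified safeguard (assumption, see lead-in)
                    new_theta, new_ll = cand, cand_ll
        theta = new_theta
        lls.append(new_ll)
    return theta, lls

# Synthetic two-component data set (illustrative values only).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
theta, lls = aa_em(np.array([0.5, 0.5, -1.0, 1.0, 1.0, 1.0]), x, k=2)
print("weights, means, variances:", theta.round(3))
print("iterations:", len(lls) - 1)
```

Because every accepted iterate has a log-likelihood at least as high as the plain EM step from the same point, the returned `lls` sequence is non-decreasing up to rounding, which is the property the monotonicity control is meant to protect. Rejecting invalid candidates outright is the crudest way to keep weights and variances positive; the abstract indicates that the paper uses more refined strategies.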

research
12/25/2013

Robust EM algorithm for model-based curve clustering

Model-based clustering approaches concern the paradigm of exploratory da...
research
09/24/2014

Unsupervised learning of regression mixture models with unknown number of components

Regression mixture models are widely studied in statistics, machine lear...
research
06/27/2012

Convergence of the EM Algorithm for Gaussian Mixtures with Unbalanced Mixing Coefficients

The speed of convergence of the Expectation Maximization (EM) algorithm ...
research
09/29/2022

Likelihood adjusted semidefinite programs for clustering heterogeneous data

Clustering is a widely deployed unsupervised learning tool. Model-based ...
research
10/01/2018

Accelerated Training of Large-Scale Gaussian Mixtures by a Merger of Sublinear Approaches

We combine two recent lines of research on sublinear clustering to signi...
research
05/31/2022

Improvements to Supervised EM Learning of Shared Kernel Models by Feature Space Partitioning

Expectation maximisation (EM) is usually thought of as an unsupervised l...
research
07/28/2017

An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation

Modelling of multivariate densities is a core component in many signal p...
