
Gradient-based training of Gaussian Mixture Models in High-Dimensional Spaces

by Alexander Gepperth, et al. (Hochschule Fulda)

We present an approach for efficiently training Gaussian Mixture Models (GMMs) with Stochastic Gradient Descent (SGD) on large amounts of high-dimensional data (e.g., images). In such a scenario, SGD is strongly superior in terms of execution time and memory usage, although it is conceptually more complex than the traditional Expectation-Maximization (EM) algorithm. To enable SGD training, we propose three novel ideas: First, we show that minimizing an upper bound to the GMM negative log-likelihood instead of the full one is feasible and numerically much more stable in high-dimensional spaces. Second, we propose a new annealing procedure that prevents SGD from converging to pathological local minima. Third, we propose an SGD-compatible simplification of the full GMM model based on local principal directions, which avoids excessive memory use in high-dimensional spaces due to the quadratic growth of covariance matrices. Experiments on several standard image datasets demonstrate the validity of our approach, and we provide a publicly available TensorFlow implementation.
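The first idea can be illustrated concretely. Since log&#8721;<sub>k</sub> a<sub>k</sub> &#8805; max<sub>k</sub> log a<sub>k</sub>, replacing the log-sum-exp over components by the single best (winning) component gives an upper bound on the negative log-likelihood that avoids the numerically fragile summation of vanishingly small densities in high dimensions. The sketch below is a minimal illustration of this max-component bound for a diagonal-covariance GMM in plain NumPy, not the authors' TensorFlow implementation; all function names and the means-only SGD update are our own simplifying assumptions.

```python
import numpy as np

def log_component_densities(X, mu, log_var):
    # Per-sample, per-component log density of a diagonal Gaussian.
    # X: (N, D), mu: (K, D), log_var: (K, D)  ->  (N, K)
    var = np.exp(log_var)
    diff = X[:, None, :] - mu[None, :, :]          # (N, K, D)
    return -0.5 * np.sum(diff ** 2 / var + log_var + np.log(2 * np.pi), axis=2)

def max_approx_nll(X, pi, mu, log_var):
    # Upper bound on the negative log-likelihood:
    #   -log sum_k pi_k N_k(x)  <=  -max_k [log pi_k + log N_k(x)]
    lp = np.log(pi)[None, :] + log_component_densities(X, mu, log_var)
    return -np.mean(np.max(lp, axis=1))

def sgd_step_means(x, pi, mu, log_var, lr=0.05):
    # One SGD step on the max-approximated loss for a single sample x.
    # Only the winning component's mean receives a gradient; for a
    # diagonal Gaussian, d(-log N_k)/d mu_k = (mu_k - x) / var_k.
    lp = np.log(pi) + log_component_densities(x[None, :], mu, log_var)[0]
    k = int(np.argmax(lp))
    mu[k] -= lr * (mu[k] - x) / np.exp(log_var[k])
    return k
```

A short usage example: initialize K means near zero, stream samples through `sgd_step_means`, and the max-approximated loss decreases as the means move toward the data clusters. The full method additionally anneals the effective smoothness of the max and adapts weights and (reduced) covariances, which this sketch omits.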



