AutoGMM: Automatic Gaussian Mixture Modeling in Python

09/06/2019
by Thomas L. Athey, et al.

Gaussian mixture modeling is a fundamental tool in clustering, as well as in discriminant analysis and semiparametric density estimation. However, estimating the optimal model for any given number of components is an NP-hard problem, and estimating the number of components is in some respects an even harder problem. In R, a popular package called mclust addresses both of these problems, but Python has lacked such a package. We therefore introduce AutoGMM, a Python algorithm for automatic Gaussian mixture modeling. AutoGMM builds upon scikit-learn's AgglomerativeClustering and GaussianMixture classes, with certain modifications to make the results more stable. Empirically, on several different applications, AutoGMM performs approximately as well as mclust. This algorithm is freely available and therefore further narrows the gap between the functionality of R and Python for data science.
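To make the idea of combining the two scikit-learn classes concrete, here is a minimal sketch. It is not AutoGMM's actual implementation or API. The helper name `select_gmm`, the use of BIC as the selection criterion, the candidate covariance types, and the synthetic data are all assumptions made for illustration; the abstract does not specify them. The sketch initializes EM from agglomerative cluster means and keeps the candidate model with the lowest BIC.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.mixture import GaussianMixture


def select_gmm(X, max_components=5,
               covariance_types=("full", "tied", "diag", "spherical"),
               seed=0):
    """Hypothetical helper: return the fitted GMM with the lowest BIC.

    Assumption: BIC is used to compare models; this is an illustrative
    choice, not something stated in the abstract.
    """
    best_model, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        if k == 1:
            # A single cluster needs no agglomeration step.
            labels = np.zeros(len(X), dtype=int)
        else:
            # Hard labels from agglomerative clustering (Ward linkage by default).
            labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)
        # Use the per-cluster means as the starting point for EM.
        means_init = np.vstack([X[labels == j].mean(axis=0) for j in range(k)])
        for cov_type in covariance_types:
            gmm = GaussianMixture(
                n_components=k,
                covariance_type=cov_type,
                means_init=means_init,
                random_state=seed,
            ).fit(X)
            bic = gmm.bic(X)  # lower is better
            if bic < best_bic:
                best_model, best_bic = gmm, bic
    return best_model, best_bic


# Synthetic demo data (an assumption for illustration): three separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(100, 2)) for c in centers])

model, bic = select_gmm(X)
print(model.n_components, model.covariance_type, bic)
```

Initializing from agglomerative clusters rather than from random starts makes repeated runs on the same data deterministic, which is one plausible way to read the abstract's emphasis on stability.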

