Robust EM algorithm for model-based curve clustering

12/25/2013
by Faicel Chamroukhi, et al.

Model-based clustering is an exploratory data analysis paradigm that relies on finite mixture models to automatically find the latent structure governing observed data. It is one of the most popular and successful approaches in cluster analysis. Mixture density estimation is generally performed by maximizing the observed-data log-likelihood with the expectation-maximization (EM) algorithm. However, it is well known that the initialization of the EM algorithm is crucial. In addition, the standard EM algorithm requires the number of clusters to be known a priori. Some solutions have been provided in [31, 12] for model-based clustering of multivariate data with Gaussian mixture models. In this paper we focus on model-based curve clustering, where the data are curves rather than vectors, based on regression mixtures. We propose a new robust EM algorithm for clustering curves. We extend the model-based clustering approach presented in [31] for Gaussian mixture models to curve clustering by regression mixtures, including polynomial regression mixtures as well as spline and B-spline regression mixtures. Our approach handles both the initialization problem and the choice of the optimal number of clusters as the EM learning proceeds, rather than in a two-stage scheme. This is achieved by optimizing a penalized log-likelihood criterion. A simulation study confirms the potential benefit of the proposed algorithm in terms of robustness to initialization and finding the actual number of clusters.
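To make the setting concrete, the following is a minimal sketch of the standard (non-robust) EM algorithm for a polynomial regression mixture, the baseline that the abstract says the paper extends. It is not the authors' algorithm: it does not implement the penalized criterion or the automatic selection of the number of clusters, and the number of clusters `K` must be supplied. The function name, the shared sampling grid `x`, the one-curve-per-cluster initialization, and all parameter defaults are assumptions made for illustration.

```python
import numpy as np

def em_polynomial_regression_mixture(x, Y, K, degree=3, n_iter=50, seed=0):
    """Plain EM for a mixture of K polynomial regressions (illustrative sketch).

    x: (m,) sampling points shared by all curves (an assumption of this sketch).
    Y: (n, m) array with one curve per row.
    Returns proportions, coefficients, variances, responsibilities,
    and the log-likelihood evaluated at each E-step.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    X = np.vander(x, degree + 1, increasing=True)   # (m, degree+1) design matrix
    P = np.linalg.pinv(X)                           # least-squares solver, (degree+1, m)

    # Hypothetical initialization: fit each cluster to one randomly chosen curve.
    idx = rng.choice(n, size=K, replace=False)
    beta = Y[idx] @ P.T                             # (K, degree+1)
    sigma2 = np.full(K, Y.var())
    pi = np.full(K, 1.0 / K)
    loglik = []

    for _ in range(n_iter):
        # E-step: log pi_k + sum_j log N(y_ij; x_j^T beta_k, sigma2_k)
        resid = Y[:, None, :] - (beta @ X.T)[None, :, :]      # (n, K, m)
        sq = (resid ** 2).sum(axis=2)                         # (n, K)
        log_joint = (np.log(pi)
                     - 0.5 * m * np.log(2.0 * np.pi * sigma2)
                     - 0.5 * sq / sigma2)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)     # log-sum-exp per curve
        tau = np.exp(log_joint - log_norm[:, None])           # responsibilities
        loglik.append(log_norm.sum())

        # M-step: closed-form updates (small constants guard against empty clusters).
        nk = tau.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        mean_curve = (tau.T @ Y) / nk[:, None]                # weighted mean curve, (K, m)
        beta = mean_curve @ P.T                               # weighted least squares
        resid = Y[:, None, :] - (beta @ X.T)[None, :, :]
        sigma2 = (tau * (resid ** 2).sum(axis=2)).sum(axis=0) / (m * nk)
        sigma2 = np.maximum(sigma2, 1e-10)

    return pi, beta, sigma2, tau, np.array(loglik)
```

Because all curves share one design matrix in this sketch, the weighted least-squares update reduces to fitting the responsibility-weighted mean curve of each cluster. Different `seed` values can lead to different solutions, which illustrates the initialization sensitivity the abstract refers to.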

