
Estimating the Number of Components in Finite Mixture Models via the Group-Sort-Fuse Procedure

by Tudor Manole, et al.

Estimation of the number of components (or order) of a finite mixture model is a long-standing and challenging problem in statistics. We propose the Group-Sort-Fuse (GSF) procedure, a new penalized likelihood approach for simultaneous estimation of the order and mixing measure in multidimensional finite mixture models. Unlike methods that fit and compare mixtures of varying orders using criteria involving model complexity, our approach directly penalizes a continuous function of the model parameters. More specifically, given a conservative upper bound on the order, the GSF groups and sorts mixture component parameters to fuse those that are redundant. For a wide range of finite mixture models, we show that the GSF is consistent in estimating the true mixture order and achieves the n^{-1/2} convergence rate for parameter estimation, up to polylogarithmic factors. The GSF is implemented for several univariate and multivariate mixture models in the R package GroupSortFuse. Its finite sample performance is supported by a thorough simulation study, and its application is illustrated on two real data examples.
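The sort-and-fuse idea in the abstract can be illustrated with a toy sketch. This is not the GSF procedure itself, which penalizes a likelihood, and it does not use the GroupSortFuse package. It assumes univariate component means that have already been estimated under an overfitted upper bound on the order, and it merges sorted neighbours whose gap falls below a threshold. The function name `fuse_sorted_components` and the threshold `tol` are hypothetical, introduced only for this illustration.

```python
def fuse_sorted_components(means, weights, tol):
    """Toy illustration: sort component means, then merge close neighbours.

    Assumes univariate means and strictly positive weights. This is a
    hard-threshold stand-in for the fusion step, not a penalized likelihood.
    """
    if len(means) != len(weights):
        raise ValueError("means and weights must have the same length")

    # Sort component indices by their mean so redundant components are adjacent.
    order = sorted(range(len(means)), key=lambda i: means[i])

    fused = []  # each entry is [mean, weight] of one fused group
    for i in order:
        m, w = means[i], weights[i]
        if fused and m - fused[-1][0] <= tol:
            # Merge into the previous group using a weight-averaged mean.
            prev_m, prev_w = fused[-1]
            total = prev_w + w
            fused[-1] = [(prev_m * prev_w + m * w) / total, total]
        else:
            fused.append([m, w])
    return fused


# Five fitted components (upper bound on the order) around two distinct locations.
means = [0.0, 3.0, 0.1, 3.05, 0.05]
weights = [0.2, 0.25, 0.2, 0.25, 0.1]
groups = fuse_sorted_components(means, weights, tol=0.5)
estimated_order = len(groups)
```

Here the number of groups that survive fusion plays the role of the estimated order. In the procedure described above, fusion is instead driven by a penalty on a continuous function of the parameters rather than by a fixed cutoff.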




Evidence estimation in finite and infinite mixture models and applications

Estimating the model evidence - or marginal likelihood of the data - is...

On Coarse Graining of Information and Its Application to Pattern Recognition

We propose a method based on finite mixture models for classifying a set...

Estimating finite mixtures of semi-Markov chains: an application to the segmentation of temporal sensory data

In food science, it is of great interest to get information about the te...

Order selection with confidence for finite mixture models

The determination of the number of mixture components (the order) of a f...

Multivariate normal mixture modeling, clustering and classification with the rebmix package

The rebmix package provides R functions for random univariate and multiv...

Exact fit of simple finite mixture models

How to forecast next year's portfolio-wide credit default rate based on ...

Penalized Component Hub Models

Social network analysis presupposes that observed social behavior is inf...