Addressing overfitting in spectral clustering via a non-parametric bootstrap

09/13/2022
by Liam Welsh, et al.

Finite mixture modelling is a popular clustering method, valued largely for its soft cluster membership probabilities. However, the EM algorithm, the most common algorithm for fitting finite mixture models, suffers from several issues, including convergence to solutions corresponding to local maxima and slow run times in high-dimensional cases. We address these issues by developing two novel algorithms that incorporate a spectral decomposition of the data matrix and a non-parametric bootstrap sampling scheme. Simulations show the validity of our algorithms and demonstrate both their flexibility and their ability to avoid solutions corresponding to local maxima when compared with other (bootstrapped) clustering algorithms for estimating finite mixture models. Our algorithms typically converge more consistently and run significantly faster than other bootstrapped algorithms that fit finite mixture models.
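
To make the general idea concrete, the sketch below combines non-parametric bootstrap resampling with EM for a one-dimensional Gaussian mixture. It is not a reproduction of the authors' two algorithms: the spectral decomposition step is omitted entirely, and the function names (`em_fit`, `bootstrap_em`), the settings (`n_boot`, `n_iter`), and the synthetic data are all assumptions made for illustration. Each bootstrap resample gives EM a different starting point, and the fit that scores the highest log-likelihood on the original data is kept, which is one simple way to reduce the chance of stopping at a poor local maximum.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def em_fit(data, k, rng, n_iter=50):
    """Plain EM for a k-component 1-D Gaussian mixture (illustrative only)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    weights = [1.0 / k] * k
    means = rng.sample(data, k)      # initial means: random data points
    variances = [grand_var] * k      # initial variances: pooled variance
    for _ in range(n_iter):
        # E-step: soft membership probabilities for every observation.
        resp = []
        for x in data:
            row = [w * normal_pdf(x, m, v)
                   for w, m, v in zip(weights, means, variances)]
            s = sum(row) or 1e-300
            resp.append([r / s for r in row])
        # M-step: update the parameters from the soft memberships.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data))
            variances[j] = max(spread / nj, 1e-6)  # floor avoids collapse
    return weights, means, variances


def bootstrap_em(data, k, n_boot=20, seed=0):
    """Fit EM on bootstrap resamples; keep the fit that scores best on `data`."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_boot):
        # Non-parametric bootstrap: draw n observations with replacement.
        resample = [rng.choice(data) for _ in data]
        params = em_fit(resample, k, rng)
        score = log_likelihood(data, *params)  # evaluate on the original data
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Synthetic data (an assumption): two well-separated groups near 0 and 5.
demo_rng = random.Random(1)
data = ([demo_rng.gauss(0.0, 1.0) for _ in range(100)]
        + [demo_rng.gauss(5.0, 1.0) for _ in range(100)])
best_score, (weights, means, variances) = bootstrap_em(data, k=2)
print(sorted(means), weights)
```

Scoring every candidate on the original data rather than on its own resample keeps the candidates comparable; other selection or aggregation rules are possible, and the source abstract does not say which one the authors use.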

research
10/18/2021

Recovery Guarantees for Kernel-based Clustering under Non-parametric Mixture Models

Despite the ubiquity of kernel-based clustering, surprisingly few statis...
research
11/12/2017

Parameter Estimation in Finite Mixture Models by Regularized Optimal Transport: A Unified Framework for Hard and Soft Clustering

In this short paper, we formulate parameter estimation for finite mixtur...
research
09/19/2022

SMIXS: Novel efficient algorithm for non-parametric mixture regression-based clustering

We investigate a novel non-parametric regression-based clustering algori...
research
12/10/2021

Full Model Estimation for Non-Parametric Multivariate Finite Mixture Models

This paper addresses the problem of full model estimation for non-parame...
research
05/19/2018

Estimation of Non-Normalized Mixture Models and Clustering Using Deep Representation

We develop a general method for estimating a finite mixture of non-norma...
research
02/05/2023

Regularization and Global Optimization in Model-Based Clustering

Due to their conceptual simplicity, k-means algorithm variants have been...
research
05/16/2022

A bootstrap approach for validating the number of groups identified by latent class growth models

The use of longitudinal finite mixture models such as group-based trajec...
