Recovery Guarantees for Kernel-based Clustering under Non-parametric Mixture Models

10/18/2021
by Leena Chennuru Vankadara, et al.

Despite the ubiquity of kernel-based clustering, surprisingly few statistical guarantees exist beyond settings that consider strong structural assumptions on the data generation process. In this work, we take a step towards bridging this gap by studying the statistical performance of kernel-based clustering algorithms under non-parametric mixture models. We provide necessary and sufficient separability conditions under which these algorithms can consistently recover the underlying true clustering. Our analysis provides guarantees for kernel clustering approaches without structural assumptions on the form of the component distributions. Additionally, we establish a key equivalence between kernel-based data-clustering and kernel density-based clustering. This enables us to provide consistency guarantees for kernel-based estimators of non-parametric mixture models. Along with theoretical implications, this connection could have practical implications, including in the systematic choice of the bandwidth of the Gaussian kernel in the context of clustering.
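To make the setting concrete, here is a minimal sketch of kernel-based clustering with a Gaussian kernel. It is a textbook kernel k-means written for illustration, not the specific algorithms or estimators analysed in the paper. The function names (`gaussian_kernel`, `farthest_point_init`, `kernel_kmeans`), the farthest-point initialisation, and the default iteration cap are all assumptions of this sketch. The `bandwidth` argument is the Gaussian kernel parameter whose choice the abstract refers to.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def farthest_point_init(K, n_clusters):
    """Deterministic start (an assumption of this sketch): greedily pick
    seeds that are far apart in feature space, then assign each point to
    its nearest seed."""
    diag = np.diag(K)
    seeds = [0]
    # Squared feature-space distance from each point to its nearest seed.
    d = diag + diag[0] - 2.0 * K[:, 0]
    for _ in range(1, n_clusters):
        s = int(np.argmax(d))
        seeds.append(s)
        d = np.minimum(d, diag + diag[s] - 2.0 * K[:, s])
    D = diag[:, None] + diag[seeds][None, :] - 2.0 * K[:, seeds]
    return D.argmin(axis=1)


def kernel_kmeans(K, n_clusters, n_iter=100):
    """Lloyd-style kernel k-means operating only on the Gram matrix K."""
    n = K.shape[0]
    diag = np.diag(K)
    labels = farthest_point_init(K, n_clusters)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2
            #   = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl, with j, l in c
            cross = K[:, mask].mean(axis=1)
            within = K[np.ix_(mask, mask)].mean()
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

A call such as `kernel_kmeans(gaussian_kernel(X, bandwidth=1.0), n_clusters=2)` returns one integer label per row of `X`. The sketch never forms feature vectors explicitly. Every distance to a cluster mean is computed from entries of the Gram matrix, which is why the kernel and its bandwidth fully determine the clustering.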

research
08/17/2017

Adaptive Clustering Using Kernel Density Estimators

We investigate statistical properties of a clustering algorithm that rec...
research
09/13/2022

Addressing overfitting in spectral clustering via a non-parametric bootstrap

Finite mixture modelling is a popular method in the field of clustering ...
research
02/12/2018

Identifiability of Nonparametric Mixture Models and Bayes Optimal Clustering

Motivated by problems in data clustering, we establish general condition...
research
09/16/2020

Clustering Data with Nonignorable Missingness using Semi-Parametric Mixture Models

We are concerned in clustering continuous data sets subject to nonignora...
research
09/19/2022

SMIXS: Novel efficient algorithm for non-parametric mixture regression-based clustering

We investigate a novel non-parametric regression-based clustering algori...
research
10/26/2017

Energy Clustering

Energy statistics was proposed by Székely in the 80's inspired by the Ne...
research
03/13/2020

When are Non-Parametric Methods Robust?

A growing body of research has shown that many classifiers are susceptib...
