Adaptive Noisy Clustering

06/10/2013
by Michael Chichignoud, et al.

The problem of adaptive noisy clustering is investigated. Given a set of noisy observations Z_i=X_i+ϵ_i, i=1,...,n, the goal is to design clusters associated with the law of the X_i's, which has an unknown density f with respect to the Lebesgue measure. Since we observe a corrupted sample, a direct approach such as the popular k-means is not suitable in this case. In this paper, we propose a noisy k-means minimization, which is based on the k-means loss function and a deconvolution estimator of the density f. In particular, this approach depends on the bandwidth involved in the deconvolution kernel. Fast rates of convergence for the excess risk are obtained for a particular choice of the bandwidth, which depends on the smoothness of the density f. We then turn to the main issue of the paper: the data-driven choice of the bandwidth. We state an adaptive upper bound for a new selection rule, called ERC (Empirical Risk Comparison). This selection rule is based on Lepski's principle, in which empirical risks associated with different bandwidths are compared. Finally, we illustrate that this adaptive rule can be used in many statistical problems of M-estimation where the empirical risk depends on a nuisance parameter.
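
To make the idea of comparing estimates across bandwidths more concrete, here is a minimal sketch of a generic Lepski-type selection rule for a scalar quantity. It is not the ERC rule of the paper, which compares empirical risks and is not specified in this abstract. The function name `lepski_select`, the callables `estimate` and `threshold`, and the toy bias and threshold functions in the usage lines are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def lepski_select(
    bandwidths: Sequence[float],
    estimate: Callable[[float], float],
    threshold: Callable[[float], float],
) -> float:
    """Generic Lepski-type rule (illustrative, not the paper's ERC).

    Returns the largest candidate bandwidth h such that, for every
    smaller candidate h2, |estimate(h) - estimate(h2)| <= threshold(h2).
    """
    if not bandwidths:
        raise ValueError("need at least one candidate bandwidth")

    grid = sorted(bandwidths)                   # ascending order
    values = {h: estimate(h) for h in grid}     # evaluate each estimate once

    # Scan from the largest bandwidth down; the smallest candidate always
    # passes because it has no smaller bandwidth to be compared with.
    for h in reversed(grid):
        if all(abs(values[h] - values[h2]) <= threshold(h2)
               for h2 in grid if h2 < h):
            return h
    return grid[0]  # not reached for a non-empty grid; kept for clarity


# Toy usage (assumed): the estimate drifts like h**2 as h grows (bias),
# and the allowed deviation shrinks like 0.1 / h (a stand-in for a
# stochastic-error bound at the smaller bandwidth).
selected = lepski_select(
    [0.1, 0.2, 0.4, 0.8],
    estimate=lambda h: h ** 2,
    threshold=lambda h: 0.1 / h,
)
print(selected)
```

The design choice here is the usual one for Lepski-type procedures: the rule never needs the unknown smoothness of f, only a computable threshold that stands in for the stochastic error at each smaller bandwidth.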


research
05/03/2013

Anisotropic oracle inequalities in noisy quantization

The effect of errors in variables in quantization is investigated. We pr...
research
08/16/2023

Multiplicative deconvolution under unknown error distribution

We consider a multiplicative deconvolution problem, in which the density...
research
08/15/2013

The algorithm of noisy k-means

In this note, we introduce a new algorithm to deal with finite dimension...
research
10/11/2017

Cytometry inference through adaptive atomic deconvolution

In this paper we consider a statistical estimation problem known as atom...
research
10/16/2018

Density Deconvolution with Small Berkson Errors

The present paper studies density deconvolution in the presence of small...
research
09/30/2014

Fully adaptive density-based clustering

The clusters of a distribution are often defined by the connected compon...
research
12/22/2017

A Bidirectional Adaptive Bandwidth Mean Shift Strategy for Clustering

The bandwidth of a kernel function is a crucial parameter in the mean sh...
