Selective inference for k-means clustering

03/29/2022
by Yiqun T. Chen, et al.

We consider the problem of testing for a difference in means between clusters of observations identified via k-means clustering. In this setting, classical hypothesis tests lead to an inflated Type I error rate, because the clusters being compared were estimated from the same data used for testing. To overcome this problem, we take a selective inference approach. We propose a finite-sample p-value that controls the selective Type I error for a test of the difference in means between a pair of clusters obtained using k-means clustering, and show that it can be efficiently computed. We apply our proposal in simulation, and to handwritten digit data and single-cell RNA-sequencing data.
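The inflation of the Type I error rate described above can be seen in a small simulation. The sketch below is an illustration of the problem only, not the selective p-value proposed in the paper. It assumes one-dimensional data drawn from a single N(0, 1) distribution (so there are no true clusters), a known variance of 1, and k = 2. The function names two_means_1d, naive_z_pvalue, and naive_rejection_rate are hypothetical and were chosen for this example.

```python
import math
import random


def two_means_1d(xs, iters=50):
    """Lloyd's algorithm for k = 2 on 1-D data; returns a 0/1 label per point."""
    # Assumption: start from the two extreme points so neither cluster is empty.
    c0, c1 = min(xs), max(xs)
    labels = []
    for _ in range(iters):
        new_labels = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in xs]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        g0 = [x for x, lab in zip(xs, labels) if lab == 0]
        g1 = [x for x, lab in zip(xs, labels) if lab == 1]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels


def naive_z_pvalue(xs, labels, sigma=1.0):
    """Two-sided z-test that treats the estimated clusters as if they were fixed in advance."""
    g0 = [x for x, lab in zip(xs, labels) if lab == 0]
    g1 = [x for x, lab in zip(xs, labels) if lab == 1]
    diff = sum(g0) / len(g0) - sum(g1) / len(g1)
    se = sigma * math.sqrt(1.0 / len(g0) + 1.0 / len(g1))
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(abs(diff) / se / math.sqrt(2.0))


def naive_rejection_rate(n=50, reps=200, alpha=0.05, seed=0):
    """Fraction of null data sets on which the naive test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # no true clusters
        labels = two_means_1d(xs)
        if naive_z_pvalue(xs, labels) < alpha:
            rejections += 1
    return rejections / reps


rate = naive_rejection_rate()
print(f"naive rejection rate under the null: {rate:.2f} (nominal level 0.05)")
```

A valid test would reject about 5% of the time here. The naive test rejects far more often, because k-means places the two groups on opposite sides of the data by construction, so their sample means differ even when every observation shares the same mean.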

