A geometric analysis of subspace clustering with outliers

12/19/2011
by Mahdi Soltanolkotabi, et al.

This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower-dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 2790-2797], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretical analysis and demonstrates the effectiveness of these methods.
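To make the idea behind SSC concrete, here is a minimal sketch, not the authors' implementation. SSC expresses each data point as a sparse combination of the other points and uses the coefficients as an affinity between points. As a simplifying assumption, the sketch swaps the equality-constrained ℓ1 program for a Lasso-style penalty solved by plain proximal gradient steps (ISTA). The function names, the penalty `lam`, the iteration count, and the toy data (two orthogonal coordinate subspaces) are all illustrative choices, not taken from the paper.

```python
import numpy as np


def sample_union_of_subspaces(rng, ambient_dim=8, sub_dim=3, n_per=10):
    """Toy data (an assumption for illustration): unit-norm points drawn
    from two orthogonal coordinate subspaces of R^ambient_dim.
    Points are stored as columns."""
    blocks = []
    for k in range(2):
        pts = np.zeros((ambient_dim, n_per))
        rows = slice(k * sub_dim, (k + 1) * sub_dim)
        pts[rows, :] = rng.standard_normal((sub_dim, n_per))
        blocks.append(pts / np.linalg.norm(pts, axis=0))
    return np.hstack(blocks)


def sparse_self_representation(X, lam=0.01, n_iter=500):
    """Approximately solve, for every column x_i of X,
        min_c 0.5 * ||x_i - X c||^2 + lam * ||c||_1  subject to c_i = 0,
    using proximal gradient descent (ISTA) on all columns at once.
    This Lasso form is a relaxation of the exact l1 program."""
    n = X.shape[1]
    gram = X.T @ X
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = gram @ C - gram               # gradient of 0.5 * ||X - X C||_F^2
        C = C - step * grad
        # Soft-thresholding is the proximal step for the l1 penalty.
        C = np.sign(C) * np.maximum(np.abs(C) - step * lam, 0.0)
        np.fill_diagonal(C, 0.0)             # a point may not represent itself
    return C


def affinity_from_coefficients(C):
    """Symmetric affinity matrix built from the coefficient magnitudes."""
    return np.abs(C) + np.abs(C).T
```

On this toy data, calling `affinity_from_coefficients(sparse_self_representation(X))` should produce an affinity matrix whose nonzero entries connect only points from the same subspace. A full pipeline would then pass the affinity matrix to a spectral clustering step to obtain cluster labels, which this sketch leaves out.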


research
09/05/2013

Noisy Sparse Subspace Clustering

This paper considers the problem of subspace clustering under noise. Spe...
research
01/11/2013

Robust subspace clustering

Subspace clustering refers to the task of finding a multi-subspace repre...
research
03/15/2013

Subspace Clustering via Thresholding and Spectral Clustering

We consider the problem of clustering a set of high-dimensional data poi...
research
07/18/2013

Robust Subspace Clustering via Thresholding

The problem of clustering noisy and incompletely observed high-dimension...
research
08/28/2018

Probabilistic Sparse Subspace Clustering Using Delayed Association

Discovering and clustering subspaces in high-dimensional data is a funda...
research
01/01/2018

Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

Sparse Subspace Clustering (SSC) is a popular unsupervised machine learn...
research
02/09/2010

Probabilistic Recovery of Multiple Subspaces in Point Clouds by Geometric lp Minimization

We assume data independently sampled from a mixture distribution on the ...
