Self-Representation Based Unsupervised Exemplar Selection in a Union of Subspaces

06/07/2020
by Chong You, et al.

Finding a small set of representatives from an unlabeled dataset is a core problem in a broad range of applications such as dataset summarization and information extraction. Classical exemplar selection methods such as k-medoids assume that the data points are close to a few cluster centroids, and cannot handle the case where the data lie close to a union of subspaces. This paper proposes a new exemplar selection model that searches for a subset that best reconstructs all data points, as measured by the ℓ_1 norm of the representation coefficients. Geometrically, this subset best covers all the data points, as measured by the Minkowski functional of the subset. To solve our model efficiently, we introduce a farthest-first search algorithm that iteratively selects the worst-represented point as an exemplar. When the dataset is drawn from a union of independent subspaces, our method is able to select sufficiently many representatives from each subspace. We further develop an exemplar-based subspace clustering method that is robust to imbalanced data and efficient for large-scale data. Moreover, we show that a classifier trained on the selected exemplars (when they are labeled) can correctly classify the rest of the data points.
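The farthest-first idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a lasso-style representation cost, min_c ||c||_1 + (λ/2) ||x − D c||², which the abstract does not state explicitly, and it solves that problem approximately with plain ISTA (iterative soft-thresholding). The names `l1_cost`, `farthest_first_search`, `lam`, `n_iter`, and `first` are hypothetical and chosen only for illustration.

```python
import numpy as np


def l1_cost(x, D, lam=10.0, n_iter=200):
    """Approximate min_c ||c||_1 + (lam/2) * ||x - D c||^2 with ISTA.

    x: (d,) data point; D: (d, m) matrix whose columns are the current exemplars.
    The cost model itself is an assumption made for this sketch.
    """
    c = np.zeros(D.shape[1])
    # Lipschitz constant of the gradient of the smooth (quadratic) term.
    lipschitz = lam * np.linalg.norm(D, 2) ** 2
    if lipschitz > 0:
        step = 1.0 / lipschitz
        for _ in range(n_iter):
            grad = lam * D.T @ (D @ c - x)
            z = c - step * grad
            # Soft-thresholding: proximal operator of step * ||.||_1.
            c = np.sign(z) * np.maximum(np.abs(z) - step, 0.0)
    residual = x - D @ c
    return np.abs(c).sum() + 0.5 * lam * residual @ residual


def farthest_first_search(X, k, lam=10.0, first=0):
    """Greedily pick k exemplar indices from the columns of X (shape (d, n)).

    Starting from the column `first`, each round adds the point that is
    currently worst represented (highest cost) by the selected exemplars.
    """
    selected = [first]
    while len(selected) < k:
        D = X[:, selected]
        costs = np.array([l1_cost(X[:, j], D, lam) for j in range(X.shape[1])])
        costs[selected] = -np.inf  # never re-select an existing exemplar
        selected.append(int(np.argmax(costs)))
    return selected


if __name__ == "__main__":
    # Toy data: four unit-norm points on two orthogonal lines in R^3.
    X = np.array([[1.0, -1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, -1.0],
                  [0.0, 0.0, 0.0, 0.0]])
    print(farthest_first_search(X, k=2))
```

In this toy case the first exemplar lies on one line, so the points on the other line cannot be reconstructed at all and one of them is chosen next. Recomputing every cost in every round keeps the sketch short; it is not meant to reflect how an efficient solver for large-scale data would be organized.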

research
12/23/2014

Approximate Subspace-Sparse Recovery with Corrupted Data via Constrained ℓ_1-Minimization

High-dimensional data often lie in low-dimensional subspaces correspondi...
research
07/25/2019

Theory of Spectral Method for Union of Subspaces-Based Random Geometry Graph

Spectral Method is a commonly used scheme to cluster data points lying c...
research
10/15/2015

Group-Invariant Subspace Clustering

In this paper we consider the problem of group invariant subspace cluste...
research
04/09/2020

Learnable Subspace Clustering

This paper studies the large-scale subspace clustering (LSSC) problem wi...
research
01/27/2022

Reduction of Two-Dimensional Data for Speeding Up Convex Hull Computation

An incremental approach for computation of convex hull for data points i...
research
04/25/2018

RULLS: Randomized Union of Locally Linear Subspaces for Feature Engineering

Feature engineering plays an important role in the success of a machine ...
research
05/07/2014

Representative Selection for Big Data via Sparse Graph and Geodesic Grassmann Manifold Distance

This paper addresses the problem of identifying a very small subset of d...
