A Unified Framework for Representation-based Subspace Clustering of Out-of-sample and Large-scale Data

09/25/2013
by Xi Peng, et al.

Under the framework of spectral clustering, the key to subspace clustering is building a similarity graph that describes the neighborhood relations among data points. Some recent works build the graph using sparse, low-rank, and ℓ_2-norm-based representations, and have achieved state-of-the-art performance. However, these methods suffer from two limitations. First, their time complexity is at least proportional to the cube of the data size, which makes them inefficient for large-scale problems. Second, they cannot cope with out-of-sample data, that is, data not used to construct the similarity graph. To cluster each out-of-sample datum, these methods have to recalculate the similarity graph and the cluster membership of the whole data set. In this paper, we propose a unified framework that makes representation-based subspace clustering algorithms feasible for clustering both out-of-sample and large-scale data. Under our framework, the large-scale problem is tackled by converting it into an out-of-sample problem in the manner of "sampling, clustering, coding, and classifying". Furthermore, we give an estimate of the error bounds by treating each subspace as a point in a hyperspace. Extensive experimental results on various benchmark data sets show that our methods outperform several recently proposed scalable methods in clustering large-scale data sets.
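The "coding and classifying" half of the pipeline can be illustrated with a minimal sketch. The abstract does not state the exact coding or classification rule, so this example assumes an ℓ_2-norm (ridge regression) code over the in-sample points and assigns each out-of-sample point to the cluster whose in-sample points reconstruct it with the smallest residual. The function name `classify_out_of_sample`, the parameter `lam`, and the column-wise data layout are hypothetical choices for illustration, and the in-sample cluster labels are taken as given (for example, from any spectral clustering method run on a small sampled subset).

```python
import numpy as np

def classify_out_of_sample(X_in, labels_in, X_out, lam=0.1):
    """Assign out-of-sample points to clusters already found on in-sample data.

    X_in      : (d, n) array, in-sample points as columns (the sampled subset).
    labels_in : (n,) array, cluster label of each in-sample point (assumed given).
    X_out     : (d, m) array, out-of-sample points as columns.
    lam       : ridge regularization weight (an assumed, illustrative value).
    """
    n = X_in.shape[1]

    # Coding (assumption): ridge regression of every out-of-sample point
    # on all in-sample points, C = (X^T X + lam * I)^{-1} X^T Y.
    gram = X_in.T @ X_in + lam * np.eye(n)
    C = np.linalg.solve(gram, X_in.T @ X_out)  # shape (n, m)

    # Classifying (assumption): reconstruct each point using only the
    # coefficients of one cluster and pick the cluster with least residual.
    clusters = np.unique(labels_in)
    residuals = np.empty((len(clusters), X_out.shape[1]))
    for k, c in enumerate(clusters):
        mask = labels_in == c
        recon = X_in[:, mask] @ C[mask, :]
        residuals[k] = np.linalg.norm(X_out - recon, axis=0)

    return clusters[np.argmin(residuals, axis=0)]
```

In this sketch the cubic-cost clustering step only ever sees the n sampled points, while each of the remaining m points costs one linear solve against the same n-by-n system, which is the sense in which a large-scale problem can be handled as an out-of-sample one.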

research
10/20/2019

Sparse-Dense Subspace Clustering

Subspace clustering refers to the problem of clustering high-dimensional...
research
09/28/2016

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of...
research
04/24/2013

Locally linear representation for image clustering

It is a key to construct a similarity graph in graph-oriented subspace l...
research
07/02/2017

Classification non supervisée des données hétérogènes à large échelle

When it comes to cluster massive data, response time, disk access and qu...
research
06/08/2021

Weighted Sparse Subspace Representation: A Unified Framework for Subspace Clustering, Constrained Clustering, and Active Learning

Spectral-based subspace clustering methods have proved successful in man...
research
02/14/2020

Clustering based on Point-Set Kernel

Measuring similarity between two objects is the core operation in existi...
research
09/05/2012

Constructing the L2-Graph for Robust Subspace Learning and Subspace Clustering

Under the framework of graph-based learning, the key to robust subspace ...
