Deep Continuous Clustering

03/05/2018
by Sohil Atul Shah, et al.

Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The data is embedded into a lower-dimensional space by a deep autoencoder. The autoencoder is optimized as part of the clustering process. The resulting network produces clustered data. The presented approach does not rely on prior knowledge of the number of ground-truth clusters. Joint nonlinear dimensionality reduction and clustering are formulated as optimization of a global continuous objective. We thus avoid discrete reconfigurations of the objective that characterize prior clustering algorithms. Experiments on datasets from multiple domains demonstrate that the presented algorithm outperforms state-of-the-art clustering schemes, including recent methods that use deep networks.
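The idea of a single continuous objective can be illustrated with a small sketch. The abstract does not spell out the objective, so everything below is an assumption made for illustration: a linear encoder and decoder stand in for the deep autoencoder, each embedded point gets a hypothetical "representative" `Z[i]`, and a Geman-McClure-style robust penalty pulls representatives of connected points together. The names `joint_objective`, `geman_mcclure`, `lam`, and `mu` are hypothetical and not taken from the paper.

```python
import numpy as np

def geman_mcclure(sq_dist, mu):
    """Robust penalty mu*d/(mu+d) on squared distances d; saturates below mu."""
    return mu * sq_dist / (mu + sq_dist)

def joint_objective(X, W_enc, W_dec, Z, edges, lam=1.0, mu=1.0):
    """Illustrative joint loss: reconstruction + data term + robust pairwise term.

    X      : (n, D) input data
    W_enc  : (D, d) linear encoder weights (stand-in for a deep encoder)
    W_dec  : (d, D) linear decoder weights (stand-in for a deep decoder)
    Z      : (n, d) one representative per embedded point
    edges  : (m, 2) integer index pairs of points assumed to be neighbors
    """
    Y = X @ W_enc                      # low-dimensional embedding
    X_hat = Y @ W_dec                  # reconstruction from the embedding
    recon = np.sum((X - X_hat) ** 2)   # autoencoder reconstruction error
    data = np.sum((Y - Z) ** 2)        # keep representatives near embeddings
    i, j = edges[:, 0], edges[:, 1]
    pair_sq = np.sum((Z[i] - Z[j]) ** 2, axis=1)
    pairwise = np.sum(geman_mcclure(pair_sq, mu))  # pull neighbors together
    return recon + data + lam * pairwise
```

Because every term is a smooth function of the weights and the representatives, the whole expression can in principle be minimized by gradient-based optimization, with no discrete reassignment step and no preset number of clusters. In this sketch, clusters would be read off afterwards by grouping points whose representatives have collapsed together.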


