DeepDPM: Deep Clustering With an Unknown Number of Clusters

03/27/2022
by Meitar Ronen, et al.

Deep Learning (DL) has shown great promise in the unsupervised task of clustering. However, while the benefits of the nonparametric approach are well known in classical (i.e., non-deep) clustering, most deep-clustering methods are parametric: they require a predefined and fixed number of clusters, denoted by K. When K is unknown, using model-selection criteria to choose its optimal value can become computationally expensive, especially in DL, where the training process would have to be repeated many times. In this work, we bridge this gap by introducing an effective deep-clustering method that does not require knowing the value of K, as it infers it during learning. Using a split/merge framework, a dynamic architecture that adapts to the changing K, and a novel loss, our proposed method outperforms existing nonparametric methods (both classical and deep ones). While the very few existing deep nonparametric methods lack scalability, we demonstrate the scalability of ours by being the first to report the performance of such a method on ImageNet. We also demonstrate the importance of inferring K by showing how the performance of methods that fix K deteriorates as their assumed K value moves further from the ground-truth one, especially on imbalanced datasets. Our code is available at https://github.com/BGU-CS-VIL/DeepDPM.
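
To make the cost of the model-selection baseline concrete, here is a minimal sketch of choosing K by refitting a model once per candidate value and comparing a criterion. It is not the DeepDPM method: it assumes a toy 1-D Gaussian mixture fitted with EM and uses BIC as the criterion, and the names `fit_gmm_1d` and `select_k_by_bic` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def fit_gmm_1d(xs, k, iters=50, var_floor=1e-2):
    """Fit a 1-D Gaussian mixture with EM; return the log-likelihood
    computed in the final E-step."""
    n = len(xs)
    srt = sorted(xs)
    # Deterministic initialization: means at evenly spaced quantiles.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all, var_floor)] * k
    weights = [1.0 / k] * k
    log_lik = -math.inf
    for _ in range(iters):
        # E-step: responsibilities, using log-sum-exp for stability.
        resp = []
        log_lik = 0.0
        for x in xs:
            logs = [
                math.log(weights[j])
                - 0.5 * math.log(2 * math.pi * variances[j])
                - 0.5 * (x - means[j]) ** 2 / variances[j]
                for j in range(k)
            ]
            m = max(logs)
            lse = m + math.log(sum(math.exp(v - m) for v in logs))
            log_lik += lse
            resp.append([math.exp(v - lse) for v in logs])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
            weights[j] = nj / n
    return log_lik


def select_k_by_bic(xs, candidates):
    """Refit the model once per candidate K and keep the lowest BIC."""
    scores = {}
    for k in candidates:
        log_lik = fit_gmm_1d(xs, k)  # one full training run per K
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        scores[k] = -2.0 * log_lik + n_params * math.log(len(xs))
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data (an assumption for illustration): three separated 1-D clusters.
rng = random.Random(0)
data = [rng.gauss(mu, 1.0) for mu in (0.0, 10.0, 20.0) for _ in range(100)]
best_k, scores = select_k_by_bic(data, candidates=range(1, 7))
print("selected K:", best_k)
```

Each candidate K costs one complete fit. That is cheap for this toy model, but the same loop wrapped around a deep network means one full training run per candidate, which is the expense the abstract points to.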

