CAC: A Clustering Based Framework for Classification

02/23/2021
by shivin-srivastava, et al.

In data containing heterogeneous subpopulations, classification performance benefits from incorporating knowledge of the cluster structure into the classifier. Previous methods for combined clustering and classification are either classifier-specific, and therefore not generic, or perform clustering and classifier training independently, so the resulting clusters are not necessarily ones that benefit classifier performance. The question of how to cluster so as to improve the performance of classifiers trained on the clusters has received scant attention in the literature, despite its importance in several real-world applications. In this paper, we theoretically analyze when and how clustering may help in obtaining accurate classifiers. We design a simple, efficient, and generic framework called Classification Aware Clustering (CAC) that finds clusters well suited for use as training datasets by classifiers for each underlying subpopulation. Our experiments on synthetic and real benchmark datasets demonstrate the efficacy of CAC over previous methods for combined clustering and classification.
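The second family of previous methods mentioned in the abstract, which clusters first and then trains one classifier per cluster independently, can be sketched as follows. This is a minimal illustration of that baseline only, not of CAC, whose algorithm is not described in the abstract. Everything in the sketch is an assumption made for illustration: the plain k-means with farthest-point initialisation, the nearest-class-mean classifier, the function names (`kmeans`, `fit_cluster_then_classify`, `predict`), and the toy data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))


def kmeans(X, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation
    (an illustrative choice, not taken from the paper)."""
    centroids = [X[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(max(X, key=lambda x: min(sq_dist(x, c) for c in centroids)))
    for _ in range(iters):
        assign = [nearest(x, centroids) for x in X]
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = mean(members)
    return centroids


class NearestClassMean:
    """Toy classifier: predict the class whose mean is closest to x."""

    def fit(self, X, y):
        self.means = {c: mean([x for x, t in zip(X, y) if t == c])
                      for c in sorted(set(y))}
        return self

    def predict(self, x):
        return min(self.means, key=lambda c: sq_dist(x, self.means[c]))


def fit_cluster_then_classify(X, y, k):
    """Step 1: cluster without looking at labels. Step 2: one classifier per cluster."""
    centroids = kmeans(X, k)
    assign = [nearest(x, centroids) for x in X]
    models = {}
    for j in set(assign):
        Xj = [x for x, a in zip(X, assign) if a == j]
        yj = [t for t, a in zip(y, assign) if a == j]
        models[j] = NearestClassMean().fit(Xj, yj)
    return centroids, models


def predict(x, centroids, models):
    """Route x to the nearest cluster that has a model, then classify there."""
    j = min(models, key=lambda j: sq_dist(x, centroids[j]))
    return models[j].predict(x)


# Toy data (hypothetical): two subpopulations whose label rule is reversed.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
     (10.0, 0.0), (10.0, 1.0), (11.0, 0.0), (11.0, 1.0)]
y = [0, 1, 0, 1, 1, 0, 1, 0]

centroids, models = fit_cluster_then_classify(X, y, k=2)
train_preds = [predict(x, centroids, models) for x in X]
```

Because the clustering step in this baseline never sees the labels, nothing guarantees that the clusters it finds are useful for the per-cluster classifiers; that gap is the one the abstract says CAC is designed to address.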


