EC3: Combining Clustering and Classification for Ensemble Learning

08/29/2017
by Tanmoy Chakraborty, et al.

Classification and clustering algorithms have each proven successful in different contexts, and each has its own advantages and limitations. For instance, although classification algorithms are more powerful than clustering methods at predicting the class labels of objects, they perform poorly when reliable, manually labeled data is scarce. Clustering algorithms, on the other hand, produce no label information for objects, but they provide supplementary constraints (e.g., if two objects are clustered together, they are more likely to share the same label) that can be leveraged to predict the labels of a set of unknown objects. Systematically using both types of algorithms together can therefore lead to better prediction performance. In this paper, we propose a novel algorithm, called EC3, that merges classification and clustering in order to support both binary and multi-class classification. EC3 is based on a principled combination of multiple classification and multiple clustering methods using an optimization function. We theoretically show the convexity and optimality of the problem and solve it with the block coordinate descent method. We additionally propose iEC3, a variant of EC3 that handles imbalanced training data. We perform an extensive experimental analysis, comparing EC3 and iEC3 with 14 baseline methods (7 well-known standalone classifiers, 5 ensemble classifiers, and 2 existing methods that merge classification and clustering) on 13 standard benchmark datasets. We show that our methods outperform the baselines on every dataset, achieving at most 10 [...] the best baseline), and are more resilient to noise and class imbalance than the best baseline method.
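The general idea of reconciling classifier outputs with clustering constraints through block coordinate descent can be illustrated with a small sketch. This is not the EC3 objective, which the abstract does not spell out; it assumes a simple, hypothetical quadratic objective in which each object's label distribution u_i is pulled toward the average classifier output y_i (weighted by an assumed parameter alpha) and toward the label distribution q_j of every cluster that contains it. The function name consensus_labels and all of its parameters are illustrative only.

```python
import numpy as np

def consensus_labels(clf_probs, cluster_ids, alpha=1.0, n_iter=50):
    """Illustrative consensus of classifiers and clusterings (not EC3 itself).

    Minimizes, by alternating closed-form updates,
        sum_{i,j} a_ij * ||u_i - q_j||^2 + alpha * sum_i ||u_i - y_i||^2
    where a_ij = 1 if object i belongs to cluster j, and y_i is the mean
    classifier probability vector for object i.

    clf_probs:   array (n_classifiers, n_objects, n_classes)
    cluster_ids: array (n_clusterings, n_objects) of cluster ids
    """
    clf_probs = np.asarray(clf_probs, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    y = clf_probs.mean(axis=0)  # averaged classifier output per object

    # One-hot membership matrix over the clusters of all clusterings.
    blocks = []
    for ids in cluster_ids:
        _, inv = np.unique(ids, return_inverse=True)
        inv = inv.ravel()
        blocks.append(np.eye(inv.max() + 1)[inv])
    A = np.hstack(blocks)  # shape (n_objects, total number of clusters)

    u = y.copy()
    for _ in range(n_iter):
        # Block 1: with u fixed, each cluster takes the mean of its members.
        q = (A.T @ u) / A.sum(axis=0)[:, None]
        # Block 2: with q fixed, each object blends its clusters with y_i.
        u = (A @ q + alpha * y) / (A.sum(axis=1)[:, None] + alpha)
    return u

# Toy data: 2 classifiers, 4 objects, 2 classes; object 1 is ambiguous.
clf_probs = [
    [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.4, 0.6]],
    [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.5, 0.5]],
]
# Two clusterings that both group objects {0, 1} and {2, 3}.
cluster_ids = [[0, 0, 1, 1], [5, 5, 7, 7]]

u = consensus_labels(clf_probs, cluster_ids)
print(u.argmax(axis=1))
```

In this toy input the classifiers are undecided about object 1, but both clusterings place it with object 0, so the consensus pulls it toward object 0's class. Because each block update is the exact minimizer of a convex quadratic in that block, the assumed objective never increases from one iteration to the next.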


