Appropriateness of Performance Indices for Imbalanced Data Classification: An Analysis

08/26/2020
by Sankha Subhra Mullick, et al.

Indices that quantify classifier performance under class imbalance often suffer from distortions that depend on the constitution of the test set or on the class-specific classification accuracy, which makes it difficult to assess the merit of a classifier. We identify two fundamental conditions that a performance index must satisfy to be resilient, respectively, to changes in the number of test instances from each class and to changes in the number of classes in the test set. In light of these conditions, we theoretically analyze, under the effect of class imbalance, four indices commonly used for evaluating binary classifiers and five popular indices for multi-class classifiers. For indices that violate either condition, we also suggest remedial modifications and normalizations. We further investigate the capability of the indices to retain information about classification performance over all the classes, even when the classifier exhibits extreme performance on some classes. Simulation studies are performed on high-dimensional deep representations of a subset of the ImageNet dataset using four state-of-the-art classifiers tailored for handling class imbalance. Finally, based on our theoretical findings and empirical evidence, we recommend the appropriate indices for evaluating the performance of classifiers in the presence of class imbalance.
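
To make the first condition concrete, the sketch below holds the class-specific accuracies of a hypothetical binary classifier fixed and changes only the number of positive test instances. The abstract does not name the indices the paper analyzes, so the three used here (accuracy, precision, balanced accuracy) are common choices picked for illustration, and the rates 0.9 and 0.7 are assumed values rather than results from the paper.

```python
# Illustrative sketch only: the indices, rates, and class sizes below are
# assumptions chosen for this example, not taken from the paper.

def confusion_counts(n_pos, n_neg, tpr, tnr):
    """Expected confusion-matrix counts for fixed class-specific accuracies."""
    tp = tpr * n_pos   # positives classified correctly
    fn = n_pos - tp    # positives missed
    tn = tnr * n_neg   # negatives classified correctly
    fp = n_neg - tn    # negatives flagged as positive
    return tp, fn, tn, fp

def accuracy(tp, fn, tn, fp):
    return (tp + tn) / (tp + fn + tn + fp)

def precision(tp, fn, tn, fp):
    return tp / (tp + fp)

def balanced_accuracy(tp, fn, tn, fp):
    # Mean of the two class-specific accuracies.
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def evaluate(n_pos, n_neg, tpr=0.9, tnr=0.7):
    counts = confusion_counts(n_pos, n_neg, tpr, tnr)
    return {
        "accuracy": accuracy(*counts),
        "precision": precision(*counts),
        "balanced_accuracy": balanced_accuracy(*counts),
    }

# Same classifier behaviour per class, different test-set composition.
balanced = evaluate(n_pos=1000, n_neg=1000)
imbalanced = evaluate(n_pos=100, n_neg=1000)

for name in balanced:
    print(f"{name}: {balanced[name]:.3f} (1:1) vs {imbalanced[name]:.3f} (1:10)")
```

In this sketch accuracy and precision both change when the positive class shrinks, although the classifier behaves identically on each class, whereas balanced accuracy depends only on the two class-specific rates and stays the same.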


