Deep Over-sampling Framework for Classifying Imbalanced Data

04/25/2017
by Shin Ando, et al.

Class imbalance is a challenging issue in practical classification problems, for deep learning models as well as traditional models. Countermeasures that have traditionally worked well, such as synthetic over-sampling, have had limited success with the complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework that extends the synthetic over-sampling method to exploit the deep feature space acquired by a convolutional neural network (CNN). Its key feature is explicit, supervised representation learning: the training data pairs each raw input sample with a synthetic embedding target in the deep feature space, sampled from the linear subspace of its in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings and thereby increases the discriminative power of the deep representation. We present an empirical study on public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in standard, balanced settings.
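The target-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names `in_class_neighbors` and `sample_target`, the neighborhood size `k`, the Euclidean distance, the exclusion of the sample itself from its neighborhood, and the use of random convex weights to pick a point in the neighbors' linear subspace. The CNN is left out entirely; the embeddings are plain lists of floats standing in for its deep features.

```python
import math
import random


def in_class_neighbors(embeddings, labels, index, k):
    """Indices of the k nearest embeddings that share the label of `index`.

    Assumption: Euclidean distance, and the sample itself is excluded.
    """
    same_class = [
        j for j, y in enumerate(labels)
        if y == labels[index] and j != index
    ]
    same_class.sort(key=lambda j: math.dist(embeddings[index], embeddings[j]))
    return same_class[:k]


def sample_target(embeddings, neighbor_ids, rng):
    """Sample a synthetic target as a random convex combination of neighbors.

    Assumption: the weights are drawn uniformly from (0, 1] and normalized
    to sum to 1, which keeps the target inside the neighbors' convex hull.
    """
    raw = [1.0 - rng.random() for _ in neighbor_ids]  # strictly positive
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(embeddings[neighbor_ids[0]])
    return [
        sum(w * embeddings[j][d] for w, j in zip(weights, neighbor_ids))
        for d in range(dim)
    ]


# Toy embeddings: three points of class 0 and two points of class 1.
rng = random.Random(0)
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = [0, 0, 0, 1, 1]

neighbor_ids = in_class_neighbors(embeddings, labels, index=0, k=2)
target = sample_target(embeddings, neighbor_ids, rng)
```

In an iterative scheme like the one the abstract outlines, such targets would be recomputed from fresh embeddings after each round of CNN training. Drawing a minority-class sample several times with different random weights yields several distinct (input, target) pairs, which is how the over-sampling would arise under these assumptions.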

research
10/30/2018

Weak-supervision for Deep Representation Learning under Class Imbalance

Class imbalance is a pervasive issue among classification models includi...
research
12/09/2020

Removing Class Imbalance using Polarity-GAN: An Uncertainty Sampling Approach

Class imbalance is a challenging issue in practical classification probl...
research
07/13/2022

Efficient Augmentation for Imbalanced Deep Learning

Deep learning models memorize training data, which hurts their ability t...
research
10/09/2020

Handling Imbalanced Data: A Case Study for Binary Class Problems

For several years till date, the major issues in terms of solving for cl...
research
09/01/2019

An Efficient Convolutional Neural Network for Coronary Heart Disease Prediction

This study proposes an efficient neural network with convolutional layer...
research
01/03/2021

Synthetic Embedding-based Data Generation Methods for Student Performance

Given the inherent class imbalance issue within student performance data...
research
12/03/2020

ReMix: Calibrated Resampling for Class Imbalance in Deep learning

Class imbalance is a problem of significant importance in applied deep l...
