Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO

10/23/2022
by Md Manjurul Ahsan, et al.

Class imbalance in a dataset is one of the major challenges that can significantly degrade the performance of machine learning (ML) models, resulting in biased predictions. Numerous techniques have been proposed to address class imbalance problems, including, but not limited to, oversampling, undersampling, and cost-sensitive approaches. Because they can generate synthetic data, oversampling techniques such as the Synthetic Minority Oversampling Technique (SMOTE) are among the methods most widely used by researchers. However, one potential disadvantage of SMOTE is that newly created minority samples may overlap with majority samples. As a result, ML models become more likely to be biased towards the majority classes. Recently, the generative adversarial network (GAN) has garnered much attention due to its ability to create nearly realistic samples. Despite this potential, however, a GAN is hard to train. This study proposes two novel techniques, GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG), to overcome the limitations of existing oversampling approaches. The preliminary computational results show that SSG and GBO performed better than the original SMOTE on eight expanded imbalanced benchmark datasets. The study also revealed that the minority samples generated by SSG follow a Gaussian distribution, which is often difficult to achieve using the original SMOTE.
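To make the SMOTE limitation mentioned in the abstract concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the SSG or GBO method proposed in the paper, and the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for illustration only. Each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbors; because the majority class is never consulted, nothing prevents a new point from landing in a region occupied by majority samples.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors (SMOTE-style).

    Hypothetical helper for illustration; not the paper's SSG/GBO.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment from base towards neighbor.
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy two-dimensional minority class (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=6)
```

Every generated point is a convex combination of two minority samples, so it stays within the minority class's bounding box; whether that box also contains majority samples is exactly what this kind of interpolation does not check.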

research
05/16/2023

BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions

Class imbalanced problems (CIP) are one of the potential challenges in d...
research
08/06/2021

SMOTified-GAN for class imbalanced pattern classification problems

Class imbalance in a dataset is a major problem for classifiers that res...
research
01/27/2022

FinGAN: Generative Adversarial Network for Analytical Customer Relationship Management in Banking and Insurance

Churn prediction in credit cards, fraud detection in insurance, and loan...
research
09/28/2020

Balancing thermal comfort datasets: We GAN, but should we?

Thermal comfort assessment for the built environment has become more ava...
research
03/22/2022

Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue

Background: Machine learning techniques have been widely used and demons...
research
11/25/2022

OOG- Optuna Optimized GAN Sampling Technique for Tabular Imbalanced Malware Data

Cyberspace occupies a large portion of people's life in the age of moder...
research
11/27/2017

Pulsar Candidate Identification with Artificial Intelligence Techniques

Discovering pulsars is a significant and meaningful research topic in th...
