AdaCC: Cumulative Cost-Sensitive Boosting for Imbalanced Classification

09/17/2022
by Vasileios Iosifidis, et al.

Class imbalance poses a major challenge for machine learning, as most supervised learning models tend to be biased towards the majority class and underperform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically through a fixed, user-defined misclassification cost matrix provided as input to the learner. Tuning these costs is challenging: it requires domain knowledge, and wrong adjustments can degrade overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that, instead of using a fixed misclassification cost matrix, dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance. Our method, called AdaCC, is parameter-free: it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, with consistent improvements across different measures, for instance, in the range of [0.3 for AUC, [3.4 [7.4
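
To make the idea of costs that follow the ensemble's cumulative behavior more concrete, here is a minimal Python sketch. It is not the AdaCC algorithm as published: the cost rule (1 plus the cumulative per-class error rate), the cost-scaled exponential weight update, and the names `cumulative_error_rates`, `dynamic_costs`, and `reweight` are all assumptions made for illustration. Labels are assumed to be +1 for the minority class and -1 for the majority class.

```python
import math


def cumulative_error_rates(y_true, ensemble_pred):
    """Return (FNR, FPR) of the ensemble built so far on the training set."""
    pos = [p for t, p in zip(y_true, ensemble_pred) if t == 1]
    neg = [p for t, p in zip(y_true, ensemble_pred) if t == -1]
    fnr = sum(p != 1 for p in pos) / len(pos) if pos else 0.0
    fpr = sum(p != -1 for p in neg) / len(neg) if neg else 0.0
    return fnr, fpr


def dynamic_costs(y_true, ensemble_pred):
    """Hypothetical rule: a class costs more the worse the ensemble does on it."""
    fnr, fpr = cumulative_error_rates(y_true, ensemble_pred)
    return {1: 1.0 + fnr, -1: 1.0 + fpr}


def reweight(weights, y_true, round_pred, costs, alpha):
    """Cost-scaled exponential update (an assumed form), then normalization.

    Misclassified instances (t * p == -1) gain weight, correctly classified
    ones lose weight, and the class cost scales the size of the change.
    """
    scaled = [w * math.exp(-alpha * costs[t] * t * p)
              for w, t, p in zip(weights, y_true, round_pred)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Toy data: 2 minority (+1) and 6 majority (-1) instances.
y_true = [1, 1, -1, -1, -1, -1, -1, -1]
# Predictions of the ensemble so far: one minority and one majority error.
ensemble_pred = [-1, 1, -1, -1, -1, -1, -1, 1]

costs = dynamic_costs(y_true, ensemble_pred)  # FNR = 1/2, FPR = 1/6
weights = [1.0 / len(y_true)] * len(y_true)
new_weights = reweight(weights, y_true, ensemble_pred, costs, alpha=0.5)
```

In this toy run the ensemble misses half of the minority instances but only one sixth of the majority ones, so the minority class receives the larger cost and its misclassified instance ends up with more weight than the misclassified majority instance. No cost matrix is supplied by the user at any point, which is the property the abstract refers to as parameter-free.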


