Tackling Multilabel Imbalance through Label Decoupling and Data Resampling Hybridization

02/14/2018
by Francisco Charte, et al.

Learning from imbalanced data is a deeply studied problem in standard classification and, more recently, in multilabel classification as well. A handful of multilabel resampling methods have been proposed in recent years, aiming to balance the label distribution. However, these methods face an obstacle specific to multilabel data: the joint appearance of minority and majority labels in the same data patterns. We recently proposed a new algorithm, called REMEDIAL (REsampling MultilabEl datasets by Decoupling highly ImbAlanced Labels), designed to decouple imbalanced labels that co-occur in the same instance. The goal of this work is to propose a procedure to hybridize this method with some of the best resampling algorithms available in the literature, including random oversampling, heuristic undersampling and synthetic sample generation techniques. These hybrid methods are then empirically analyzed to determine how their behavior is influenced by the label decoupling process. As a result, a noteworthy set of guidelines on the combined use of these techniques can be drawn from the experiments conducted.
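To make the decoupling idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a label counts as "minority" when its imbalance ratio (count of the most frequent label divided by the label's own count) exceeds the mean ratio, and that every instance mixing minority and majority labels is split. The abstract does not state the selection criterion, so both choices are assumptions. The function names `imbalance_ratios`, `decouple` and `random_oversample` are hypothetical, and the oversampling step is a plain random stand-in for the resampling algorithms mentioned above.

```python
import numpy as np

def imbalance_ratios(Y):
    """Per-label ratio: count of the most frequent label / count of this label."""
    counts = Y.sum(axis=0).astype(float)
    counts[counts == 0] = np.nan          # labels that never occur get no ratio
    return np.nanmax(counts) / counts

def decouple(X, Y):
    """Split instances that carry both minority and majority labels.

    Assumption: minority labels are those whose ratio exceeds the mean ratio.
    The original row keeps only its minority labels; a clone with the same
    features keeps only the majority labels.
    """
    ir = imbalance_ratios(Y)
    minority = ir > np.nanmean(ir)
    has_min = (Y[:, minority] > 0).any(axis=1)
    has_maj = (Y[:, ~minority] > 0).any(axis=1)
    mixed = has_min & has_maj

    Y_orig = Y.copy()
    Y_orig[np.ix_(mixed, ~minority)] = 0  # original: drop majority labels
    Y_clone = Y[mixed].copy()
    Y_clone[:, minority] = 0              # clone: drop minority labels

    X_new = np.vstack([X, X[mixed]])
    Y_new = np.vstack([Y_orig, Y_clone])
    return X_new, Y_new, minority

def random_oversample(X, Y, minority, n_clones, rng):
    """Stand-in resampler: clone random instances holding a minority label."""
    candidates = np.flatnonzero((Y[:, minority] > 0).any(axis=1))
    if candidates.size == 0:
        return X, Y
    picks = rng.choice(candidates, size=n_clones, replace=True)
    return np.vstack([X, X[picks]]), np.vstack([Y, Y[picks]])

# Toy data: label 0 is frequent, label 2 is rare and co-occurs with label 0.
X = np.arange(12, dtype=float).reshape(6, 2)
Y = np.array([[1, 0, 0],
              [1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 0, 0]])

X_dec, Y_dec, minority = decouple(X, Y)
X_hyb, Y_hyb = random_oversample(X_dec, Y_dec, minority, n_clones=2,
                                 rng=np.random.default_rng(0))
```

In this toy case, decoupling leaves the per-label counts unchanged but separates the rare label from the frequent one, so the subsequent oversampling step adds occurrences of the rare label without also inflating the frequent label.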

