Multi-Label Sampling based on Local Label Imbalance

05/07/2020
by Bin Liu, et al.

Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods. One efficient and flexible strategy to deal with this problem is to employ sampling techniques before training a multi-label learning model. Although existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets, it is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation. To address this issue, we propose a novel measure to assess the local label imbalance of multi-label datasets, as well as two multi-label sampling approaches based on the local label imbalance, namely MLSOL and MLUL. By considering all informative labels, MLSOL creates more diverse and better labeled synthetic instances for difficult examples, while MLUL eliminates instances that are harmful to their local region. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data.
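The abstract does not spell out how the local imbalance measure is defined, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a Euclidean distance, binary label vectors, and a hypothetical helper named local_imbalance that returns, for each instance and label, the fraction of its k nearest neighbours whose value for that label differs. For a minority class example, a score near 1 would suggest it is surrounded by the opposite class and is therefore hard to learn.

```python
from math import dist


def local_imbalance(X, Y, k=3):
    """Hypothetical helper (an assumption, not the paper's exact measure).

    For each instance i and label j, return the fraction of the k nearest
    neighbours of i whose value for label j differs from that of i.
    """
    n = len(X)
    n_labels = len(Y[0])
    scores = []
    for i in range(n):
        # Indices of the k nearest neighbours of i, excluding i itself.
        neighbours = sorted(
            (m for m in range(n) if m != i),
            key=lambda m: dist(X[i], X[m]),
        )[:k]
        scores.append([
            sum(Y[m][j] != Y[i][j] for m in neighbours) / len(neighbours)
            for j in range(n_labels)
        ])
    return scores


# Toy data: two groups of three points, two labels.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
# Label 0 is positive for two adjacent points; label 1 for one isolated point.
Y = [[1, 0], [1, 0], [0, 0], [0, 1], [0, 0], [0, 0]]

scores = local_imbalance(X, Y, k=2)
for i, row in enumerate(scores):
    print(i, row)
```

In this toy data both labels are globally rare, yet the single positive example of label 1 (instance 3) has only negative neighbours, whereas each positive example of label 0 has one positive neighbour. A purely global imbalance ratio would not capture that difference.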

