Towards Imbalanced Large Scale Multi-label Classification with Partially Annotated Labels

07/31/2023
by Xin Zhang, et al.

Multi-label classification is a widely encountered problem in which an instance can be associated with multiple classes at once. In principle, it is a supervised learning problem that requires a large amount of labeled data. However, annotating data is time-consuming and may be infeasible for very large label spaces. In addition, label imbalance can limit the performance of multi-label classifiers, especially when some labels are missing. It is therefore worthwhile to study how to train neural networks with partial labels. In this work, we address label imbalance and investigate how to train classifiers with partial labels in large label spaces. First, we introduce a pseudo-labeling technique that allows commonly adopted networks to be applied in partially labeled settings without additional complex structures. Then, we propose a novel loss function that leverages statistical information from existing datasets to alleviate the label imbalance problem. In addition, we design a dynamic training scheme that reduces the dimension of the label space and further mitigates the imbalance. Finally, we conduct extensive experiments on several publicly available multi-label datasets, including COCO, NUS-WIDE, CUB, and Open Images, to demonstrate the effectiveness of the proposed approach. The results show that our approach outperforms several state-of-the-art methods and, surprisingly, in some partial labeling settings even exceeds methods trained with full labels.
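To make the partial-label setting concrete, the following is a minimal sketch of the two generic ingredients the abstract names: filling unannotated labels with confident predictions (pseudo-labeling) and down-weighting frequent classes in the loss. It is not the authors' method, since the abstract does not specify their loss or training scheme. The `-1` encoding for unannotated labels, the thresholds, the inverse-frequency weighting, and the function names `pseudo_label` and `masked_weighted_bce` are all assumptions made for illustration.

```python
import math

UNKNOWN = -1  # assumed encoding for a label that was never annotated


def pseudo_label(probs, labels, pos_thresh=0.9, neg_thresh=0.1):
    """Fill unannotated entries with confident model predictions.

    Annotated entries (0 or 1) are kept as-is. Unannotated entries become 1
    or 0 only when the predicted probability is beyond a threshold; the rest
    stay UNKNOWN. Thresholds are illustrative, not taken from the paper.
    """
    filled = []
    for p, y in zip(probs, labels):
        if y != UNKNOWN:
            filled.append(y)
        elif p >= pos_thresh:
            filled.append(1)
        elif p <= neg_thresh:
            filled.append(0)
        else:
            filled.append(UNKNOWN)
    return filled


def masked_weighted_bce(probs, labels, pos_freq, eps=1e-7):
    """Mean binary cross-entropy over annotated entries only.

    pos_freq[i] is the assumed fraction of positives for class i, estimated
    from dataset statistics. Positives are weighted by (1 - f) and negatives
    by f, so rare positive labels contribute more. This is a generic
    class-balancing heuristic, not the loss proposed in the paper.
    """
    total, count = 0.0, 0
    for p, y, f in zip(probs, labels, pos_freq):
        if y == UNKNOWN:
            continue  # unannotated labels contribute nothing
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        if y == 1:
            total += -(1.0 - f) * math.log(p)
        else:
            total += -f * math.log(1.0 - p)
        count += 1
    return total / count if count else 0.0


# One instance with four classes; only the last label is annotated.
probs = [0.95, 0.05, 0.5, 0.7]
labels = [UNKNOWN, UNKNOWN, UNKNOWN, 1]
pos_freq = [0.5, 0.5, 0.5, 0.1]

filled = pseudo_label(probs, labels)
loss = masked_weighted_bce(probs, filled, pos_freq)
```

In this toy instance, the first two unannotated labels are confident enough to be filled, while the third stays unknown and is excluded from the loss.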

