Improving State-of-the-Art in One-Class Classification by Leveraging Unlabeled Data

03/14/2022
by Farid Bagirov, et al.

In binary classification where only one class is labeled, data scientists rely on two main approaches: One-Class (OC) classification and Positive-Unlabeled (PU) learning. The former learns only from labeled positive data, whereas the latter also uses unlabeled data to improve overall performance. Because PU learning uses more data, it is tempting to assume that PU algorithms should always be preferred whenever unlabeled data is available. However, we find that this does not hold when the unlabeled data is unreliable, i.e. when it contains limited or biased latent negative data. We perform an extensive experimental study of a wide range of state-of-the-art OC and PU algorithms across scenarios that differ in the reliability of the unlabeled data. Furthermore, we propose PU modifications of state-of-the-art OC algorithms that are robust to unreliable unlabeled data, as well as a guideline for modifying other OC algorithms in the same way. Our main practical recommendation is to use state-of-the-art PU algorithms when the unlabeled data is reliable, and to use the proposed modifications of state-of-the-art OC algorithms otherwise. Additionally, we outline procedures that use statistical tests to distinguish reliable from unreliable unlabeled data.
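To make the OC/PU distinction concrete, the following is a minimal toy sketch and not any of the algorithms studied or proposed in the paper. It assumes 1-D Gaussian data, an OC scorer that fits a density to the labeled positives only, and a naive PU scorer that contrasts the positive density with a density fitted to the unlabeled sample. All function names and the data-generating setup are hypothetical choices made for illustration.

```python
import math
import random

def fit_gaussian(xs):
    """Return the mean and standard deviation of a 1-D sample."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, math.sqrt(var)

def log_pdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2 * math.pi))

def oc_scorer(labeled_pos):
    """OC sketch: score a point by its density under the labeled positives."""
    mu, sigma = fit_gaussian(labeled_pos)
    return lambda x: log_pdf(x, mu, sigma)

def naive_pu_scorer(labeled_pos, unlabeled):
    """Naive PU sketch (assumption): log-ratio of positive vs. unlabeled density."""
    mu_p, s_p = fit_gaussian(labeled_pos)
    mu_u, s_u = fit_gaussian(unlabeled)
    return lambda x: log_pdf(x, mu_p, s_p) - log_pdf(x, mu_u, s_u)

def auc(score, pos, neg):
    """Probability that a random positive outscores a random negative."""
    pos_scores = [score(x) for x in pos]
    neg_scores = [score(x) for x in neg]
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            wins += 1.0 if sp > sn else 0.5 if sp == sn else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

rng = random.Random(0)

def draw(n, mu):
    """Draw n points from N(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

# Toy setup (assumption): positives ~ N(0, 1), negatives ~ N(4, 1).
labeled_pos = draw(300, 0.0)
reliable_unlabeled = draw(150, 0.0) + draw(150, 4.0)  # contains latent negatives
unreliable_unlabeled = draw(300, 0.0)                 # limiting case: no negatives
test_pos, test_neg = draw(300, 0.0), draw(300, 4.0)

auc_oc = auc(oc_scorer(labeled_pos), test_pos, test_neg)
auc_pu_reliable = auc(naive_pu_scorer(labeled_pos, reliable_unlabeled),
                      test_pos, test_neg)
auc_pu_unreliable = auc(naive_pu_scorer(labeled_pos, unreliable_unlabeled),
                        test_pos, test_neg)

print(f"OC AUC:                 {auc_oc:.3f}")
print(f"PU AUC (reliable U):    {auc_pu_reliable:.3f}")
print(f"PU AUC (unreliable U):  {auc_pu_unreliable:.3f}")
```

In the unreliable case the unlabeled sample carries no information about the negatives, so the naive PU score is driven by sampling noise between two near-identical density fits and its AUC is not dependable. The OC score is unaffected because it never looks at the unlabeled data.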


research
03/08/2021

A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels

Positive-unlabeled learning refers to the process of training a binary c...
research
07/13/2019

Bringing Giant Neural Networks Down to Earth with Unlabeled Data

Compressing giant neural networks has gained much attention for their ex...
research
06/28/2016

Estimating the class prior and posterior from noisy positives and unlabeled data

We develop a classification algorithm for estimating posterior distribut...
research
09/06/2023

Community-Based Hierarchical Positive-Unlabeled (PU) Model Fusion for Chronic Disease Prediction

Positive-Unlabeled (PU) Learning is a challenge presented by binary clas...
research
08/26/2022

Confusion Matrices and Accuracy Statistics for Binary Classifiers Using Unlabeled Data: The Diagnostic Test Approach

Medical researchers have solved the problem of estimating the sensitivit...
research
11/05/2016

Class-prior Estimation for Learning from Positive and Unlabeled Data

We consider the problem of estimating the class prior in an unlabeled da...
research
05/19/2022

A Boosting Algorithm for Positive-Unlabeled Learning

Positive-unlabeled (PU) learning deals with binary classification proble...
