A Boosting Algorithm for Positive-Unlabeled Learning

05/19/2022
by Yawen Zhao, et al.

Positive-unlabeled (PU) learning deals with binary classification problems in which only positive (P) and unlabeled (U) data are available. Many PU methods based on linear models and neural networks have been proposed; however, there is still little study of how theoretically sound boosting-style algorithms could work with P and U data. Because in some scenarios neural networks do not perform as well as boosting algorithms even with fully supervised data, we propose a novel boosting algorithm for PU learning, Ada-PU, which we compare against neural networks. Ada-PU follows the general procedure of AdaBoost, while two different distributions of P data are maintained and updated. After a weak classifier is learned on the newly updated distribution, the corresponding combining weight for the final ensemble is estimated using only PU data. We show that, with a smaller set of base classifiers, the proposed method is guaranteed to retain the theoretical properties of boosting algorithms. In experiments, we show that Ada-PU outperforms neural networks on benchmark PU datasets. We also study a real-world cyber security dataset, UNSW-NB15, and demonstrate that Ada-PU has superior performance for malicious activity detection.
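
For readers unfamiliar with the "general procedure of AdaBoost" that the abstract refers to, the following is a minimal sketch of standard discrete AdaBoost with decision stumps on fully labeled data. It is not Ada-PU: the abstract states that Ada-PU maintains two distributions over the P data and estimates each combining weight from PU data alone, and neither step is shown here. All function names (`fit_stump`, `adaboost_fit`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    # Decision stump: predict +1 where pol * (x_j - thr) >= 0, else -1.
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    # Exhaustively search for the stump with the lowest weighted error.
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_fit(X, y, n_rounds=10):
    # Standard AdaBoost on labels in {-1, +1} (fully supervised, not PU).
    w = np.full(len(y), 1.0 / len(y))  # uniform initial distribution
    ensemble = []
    for _ in range(n_rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # combining weight of this stump
        pred = stump_predict(X, j, thr, pol)
        w = w * np.exp(-alpha * y * pred)      # up-weight misclassified points
        w /= w.sum()                           # renormalize to a distribution
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(X, ensemble):
    # Sign of the weighted vote of all stumps.
    score = sum(a * stump_predict(X, j, t, p) for a, j, t, p in ensemble)
    return np.where(score >= 0, 1, -1)
```

In this supervised setting the weighted error, and hence the combining weight `alpha`, is computed from true labels. According to the abstract, the corresponding quantity in Ada-PU is instead estimated using only P and U data.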


research
06/27/2012

An Online Boosting Algorithm with Theoretical Justifications

We study the task of online boosting--combining online weak learners int...
research
08/15/2011

A theory of multiclass boosting

Boosting combines weak classifiers to form highly accurate predictors. A...
research
07/27/2022

Learning from Positive and Unlabeled Data with Augmented Classes

Positive Unlabeled (PU) learning aims to learn a binary classifier from ...
research
03/14/2022

Improving State-of-the-Art in One-Class Classification by Leveraging Unlabeled Data

When dealing with binary classification of data with only one labeled cl...
research
10/05/2010

A bagging SVM to learn from positive and unlabeled examples

We consider the problem of learning a binary classifier from a training ...
research
09/06/2023

Community-Based Hierarchical Positive-Unlabeled (PU) Model Fusion for Chronic Disease Prediction

Positive-Unlabeled (PU) Learning is a challenge presented by binary clas...
research
09/16/2010

Asymmetric Totally-corrective Boosting for Real-time Object Detection

Real-time object detection is one of the core problems in computer visio...
