An analytic formulation for positive-unlabeled learning via weighted integral probability metric

01/28/2019
by Yongchan Kwon, et al.
We consider the problem of learning a binary classifier from only positive and unlabeled observations (PU learning). Although recent PU learning methods have demonstrated both theoretical guarantees and good empirical performance, most existing algorithms need to solve either a convex or a non-convex optimization problem and are therefore poorly suited to large-scale datasets. In this paper, we propose a simple yet theoretically grounded PU learning algorithm by extending previous work on supervised binary classification (Sriperumbudur et al., 2012). The proposed algorithm produces a closed-form classifier when the hypothesis space is a closed ball in a reproducing kernel Hilbert space. In addition, we establish upper bounds on the estimation error and the excess risk. The estimation error bound is sharper than existing results, and the excess risk bound does not rely on an approximation error term. To the best of our knowledge, we are the first to explicitly derive an excess risk bound for PU learning. Finally, we conduct extensive numerical experiments on both synthetic and real datasets, demonstrating improved accuracy, scalability, and robustness of the proposed algorithm.
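
To make the idea of a closed-form kernel classifier concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes the decision rule can be written as a weighted difference of empirical kernel means, `2 * class_prior * mean_pos - mean_unl`, which follows from writing the unlabeled distribution as a mixture of the positive and negative distributions. The function names, the Gaussian kernel, the bandwidth `gamma`, and the assumption that the class prior is known are all illustrative choices, and the exact form derived in the paper may differ.

```python
import numpy as np


def gaussian_kernel(a, b, gamma=1.0):
    """Pairwise Gaussian kernel k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def pu_kernel_mean_classifier(x_pos, x_unl, class_prior, gamma=1.0):
    """Return a predict function built from positive and unlabeled samples.

    Hypothetical helper: no optimization is run, the score is just a
    difference of two kernel averages. class_prior is the (assumed known)
    fraction of positives in the unlabeled data.
    """

    def predict(x):
        mean_pos = gaussian_kernel(x, x_pos, gamma).mean(axis=1)
        mean_unl = gaussian_kernel(x, x_unl, gamma).mean(axis=1)
        # Since p_unl = prior * p_pos + (1 - prior) * p_neg, the quantity
        # prior * p_pos - (1 - prior) * p_neg equals 2 * prior * p_pos - p_unl.
        score = 2.0 * class_prior * mean_pos - mean_unl
        return np.where(score >= 0, 1, -1)

    return predict


# Synthetic demo: two well-separated Gaussian classes in 2D.
rng = np.random.default_rng(0)
prior = 0.5
x_pos = rng.normal(loc=2.0, scale=1.0, size=(200, 2))
n_unl = 500
is_pos = rng.random(n_unl) < prior
x_unl = np.where(
    is_pos[:, None],
    rng.normal(loc=2.0, scale=1.0, size=(n_unl, 2)),
    rng.normal(loc=-2.0, scale=1.0, size=(n_unl, 2)),
)

predict = pu_kernel_mean_classifier(x_pos, x_unl, class_prior=prior, gamma=0.5)

# Held-out test points with known labels, used only for evaluation.
x_test = np.vstack([
    rng.normal(loc=2.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=-2.0, scale=1.0, size=(200, 2)),
])
y_test = np.concatenate([np.ones(200), -np.ones(200)])
accuracy = float(np.mean(predict(x_test) == y_test))
```

Because the score is a plain average of kernel evaluations, prediction costs one pass over the positive and unlabeled samples per query point and needs no iterative solver, which is the scalability argument the abstract makes.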

