Learning from Positive and Unlabeled Data under the Selected At Random Assumption

08/27/2018
by Jessa Bekker, et al.

For many interesting tasks, such as medical diagnosis and web page classification, a learner has access only to some positively labeled examples and many unlabeled examples. Learning from this type of data requires making assumptions about the true distribution of the classes and/or the mechanism that was used to select the positive examples to be labeled. The two commonly made assumptions, that the classes are separable and that the labeled positive examples are selected completely at random, are very strong. This paper proposes a weaker assumption: the positive examples are selected at random, conditioned on some of the attributes. To learn under this assumption, an EM method is proposed. Experiments show that the method is not only capable of learning under this weaker assumption, but also outperforms the state of the art for learning under the selected completely at random assumption.
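
The difference between the two labeling assumptions can be illustrated with a small simulation. What follows is a minimal sketch of the labeling mechanism only, not the paper's EM method. The function names, the propensity constants, and the synthetic one-attribute data are all assumptions made for illustration.

```python
import random


def label_positives(examples, propensity, rng):
    """Return observed labels s for a list of (x, y) pairs.

    A positive example (y == 1) is labeled with probability propensity(x).
    Negatives are never labeled, so s == 0 means "unlabeled", not "negative".
    """
    return [1 if y == 1 and rng.random() < propensity(x) else 0
            for x, y in examples]


def scar_propensity(x):
    # Selected completely at random: the labeling probability is a constant
    # (0.3 is an illustrative value).
    return 0.3


def sar_propensity(x):
    # Selected at random: the labeling probability depends on the attribute x
    # (the linear form and its constants are illustrative).
    return 0.1 + 0.8 * x


def mean_x_of_labeled(examples, labels):
    """Average attribute value among the labeled examples."""
    xs = [x for (x, _), s in zip(examples, labels) if s == 1]
    return sum(xs) / len(xs)


rng = random.Random(0)
# Hypothetical data: one attribute x drawn uniformly from [0, 1];
# every second example is a true positive.
examples = [(rng.random(), i % 2) for i in range(20000)]

scar_labels = label_positives(examples, scar_propensity, rng)
sar_labels = label_positives(examples, sar_propensity, rng)

scar_mean = mean_x_of_labeled(examples, scar_labels)
sar_mean = mean_x_of_labeled(examples, sar_labels)
print(f"mean x of labeled positives, constant propensity:  {scar_mean:.3f}")
print(f"mean x of labeled positives, attribute-dependent:  {sar_mean:.3f}")
```

In this sketch, a constant propensity leaves the labeled positives looking like a uniform sample of all positives, while an attribute-dependent propensity skews the labeled set toward high values of x. A learner that assumes a constant propensity would not account for that skew.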


research
09/10/2018

Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data

Most positive and unlabeled data is subject to selection biases. The lab...
research
11/12/2018

Learning From Positive and Unlabeled Data: A Survey

Learning from positive and unlabeled data or PU learning is the setting ...
research
03/02/2021

Botcha: Detecting Malicious Non-Human Traffic in the Wild

Malicious bots make up about a quarter of all traffic on the web, and de...
research
03/14/2023

PULSNAR – Positive unlabeled learning selected not at random: class proportion estimation when the SCAR assumption does not hold

Positive and Unlabeled (PU) learning is a type of semi-supervised binary...
research
01/17/2022

Risk bounds for PU learning under Selected At Random assumption

Positive-unlabeled learning (PU learning) is known as a special case of ...
research
03/04/2023

Towards Improved Illicit Node Detection with Positive-Unlabelled Learning

Detecting illicit nodes on blockchain networks is a valuable task for st...
research
05/07/2019

On the assumption of independent right censoring

Various assumptions on a right-censoring mechanism to ensure consistency...
