DEDPUL: Method for Mixture Proportion Estimation and Positive-Unlabeled Classification based on Density Estimation

02/19/2019
by Dmitry Ivanov, et al.

This paper studies Positive-Unlabeled Classification, the problem of semi-supervised binary classification in which the Negative (N) class in the training set is contaminated with instances of the Positive (P) class. We develop a novel method (DEDPUL) that simultaneously solves two problems concerning the contaminated Unlabeled (U) sample: it estimates the proportions of the mixing components (P and N) in U, and it classifies U. In experiments on synthetic and real-world data, DEDPUL compares favorably with current state-of-the-art methods for both problems. We also introduce an automatic procedure for DEDPUL hyperparameter optimization. Additionally, we improve two methods from the literature and achieve DEDPUL-level performance with one of them.
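
To make the two problems concrete, the following is a minimal sketch of the standard mixture view of the Unlabeled sample, not the DEDPUL procedure itself. It assumes the densities of P and N are known one-dimensional Gaussians (in practice they would have to be estimated), and it uses the common upper bound on the proportion of P in U, the minimum of the ratio of the U density to the P density. All names (`gaussian_pdf`, `mixture_proportion_upper_bound`, `positive_posterior`, `TRUE_ALPHA`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, std=1.0):
    """Density of a univariate Gaussian evaluated at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))


def mixture_proportion_upper_bound(f_u, f_p):
    """Largest alpha such that f_u - alpha * f_p stays non-negative on the grid."""
    return float(np.min(f_u / f_p))


def positive_posterior(alpha, f_u, f_p):
    """Probability that a point drawn from U belongs to P, given alpha."""
    return np.clip(alpha * f_p / f_u, 0.0, 1.0)


# Assumed toy setup: P ~ N(2, 1), N ~ N(-2, 1), and 30% of U comes from P.
TRUE_ALPHA = 0.3
grid = np.linspace(-6.0, 10.0, 1601)
f_p = gaussian_pdf(grid, mean=2.0)
f_n = gaussian_pdf(grid, mean=-2.0)
f_u = TRUE_ALPHA * f_p + (1.0 - TRUE_ALPHA) * f_n

# Problem 1: estimate the proportion of P in U.
alpha_hat = mixture_proportion_upper_bound(f_u, f_p)

# Problem 2: classify U through the posterior probability of being positive.
posterior = positive_posterior(alpha_hat, f_u, f_p)
```

In this toy setup the two components barely overlap far to the right of the grid, so the bound is essentially tight there; with heavily overlapping components it would only be an upper bound on the true proportion.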
