Mixture Proportion Estimation and PU Learning: A Modern Approach

11/01/2021
by   Saurabh Garg, et al.

Given only positive examples and unlabeled examples (from both positive and negative classes), we might hope nevertheless to estimate an accurate positive-versus-negative classifier. Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE): determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning: given such an estimate, learning the desired positive-versus-negative classifier. Unfortunately, classical methods for both problems break down in high-dimensional settings. Meanwhile, recently proposed heuristics lack theoretical coherence and depend precariously on hyperparameter tuning. In this paper, we propose two simple techniques: Best Bin Estimation (BBE), for MPE; and Conditional Value Ignoring Risk (CVIR), a simple objective for PU-learning. Both methods dominate previous approaches empirically, and for BBE, we establish formal guarantees that hold whenever we can train a model to cleanly separate out a small subset of positive examples. Our final algorithm, (TED)^n, alternates between the two procedures, significantly improving both our mixture proportion estimator and our classifier.
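To make the MPE subtask concrete, the following is a minimal sketch of a threshold-ratio estimator, not the paper's BBE procedure. It assumes we already have classifier scores for a held-out positive sample and for the unlabeled sample, and that the highest-scoring region contains almost only positives, which is the "cleanly separate out a small subset of positive examples" condition mentioned above. Under that assumption, the fraction of unlabeled points above a threshold divided by the fraction of positive points above the same threshold approximates the mixture proportion. The function name `top_bin_mixture_estimate` and the parameter `min_pos_frac` are hypothetical and chosen for illustration only.

```python
import numpy as np


def top_bin_mixture_estimate(scores_pos, scores_unl, min_pos_frac=0.05):
    """Illustrative estimator (hypothetical helper, not the paper's exact method).

    Returns the minimum over candidate thresholds c of
        (fraction of unlabeled scores >= c) / (fraction of positive scores >= c).
    Thresholds that keep fewer than `min_pos_frac` of the positives are
    skipped, because the ratio is too noisy when the denominator is tiny.
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_unl = np.asarray(scores_unl, dtype=float)

    best_ratio = 1.0  # the mixture proportion can never exceed 1
    for c in np.unique(scores_pos):  # candidate thresholds: observed positive scores
        q_pos = np.mean(scores_pos >= c)
        if q_pos < min_pos_frac:
            continue
        q_unl = np.mean(scores_unl >= c)
        best_ratio = min(best_ratio, q_unl / q_pos)
    return best_ratio


# Toy data (made up for illustration): half of the unlabeled scores look positive.
toy_pos = [0.90, 0.95, 0.80, 0.85]
toy_unl = [0.90, 0.95, 0.10, 0.20, 0.15, 0.05, 0.80, 0.85]
toy_alpha = top_bin_mixture_estimate(toy_pos, toy_unl)
```

Taking a plain minimum over thresholds tends to underestimate the proportion on finite samples, since it chases noise in the ratio. The sketch only guards against this with a crude floor on the denominator; the threshold selection that gives BBE its formal guarantees is not reproduced here.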


research
02/02/2017

Recovering True Classifier Performance in Positive-Unlabeled Learning

A common approach in positive-unlabeled learning is to train a classific...
research
01/30/2018

Mixture Proportion Estimation for Positive-Unlabeled Learning via Classifier Dimension Reduction

Positive-unlabeled (PU) learning considers two samples, a positive set ...
research
06/02/2023

Mixture Proportion Estimation Beyond Irreducibility

The task of mixture proportion estimation (MPE) is to estimate the weigh...
research
01/08/2016

Nonparametric semi-supervised learning of class proportions

The problem of developing binary classifiers from positive and unlabeled...
research
10/27/2022

Learning One-Class Hyperspectral Classifier from Positive and Unlabeled Data for Low Proportion Target

Hyperspectral imagery (HSI) one-class classification is aimed at identif...
research
03/08/2016

Mixture Proportion Estimation via Kernel Embedding of Distributions

Mixture proportion estimation (MPE) is the problem of estimating the wei...
research
02/10/2020

Towards Mixture Proportion Estimation without Irreducibility

Mixture proportion estimation (MPE) is a fundamental problem of practica...
