Quantification under prior probability shift: the ratio estimator and its extensions

07/11/2018
by Afonso Fernandes Vaz, et al.

The quantification problem consists of determining the prevalence of a given label in a target population. However, one often has access to labels only in a sample from the training population, not in the target population. A common assumption in this situation is prior probability shift: once the labels are known, the distribution of the features is the same in the training and target populations. In this paper, we derive a new lower bound for the risk of the quantification problem under the prior shift assumption. Complementing this lower bound, we present a new approximately minimax class of estimators, ratio estimators, which generalize several previous proposals in the literature. Using a weaker version of the prior shift assumption, which can be tested, we show that ratio estimators can be used to build confidence intervals for the quantification problem. We also extend the ratio estimator so that it can: (i) incorporate labels from the target population, when they are available, and (ii) estimate how the prevalence of positive labels varies according to a function of certain covariates.
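
To make the idea of a ratio estimator concrete, the following is a minimal sketch for the binary case, not the paper's exact formulation. It assumes a score function g(x), for example a classifier output, that has already been evaluated on both samples. Under prior probability shift, the target mean of g is a mixture of its two class-conditional means, so the prevalence can be recovered as a ratio of differences of sample means. The function name `ratio_estimator`, its arguments, and the clipping of the result to [0, 1] are illustrative assumptions.

```python
from statistics import fmean


def ratio_estimator(g_train, y_train, g_target):
    """Estimate the prevalence of label 1 in the target sample.

    Assumes prior probability shift: the distribution of g(X) given Y is
    the same in the training and target populations. Then
        E_target[g] = theta * E[g | Y=1] + (1 - theta) * E[g | Y=0],
    which is solved for theta using sample means.
    """
    pos = [g for g, y in zip(g_train, y_train) if y == 1]
    neg = [g for g, y in zip(g_train, y_train) if y == 0]
    if not pos or not neg:
        raise ValueError("training sample must contain both labels")

    mu1 = fmean(pos)  # estimate of E[g | Y=1] from labeled training data
    mu0 = fmean(neg)  # estimate of E[g | Y=0] from labeled training data
    if mu1 == mu0:
        raise ValueError("g does not separate the classes; theta is not identified")

    theta = (fmean(g_target) - mu0) / (mu1 - mu0)
    # Clipping to [0, 1] is a convenience added here, not taken from the paper.
    return min(1.0, max(0.0, theta))


# Toy usage: g is a hard classifier output with 75% accuracy in each class.
g_train = [0, 0, 0, 1, 1, 1, 1, 0]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
g_target = [1, 0]  # unlabeled target scores, mean 0.5
print(ratio_estimator(g_train, y_train, g_target))  # 0.5
```

When g is a hard classifier, this reduces to the familiar adjusted-count correction, which is one sense in which a ratio form can cover earlier proposals.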

