Effects of sampling skewness of the importance-weighted risk estimator on model selection

04/19/2018
by   Wouter M. Kouw, et al.

Importance weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can also have a detrimental effect on an estimator. In this work, we show empirically that the sampling distribution of an importance-weighted estimator can be skewed. In sample selection bias settings with small sample sizes, the importance-weighted risk estimator produces overestimates for datasets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for datasets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when the estimator is used for importance-weighted validation.
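The skewness described above can be illustrated with a small simulation. The following is a minimal sketch under assumed conditions, not the paper's experimental setup: the source and target distributions are zero-mean Gaussians with hypothetical standard deviations 0.8 and 1.0, the loss is a fixed toy loss `x ** 2` (so the true target risk is 1), and the function names are chosen here for illustration. It repeatedly draws small source datasets, computes the importance-weighted risk estimate on each, and inspects the resulting sampling distribution.

```python
import numpy as np


def normal_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def iw_risk_estimates(n_samples=20, n_repeats=20000,
                      sigma_source=0.8, sigma_target=1.0, seed=0):
    """Return one importance-weighted risk estimate per simulated dataset.

    All distribution parameters are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    # Each row is one small dataset drawn from the (biased) source distribution.
    x = rng.normal(0.0, sigma_source, size=(n_repeats, n_samples))
    # Importance weights: target density divided by source density.
    weights = normal_pdf(x, sigma_target) / normal_pdf(x, sigma_source)
    # Toy loss; its expectation under the target is sigma_target ** 2.
    loss = x ** 2
    # Importance-weighted risk estimate: average of weight times loss.
    return np.mean(weights * loss, axis=1)


def sample_skewness(values):
    """Third standardized moment of a one-dimensional sample."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


estimates = iw_risk_estimates()
print("mean of estimates:  ", estimates.mean())
print("median of estimates:", np.median(estimates))
print("sample skewness:    ", sample_skewness(estimates))
```

The mean of the estimates stays close to the true risk, consistent with unbiasedness, while the median sits away from the mean, so a typical small dataset gives a systematically off estimate. In this toy setup the long tail lies on the high side, so the direction of the over- and underestimates need not match the sample selection bias setting studied in the paper; the direction depends on the loss and on how the source and target distributions differ.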

research
10/17/2017

On reducing sampling variance in covariate shift using control variates

Covariate shift classification problems can in principle be tackled by i...
research
06/07/2018

Importance weighted generative networks

Deep generative networks can simulate from a complex target distribution...
research
09/09/2022

Fast and Accurate Importance Weighting for Correcting Sample Bias

Bias in datasets can be very detrimental for appropriate statistical est...
research
05/24/2023

Generalizing Importance Weighting to A Universal Solver for Distribution Shift Problems

Distribution shift (DS) may have two levels: the distribution itself cha...
research
03/03/2022

Learning Selection Bias and Group Importance: Differentiable Reparameterization for the Hypergeometric Distribution

Partitioning a set of elements into a given number of groups of a priori...
research
04/06/2021

A new weighting method when not all the events are selected as cases in a nested case-control study

Nested case-control (NCC) is a sampling method widely used for developin...
research
12/19/2021

Rethinking Importance Weighting for Transfer Learning

A key assumption in supervised learning is that training and test data f...
