Efficient and Multiply Robust Risk Estimation under General Forms of Dataset Shift

06/28/2023
by Hongxiang Qiu, et al.

Statistical machine learning methods often face the challenge of limited data from the population of interest. One remedy is to leverage data from auxiliary source populations that share some conditional distributions with the target domain or are linked to it in other ways. Techniques that exploit such dataset shift conditions are known as domain adaptation or transfer learning. Despite the extensive literature on dataset shift, few works address how to use the auxiliary populations efficiently to improve the accuracy of risk evaluation for a given machine learning task in the target population. In this paper, we study the general problem of efficiently estimating the target population risk under various dataset shift conditions, drawing on semiparametric efficiency theory. We consider a general class of dataset shift conditions that includes three popular conditions (covariate, label and concept shift) as special cases. We allow the supports of the source and target populations to be partially non-overlapping. We develop efficient and multiply robust estimators, along with a straightforward specification test for these dataset shift conditions. We also derive efficiency bounds for two other dataset shift conditions, posterior drift and location-scale shift. Simulation studies support the efficiency gains obtained by leveraging plausible dataset shift conditions.
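
As a rough illustration of what robust risk estimation with auxiliary data can look like, the sketch below implements a textbook-style doubly robust estimate of the target risk under covariate shift, where the conditional distribution of the label given the covariates is assumed to be shared by the source and target populations. It is not taken from the paper and may differ from the estimators developed there. The function name `dr_target_risk` and its arguments are hypothetical, and the nuisance estimates `m_hat` (predicted loss given covariates) and `g_hat` (estimated probability that an observation belongs to the target population, given covariates) are assumed to have been fitted elsewhere.

```python
import numpy as np


def dr_target_risk(loss, is_target, m_hat, g_hat):
    """Doubly robust target-risk estimate under covariate shift (illustrative).

    All arguments are 1-D arrays over the pooled source + target sample:
      loss      -- observed loss for each observation
      is_target -- True/1 if the observation comes from the target population
      m_hat     -- estimate of E[loss | X] (hypothetical, fitted elsewhere)
      g_hat     -- estimate of P(target | X) (hypothetical, fitted elsewhere)
    """
    loss = np.asarray(loss, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    m_hat = np.asarray(m_hat, dtype=float)
    g_hat = np.asarray(g_hat, dtype=float)

    # Estimated share of the pooled sample that is from the target population.
    rho_hat = is_target.mean()

    # Plug-in term: average the predicted loss over target observations only.
    plug_in = is_target * m_hat / rho_hat

    # Correction term: weighted residuals from source and target observations.
    correction = g_hat * (loss - m_hat) / rho_hat

    return float(np.mean(plug_in + correction))
```

Under the stated assumption, an estimate of this form remains consistent if either `m_hat` or `g_hat` is consistent, which is the sense of "doubly robust" used here. In practice the nuisance estimates would typically be cross-fitted rather than fitted on the same observations they are evaluated on.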

research
09/12/2022

Semi-supervised Triply Robust Inductive Transfer Learning

In this work, we propose a semi-supervised triply robust inductive trans...
research
08/10/2022

Doubly Robust Augmented Model Accuracy Transfer Inference with High Dimensional Features

Due to label scarcity and covariate shift happening frequently in real-w...
research
07/09/2023

Doubly Flexible Estimation under Label Shift

In studies ranging from clinical medicine to policy research, complete d...
research
02/22/2021

A Theory of Label Propagation for Subpopulation Shift

One of the central problems in machine learning is domain adaptation. Un...
research
05/03/2021

Robust Sample Weighting to Facilitate Individualized Treatment Rule Learning for a Target Population

Learning individualized treatment rules (ITRs) is an important topic in ...
research
08/16/2022

Semi-supervised Transfer Learning for Evaluation of Model Classification Performance

In modern machine learning applications, frequent encounters of covariat...
research
07/11/2018

Quantification under prior probability shift: the ratio estimator and its extensions

The quantification problem consists of determining the prevalence of a g...
