Contrastive Identification of Covariate Shift in Image Data

08/18/2021
by Matthew L. Olson, et al.

Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated techniques can be used to detect the existence of covariate shift, our goal is to help human users characterize the extent of covariate shift in large image datasets with interfaces that seamlessly integrate information obtained from the detection algorithms. In this paper, we design and evaluate a new visual interface that facilitates the comparison of the local distributions of training and test data. We conduct a quantitative user study on multi-attribute facial data to compare two different learned low-dimensional latent representations (pretrained ImageNet CNN vs. density ratio) and two user analytic workflows (nearest-neighbor vs. cluster-to-cluster). Our results indicate that the latent representation of our density ratio model, combined with a nearest-neighbor comparison, is the most effective at helping humans identify covariate shift.
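As a rough illustration of what a nearest-neighbor comparison of local training and test distributions can look like, the sketch below pools toy 2-D "latent" points and, for each test point, measures the fraction of its k nearest neighbors that come from the training set. This is not the paper's density ratio model or its interface: the function names (`knn_train_fraction`, `make_blob`), the synthetic Gaussian blobs, and the choice of k are assumptions made only for this example. A fraction far below the pooled base rate suggests a region that is present in the test data but underrepresented in training.

```python
import math
import random


def knn_train_fraction(train, test, k=10):
    """For each test point, return the fraction of its k nearest neighbors
    (searched in the pooled train + test set, excluding the point itself)
    that come from the training set."""
    pooled = [(p, 0) for p in train] + [(p, 1) for p in test]  # 0 = train, 1 = test
    fractions = []
    for i, query in enumerate(test):
        self_index = len(train) + i
        dists = []
        for j, (point, label) in enumerate(pooled):
            if j == self_index:
                continue  # do not count the query as its own neighbor
            dists.append((math.dist(query, point), label))
        dists.sort(key=lambda pair: pair[0])
        nearest = dists[:k]
        fractions.append(sum(1 for _, label in nearest if label == 0) / k)
    return fractions


def make_blob(rng, center, n, spread=0.3):
    """Hypothetical stand-in for latent embeddings: an isotropic 2-D Gaussian blob."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


rng = random.Random(0)
# One region is shared by train and test; a second region appears only in test.
train = make_blob(rng, (0.0, 0.0), 200)
test_shared = make_blob(rng, (0.0, 0.0), 100)
test_shifted = make_blob(rng, (5.0, 5.0), 100)
test = test_shared + test_shifted

fractions = knn_train_fraction(train, test, k=10)
shared_mean = sum(fractions[:100]) / 100
shifted_mean = sum(fractions[100:]) / 100
print(f"shared region: {shared_mean:.2f}, shifted region: {shifted_mean:.2f}")
```

In this toy setup the shifted region has no nearby training points, so its neighbor fraction drops to zero, while the shared region stays near the pooled base rate. With real images the points would instead come from a learned latent representation, such as the two the abstract compares.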


