Distance Matters For Improving Performance Estimation Under Covariate Shift

08/14/2023
by Mélanie Roschewitz et al.

Performance estimation under covariate shift is a crucial component of safe AI model deployment, especially for sensitive use cases. Recently, several solutions were proposed to tackle this problem, most leveraging model predictions or softmax confidence to derive accuracy estimates. However, under dataset shifts, confidence scores may become ill-calibrated if samples are too far from the training distribution. In this work, we show that taking into account distances of test samples to their expected training distribution can significantly improve performance estimation under covariate shift. Precisely, we introduce a "distance-check" to flag samples that lie too far from the expected distribution, to avoid relying on their untrustworthy model outputs in the accuracy estimation step. We demonstrate the effectiveness of this method on 13 image classification tasks, across a wide range of natural and synthetic distribution shifts and hundreds of models, with a median relative MAE improvement of 27% on 10 out of 13 tasks. Our code is publicly available at https://github.com/melanibe/distance_matters_performance_estimation.
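The abstract does not spell out which distance measure or accuracy estimator is used, so the following is a minimal sketch of the distance-check idea, assuming a k-nearest-neighbour distance in the classifier's feature space, a rejection threshold calibrated on held-out in-distribution data, and a simple average-confidence accuracy estimate. The function names and the 95th-percentile cutoff are illustrative assumptions, not the authors' exact choices; see the linked repository for their implementation.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def fit_distance_check(train_feats, val_feats, k=10, quantile=0.95):
        """Fit a k-NN model on training features and calibrate a rejection
        threshold on held-out in-distribution validation features.
        (The k-NN distance and 95th-percentile cutoff are assumptions,
        not necessarily the paper's exact choices.)"""
        nn = NearestNeighbors(n_neighbors=k).fit(train_feats)
        # Distance of each validation sample to its k-th nearest training sample.
        val_dist = nn.kneighbors(val_feats)[0][:, -1]
        return nn, np.quantile(val_dist, quantile)

    def estimate_accuracy(nn, threshold, test_feats, test_probs):
        """Average-confidence accuracy estimate with a distance check:
        test samples farther than `threshold` from the training distribution
        are flagged, and their confidence is not trusted (counted as errors)."""
        test_dist = nn.kneighbors(test_feats)[0][:, -1]
        trusted = test_dist <= threshold
        confidence = test_probs.max(axis=1)  # softmax confidence per sample
        # Flagged samples contribute zero instead of their confidence score.
        return np.where(trusted, confidence, 0.0).mean()

The key design point this sketch illustrates is that out-of-distribution samples are handled before the accuracy estimation step, so an ill-calibrated confidence score never enters the estimate.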


Related research

10/28/2021 · Exploring Covariate and Concept Shift for Detection and Calibration of Out-of-Distribution Data
06/06/2020 · Self-Supervised Dynamic Networks for Covariate Shift Robustness
07/07/2021 · Predicting with Confidence on Unseen Distributions
12/05/2022 · Blessings and Curses of Covariate Shifts: Adversarial Learning Dynamics, Directional Convergence, and Equilibria
06/01/2023 · (Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
06/15/2023 · Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection
04/09/2023 · Reweighted Mixup for Subpopulation Shift
