Test for non-negligible adverse shifts

07/07/2021
by Vathy M. Kamulete, et al.
Statistical tests for dataset shift are susceptible to false alarms: they are sensitive to minor differences even when there is in fact adequate sample coverage and predictive performance. We propose instead a robust framework for tests of dataset shift based on outlier scores, D-SOS for short. D-SOS detects adverse shifts and can identify false alarms caused by benign ones. It posits that a new (test) sample is not substantively worse than an old (training) sample, rather than that the two are equal. The key idea is to reduce observations to outlier scores and compare contamination rates. Beyond comparing distributions, users can define what worse means in terms of predictive performance and other relevant notions. We show how versatile and practical D-SOS is on a wide range of real and simulated datasets. Unlike tests of equal distribution and of goodness-of-fit, the D-SOS tests are uniquely tailored to serve as robust performance metrics to monitor model drift and dataset shift.
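The key idea in the abstract (reduce each observation to an outlier score, then compare contamination rates between the training and test samples) can be illustrated with a small sketch. This is a simplified stand-in, not the D-SOS test statistic itself, which the abstract does not spell out. The scorer (standardized distance from the training mean), the 0.9 quantile threshold, the one-sided permutation test, and all function names are assumptions made for illustration only.

```python
import numpy as np


def outlier_scores(reference, X):
    """Hypothetical scorer: standardized Euclidean distance from the reference mean."""
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-12  # guard against zero variance
    return np.sqrt((((X - mu) / sigma) ** 2).sum(axis=1))


def contamination_gap(train_scores, test_scores, quantile=0.9):
    """Test contamination rate minus training contamination rate.

    The threshold is an assumed choice: the `quantile` of the training scores.
    """
    threshold = np.quantile(train_scores, quantile)
    return (test_scores > threshold).mean() - (train_scores > threshold).mean()


def adverse_shift_pvalue(train_scores, test_scores, quantile=0.9, n_perm=1000, seed=0):
    """One-sided permutation p-value: is the test sample more contaminated?

    Only a larger contamination rate in the test sample counts as evidence,
    so a test sample with fewer outliers (a benign shift) is not flagged.
    """
    rng = np.random.default_rng(seed)
    observed = contamination_gap(train_scores, test_scores, quantile)
    pooled = np.concatenate([train_scores, test_scores])
    n_train = len(train_scores)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        gap = contamination_gap(perm[:n_train], perm[n_train:], quantile)
        if gap >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Usage on simulated data (assumed setup): one adverse and one benign shift.
data_rng = np.random.default_rng(1)
train = data_rng.normal(size=(300, 2))
wider = data_rng.normal(scale=3.0, size=(300, 2))     # more outlying points
narrower = data_rng.normal(scale=0.5, size=(300, 2))  # fewer outlying points

train_scores = outlier_scores(train, train)
print("wider test sample:", adverse_shift_pvalue(train_scores, outlier_scores(train, wider)))
print("narrower test sample:", adverse_shift_pvalue(train_scores, outlier_scores(train, narrower)))
```

The one-sided comparison is what separates this from a test of equal distribution: the narrower sample differs from the training sample, yet it is not worse in terms of contamination, so a test of this kind should not raise an alarm for it.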

research
10/22/2022

Explanation Shift: Detecting distribution shifts on tabular data via the explanation space

As input data distributions evolve, the predictive performance of machin...
research
02/04/2022

Discovering Distribution Shifts using Latent Space Representations

Rapid progress in representation learning has led to a proliferation of ...
research
03/08/2023

Deep Hypothesis Tests Detect Clinically Relevant Subgroup Shifts in Medical Images

Distribution shifts remain a fundamental problem for the safe applicatio...
research
10/27/2019

Kernel Stein Tests for Multiple Model Comparison

We address the problem of non-parametric multiple model comparison: give...
research
05/17/2022

A unified framework for dataset shift diagnostics

Most machine learning (ML) methods assume that the data used in the trai...
research
05/30/2019

Separating an Outlier from a Change

We study the quickest change detection problem with an unknown post-chan...
research
04/17/2023

K-means Clustering Based Feature Consistency Alignment for Label-free Model Evaluation

The label-free model evaluation aims to predict the model performance on...
