Tracking the risk of a deployed model and detecting harmful distribution shifts

10/12/2021
by Aleksandr Podkopaev, et al.

When deployed in the real world, machine learning models inevitably encounter changes in the data distribution, and certain – but not all – distribution shifts could result in significant performance degradation. In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially, making interventions by a human expert (or model retraining) unnecessary. While several works have developed tests for distribution shifts, these typically either use non-sequential methods, or detect arbitrary shifts (benign or harmful), or both. We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate. In this work, we design simple sequential tools for testing if the difference between source (training) and target (test) distributions leads to a significant drop in a risk function of interest, like accuracy or calibration. Recent advances in constructing time-uniform confidence sequences allow efficient aggregation of statistical evidence accumulated during the tracking process. The designed framework is applicable in settings where (some) true labels are revealed after the prediction is performed, or when batches of labels become available in a delayed fashion. We demonstrate the efficacy of the proposed framework through an extensive empirical study on a collection of simulated and real datasets.
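The core idea above — accumulate losses on incoming test points and raise an alarm only when a time-uniform confidence sequence certifies that the target risk exceeds the source risk by more than a tolerance — can be sketched as follows. This is a minimal illustration, not the paper's exact construction: it uses a simple Hoeffding-style bound made anytime-valid by a union bound over time steps (allocating error budget δ/(t(t+1)) to step t), and the names `cs_radius` and `monitor` are hypothetical.

```python
import math

def cs_radius(t, delta=0.05):
    """Hoeffding confidence radius at time t, made valid uniformly over
    all t by a union bound: sum_t delta / (t * (t + 1)) = delta."""
    delta_t = delta / (t * (t + 1))
    return math.sqrt(math.log(2.0 / delta_t) / (2.0 * t))

def monitor(losses, source_risk, eps_tol=0.05, delta=0.05):
    """Sequentially track bounded losses in [0, 1] on target data.
    Fire an alarm at the first time the lower confidence bound on the
    target risk exceeds source_risk + eps_tol (a harmful shift);
    return that time, or None if no alarm fires. Benign shifts, under
    which the risk stays within the tolerance, are ignored."""
    total = 0.0
    for t, loss in enumerate(losses, start=1):
        total += loss
        lower_bound = total / t - cs_radius(t, delta)
        if lower_bound > source_risk + eps_tol:
            return t  # harmful shift detected at time t
    return None  # continuous monitoring, no alarm
```

Because the confidence sequence holds simultaneously over all times, the monitor can run indefinitely without inflating the false alarm rate, unlike repeatedly applying a fixed-sample test.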
