"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

10/19/2022
by Haoran Zhang, et al.

The performance of machine learning models may differ between training and deployment for many reasons. For instance, performance can change across environments due to changes in data quality, a deployment population that differs from the training population, or changes in the relationship between labels and features. These changes manifest as changes to the underlying data-generating mechanisms, and thereby result in distribution shifts across environments. Attributing performance changes to specific shifts, such as covariate or concept shifts, is critical for identifying the sources of model failures and for taking mitigating actions that ensure robust models. In this work, we introduce the problem of attributing performance differences between environments to shifts in the underlying data-generating mechanisms. We formulate the problem as a cooperative game and derive an importance-weighting method for computing the value of a coalition (a set) of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on two synthetic datasets and two real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts.
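The abstract describes the method only at a high level. As a rough illustration, here is a minimal sketch of the idea in Python, not the authors' implementation: the players of the cooperative game are the two mechanisms of the factorization P(X, Y) = P(X)·P(Y|X), the value of a coalition is the model's loss when exactly those mechanisms are swapped from source to target (estimated on source samples via importance weighting), and each mechanism's attribution is its Shapley value. The toy environments, the closed-form density ratios, and all function names are assumptions made for illustration; in practice the density ratios would be estimated from data, e.g., with a domain classifier.

```python
import itertools
import math

import numpy as np

rng = np.random.default_rng(0)

# Toy source/target environments. The target shifts both the covariate
# distribution P(X) (mean 0 -> 1) and the concept P(Y|X) (slope 2 -> 0.5).
def sample(n, x_mean, w):
    X = rng.normal(x_mean, 1.0, size=n)
    Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-w * X)))
    return X, Y

Xs, Ys = sample(20000, x_mean=0.0, w=2.0)  # source samples
Xt, Yt = sample(20000, x_mean=1.0, w=0.5)  # target samples (sanity check only)

def model(X):
    # The "deployed" model: matches the source concept exactly.
    return 1.0 / (1.0 + np.exp(-2.0 * X))

def loss(X, Y):
    p = np.clip(model(X), 1e-12, 1 - 1e-12)
    return -(Y * np.log(p) + (1 - Y) * np.log(1 - p))  # log loss

# Density ratios used as importance weights. They are known in closed form
# for this toy setup; in practice they would be estimated from data.
def ratio_x(X):  # p_target(x) / p_source(x), N(1,1) over N(0,1)
    return np.exp(X - 0.5)

def ratio_y_given_x(X, Y):  # p_target(y|x) / p_source(y|x)
    ps = 1.0 / (1.0 + np.exp(-2.0 * X))
    pt = 1.0 / (1.0 + np.exp(-0.5 * X))
    return np.where(Y == 1, pt / ps, (1 - pt) / (1 - ps))

PLAYERS = ("P(X)", "P(Y|X)")

def coalition_value(S):
    """Expected loss when exactly the mechanisms in S come from the target,
    estimated on source samples via self-normalized importance weighting."""
    w = np.ones_like(Xs)
    if "P(X)" in S:
        w = w * ratio_x(Xs)
    if "P(Y|X)" in S:
        w = w * ratio_y_given_x(Xs, Ys)
    return np.average(loss(Xs, Ys), weights=w)

def shapley(players, value):
    """Exact Shapley values by enumerating all coalitions."""
    n = len(players)
    phi = {}
    for p in players:
        rest = [q for q in players if q != p]
        phi[p] = sum(
            math.factorial(len(S)) * math.factorial(n - len(S) - 1)
            / math.factorial(n)
            * (value(set(S) | {p}) - value(set(S)))
            for k in range(n)
            for S in itertools.combinations(rest, k)
        )
    return phi

print("source loss:", coalition_value(set()))
print("target loss (importance weighted):", coalition_value(set(PLAYERS)))
print("target loss (direct):", loss(Xt, Yt).mean())
print("attribution:", shapley(PLAYERS, coalition_value))
```

By construction, the two Shapley values sum to the total loss change between source and target, splitting it between the covariate shift in P(X) and the concept shift in P(Y|X). With more mechanisms in a causal factorization, the same enumeration runs over all 2^n coalitions, and the importance weights become products of the corresponding density ratios.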

research · 07/11/2023
On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets
Different distribution shifts require different algorithmic and operatio...

research · 10/12/2021
Tracking the risk of a deployed model and detecting harmful distribution shifts
When deployed in the real world, machine learning models inevitably enco...

research · 08/29/2023
Biquality Learning: a Framework to Design Algorithms Dealing with Closed-Set Distribution Shifts
Training machine learning models from data with weak supervision and dat...

research · 06/15/2022
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Real-world data collected from multiple domains can have multiple, disti...

research · 10/04/2022
Data drift correction via time-varying importance weight estimator
Real-world deployment of machine learning models is challenging when dat...

research · 05/06/2021
A model for cooperative scientific research inspired by the ant colony algorithm
Modern scientific research has become largely a cooperative activity in ...

research · 09/18/2022
Towards Robust Off-Policy Evaluation via Human Inputs
Off-policy Evaluation (OPE) methods are crucial tools for evaluating pol...
