Discovering Distribution Shifts using Latent Space Representations

02/04/2022
by Leo Betthauser, et al.

Rapid progress in representation learning has led to a proliferation of embedding models and to associated challenges of model selection and practical application. It is non-trivial to assess a model's generalizability to new, candidate datasets, and failure to generalize may lead to poor performance on downstream tasks. Distribution shifts are one cause of reduced generalizability and are often difficult to detect in practice. In this paper, we use embedding-space geometry to propose a non-parametric framework for detecting distribution shifts, and we specify two tests. The first test detects shifts by establishing a robustness boundary, determined by an intelligible performance criterion, for comparing reference and candidate datasets. The second test detects shifts by featurizing and classifying multiple subsamples of two datasets as in-distribution and out-of-distribution. In evaluation, both tests detect model-impacting distribution shifts across a variety of shift scenarios on both synthetic and real-world datasets.
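As a rough illustration of the second test described above, the following is a minimal sketch in Python (assuming NumPy and scikit-learn) of a classifier two-sample test over featurized embedding subsamples. The featurization here (centroid and nearest-neighbor distance statistics) and all function names are illustrative assumptions, not the paper's exact method.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def featurize(subsample, anchor):
    """Summarize a subsample with simple embedding-space geometry statistics
    relative to an anchor set: distances to the anchor centroid and to each
    point's nearest anchor neighbor. (Hypothetical featurization, not the
    paper's.)"""
    centroid = anchor.mean(axis=0)
    d_centroid = np.linalg.norm(subsample - centroid, axis=1)
    pairwise = np.linalg.norm(subsample[:, None, :] - anchor[None, :, :], axis=-1)
    d_nn = pairwise.min(axis=1)
    return np.array([d_centroid.mean(), d_centroid.std(), d_nn.mean(), d_nn.std()])

def subsample_shift_test(reference, candidate, n_subsamples=100,
                         subsample_size=64, seed=0):
    """Classifier two-sample test: featurize subsamples of each dataset and
    check whether a classifier separates them above chance. Accuracy near 0.5
    suggests no detectable shift; well above 0.5 suggests a shift."""
    rng = np.random.default_rng(seed)
    # Hold out half of the reference as the featurization anchor so that
    # reference subsamples do not trivially match themselves.
    half = len(reference) // 2
    anchor, ref_pool = reference[:half], reference[half:]
    X, y = [], []
    for label, pool in ((0, ref_pool), (1, candidate)):
        for _ in range(n_subsamples):
            idx = rng.choice(len(pool), size=subsample_size, replace=False)
            X.append(featurize(pool[idx], anchor))
            y.append(label)
    clf = RandomForestClassifier(random_state=seed)
    return cross_val_score(clf, np.array(X), np.array(y), cv=5).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(2000, 32))  # reference embeddings
    candidate = rng.normal(0.5, 1.0, size=(2000, 32))  # mean-shifted candidate
    acc = subsample_shift_test(reference, candidate)
    print(f"subsample-classifier accuracy: {acc:.2f}")
```

The first test could be approached analogously by calibrating a distance threshold (the robustness boundary) against a performance criterion on the reference set; that calibration step is task-specific and is not sketched here.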


Related Research

10/21/2021: A Fine-Grained Analysis on Distribution Shift
Robustness to distribution shifts is critical for deploying machine lear...

09/22/2022: Assessing Robustness of EEG Representations under Data-shifts via Latent Space and Uncertainty Analysis
The recent availability of large datasets in bio-medicine has inspired t...

07/07/2021: Test for non-negligible adverse shifts
Statistical tests for dataset shift are susceptible to false alarms: the...

11/30/2022: Rethinking Out-of-Distribution Detection From a Human-Centric Perspective
Out-Of-Distribution (OOD) detection has received broad attention over th...

09/08/2022: Black-Box Audits for Group Distribution Shifts
When a model informs decisions about people, distribution shifts can cre...

10/12/2021: Tracking the risk of a deployed model and detecting harmful distribution shifts
When deployed in the real world, machine learning models inevitably enco...

10/13/2022: Disentanglement of Correlated Factors via Hausdorff Factorized Support
A grand goal in deep learning research is to learn representations capab...
