On Calibration and Out-of-domain Generalization

by Yoav Wald, et al.

Out-of-domain (OOD) generalization is a significant challenge for machine learning models. To overcome it, many novel techniques have been proposed, often focused on learning models with certain invariance properties. In this work, we draw a link between OOD performance and model calibration, arguing that calibration across multiple domains can be viewed as a special case of an invariant representation leading to better OOD generalization. Specifically, we prove in a simplified setting that models which achieve multi-domain calibration are free of spurious correlations. This leads us to propose multi-domain calibration as a measurable surrogate for the OOD performance of a classifier. An important practical benefit of calibration is that there are many effective tools for calibrating classifiers. We show that these tools are easy to apply and adapt to a multi-domain setting. Using five datasets from the recently proposed WILDS OOD benchmark, we demonstrate that simply re-calibrating models across multiple domains in a validation set leads to significantly improved performance on unseen test domains. We believe this intriguing connection between calibration and OOD generalization is promising from a practical point of view and deserves further research from a theoretical point of view.
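To make the "re-calibrating across multiple domains" idea concrete, here is a minimal sketch of one common post-hoc recalibration tool, temperature scaling, adapted to a multi-domain validation set. This is an illustration rather than the paper's exact procedure: the function names, the grid search, and the choice of minimizing the worst per-domain negative log-likelihood (so that no single domain is left miscalibrated) are all assumptions made for this sketch.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Softmax with temperature T applied to the logits."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Mean negative log-likelihood of the true labels at temperature T."""
    p = softmax(logits, T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature_multidomain(logits_by_domain, labels_by_domain,
                                grid=np.linspace(0.25, 5.0, 100)):
    """Hypothetical multi-domain recalibration: pick a single temperature
    that minimizes the WORST per-domain NLL, so the rescaled classifier is
    approximately calibrated on every validation domain simultaneously."""
    def worst_nll(T):
        return max(nll(lg, lb, T)
                   for lg, lb in zip(logits_by_domain, labels_by_domain))
    return min(grid, key=worst_nll)
```

In practice one would collect per-domain logits and labels from a held-out validation set, fit the temperature as above, and apply it to the model's logits on unseen test domains; richer tools (Platt scaling, isotonic regression) slot into the same loop.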


