Mitigating bias in calibration error estimation

12/15/2020
by   Rebecca Roelofs, et al.

Building reliable machine learning systems requires that we correctly understand their level of confidence. Calibration focuses on measuring the degree of accuracy in a model's confidence, and most research in calibration focuses on techniques to improve an empirical estimate of calibration error, ECE_bin. Using simulation, we show that ECE_bin can systematically underestimate or overestimate the true calibration error depending on the nature of model miscalibration, the size of the evaluation dataset, and the number of bins. Critically, ECE_bin is more strongly biased for perfectly calibrated models. We propose a simple alternative calibration error metric, ECE_sweep, in which the number of bins is chosen to be as large as possible while preserving monotonicity in the calibration function. Evaluating our measure on distributions fit to neural network confidence scores on CIFAR-10, CIFAR-100, and ImageNet, we show that ECE_sweep produces a less biased estimator of calibration error and therefore should be used by any researcher wishing to evaluate the calibration of models trained on similar datasets.
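The abstract describes the two estimators only in words, so the following is a minimal sketch of how they might be computed, not the authors' implementation. It assumes top-label confidence scores in [0, 1], an L1 (absolute difference) error, equal-mass bins by default, and a sweep that stops at the first bin count whose per-bin accuracies are no longer non-decreasing. The function names `ece_bin` and `ece_sweep` and the `equal_mass` parameter are hypothetical, chosen for illustration.

```python
import numpy as np


def ece_bin(conf, correct, n_bins, equal_mass=True):
    """Binned calibration error (L1) plus the per-bin accuracies.

    conf: predicted confidences in [0, 1]; correct: 0/1 correctness labels.
    Assumption: the error is the bin-weighted mean |accuracy - confidence|.
    """
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(conf)
    conf, correct = conf[order], correct[order]
    if equal_mass:
        # Split the sorted samples into n_bins groups of (nearly) equal size.
        groups = np.array_split(np.arange(conf.size), n_bins)
    else:
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = np.minimum((conf * n_bins).astype(int), n_bins - 1)
        groups = [np.flatnonzero(idx == b) for b in range(n_bins)]
    ece, accs = 0.0, []
    for g in groups:
        if g.size == 0:
            continue  # empty bins contribute nothing
        acc, avg_conf = correct[g].mean(), conf[g].mean()
        ece += (g.size / conf.size) * abs(acc - avg_conf)
        accs.append(acc)
    return ece, np.array(accs)


def ece_sweep(conf, correct, equal_mass=True):
    """Binned error at the largest bin count that keeps bin accuracies monotone.

    Assumption: bin counts are tried in increasing order and the sweep stops
    at the first count that produces a decrease in per-bin accuracy.
    """
    best, _ = ece_bin(conf, correct, 1, equal_mass)
    for n_bins in range(2, len(conf) + 1):
        ece, accs = ece_bin(conf, correct, n_bins, equal_mass)
        if np.any(np.diff(accs) < 0):
            break
        best = ece
    return best
```

Because the sweep re-bins the data for every candidate bin count, this sketch costs roughly quadratic time in the number of samples in the worst case; that is acceptable for illustration but may need a cap on the bin count for large evaluation sets.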


