
Localized Calibration: Metrics and Recalibration

by Rachel Luo, et al.

Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores must be well-calibrated (i.e., reflect the true probability of an event) to be meaningful and useful for downstream tasks. However, existing metrics for measuring calibration are insufficient. Commonly used metrics such as the expected calibration error (ECE) only measure global trends, making them ineffective for measuring the calibration of a particular sample or subgroup. At the other end of the spectrum, a fully individualized calibration error is in general intractable to estimate from finite samples. In this work, we propose the local calibration error (LCE), a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration. The LCE leverages learned features to automatically capture rich subgroups, and it measures the calibration error around each individual example via a similarity function. We then introduce a localized recalibration method, LoRe, that reduces the LCE more than existing recalibration methods do. Finally, we show that applying our recalibration method improves decision-making on downstream tasks.
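To make the contrast between a global and a local metric concrete, the following is a minimal sketch, not the paper's exact definition. It computes the standard binned ECE and a kernel-weighted calibration gap around a single query point. The Gaussian similarity function, the `bandwidth` parameter, the restriction to the query's confidence bin, and the function names are all assumptions made for illustration; the abstract only states that the LCE uses learned features and a similarity function.

```python
import numpy as np


def expected_calibration_error(conf, correct, n_bins=10):
    """Standard binned ECE: bin-frequency-weighted |accuracy - confidence|."""
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


def local_calibration_error(features, conf, correct, query, query_conf,
                            bandwidth=1.0, n_bins=10):
    """Illustrative local gap: similarity-weighted |confidence - accuracy|
    over samples that share the query's confidence bin (assumed form)."""
    features = np.asarray(features, dtype=float)
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    query = np.asarray(query, dtype=float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    query_bin = min(int(query_conf * n_bins), n_bins - 1)
    in_bin = bins == query_bin
    # Gaussian (RBF) similarity in feature space; this kernel is an assumption.
    sq_dist = ((features - query) ** 2).sum(axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) * in_bin
    total = weights.sum()
    if total == 0.0:
        return float("nan")  # no comparable samples near the query
    return float(abs((weights * (conf - correct)).sum()) / total)


# Toy data: two well-separated subgroups, both predicted with confidence 0.8.
# Group A (feature 0) is always correct; group B (feature 10) is correct 60% of the time.
features = np.array([[0.0]] * 50 + [[10.0]] * 50)
conf = np.full(100, 0.8)
correct = np.array([1] * 50 + [1] * 30 + [0] * 20)

global_ece = expected_calibration_error(conf, correct)
lce_a = local_calibration_error(features, conf, correct, [0.0], 0.8)
lce_b = local_calibration_error(features, conf, correct, [10.0], 0.8)
```

On this toy data the overall accuracy matches the confidence, so the global ECE is essentially zero, while the local gap around either subgroup is about 0.2. This is the kind of subgroup miscalibration that a purely global metric cannot see.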


