Calibration tests beyond classification

10/21/2022
by David Widmann, et al.

Most supervised machine learning tasks are subject to irreducible prediction errors. Probabilistic predictive models address this limitation by providing probability distributions that represent a belief over plausible targets, rather than point estimates. Such models can be a valuable tool in decision-making under uncertainty, provided that the model output is meaningful and interpretable. Calibrated models guarantee that the probabilistic predictions are neither over- nor under-confident. In the machine learning literature, different measures and statistical tests have been proposed and studied for evaluating the calibration of classification models. For regression problems, however, research has been focused on a weaker condition of calibration based on predicted quantiles for real-valued targets. In this paper, we propose the first framework that unifies calibration evaluation and tests for general probabilistic predictive models. It applies to any such model, including classification and regression models of arbitrary dimension. Furthermore, the framework generalizes existing measures and provides a more intuitive reformulation of a recently proposed framework for calibration in multi-class classification. In particular, we reformulate and generalize the kernel calibration error, its estimators, and hypothesis tests using scalar-valued kernels, and evaluate the calibration of real-valued regression problems.
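The central quantity mentioned in the abstract, the kernel calibration error with scalar-valued kernels, can be estimated directly from a set of predictive distributions and the corresponding observed targets. The following is a minimal, hypothetical sketch (not the authors' reference implementation) of an unbiased estimator of the squared kernel calibration error (SKCE) for one-dimensional Gaussian predictive distributions. The tensor-product kernel structure, the Wasserstein-based RBF kernel between predictions, and the bandwidths lam and gamma are illustrative assumptions; for Gaussian predictions and a Gaussian target kernel, the required expectations are available in closed form.

import numpy as np

def skce_unbiased(means, stds, targets, lam=1.0, gamma=1.0):
    """Unbiased SKCE estimate for predictions N(means[i], stds[i]^2) vs. targets[i]."""
    means, stds, targets = map(np.asarray, (means, stds, targets))
    n = len(targets)

    def rbf(d2, width2):
        # Gaussian RBF kernel evaluated at squared distance d2
        return np.exp(-d2 / (2.0 * width2))

    def smoothed_rbf(m, s2, y, width2):
        # Closed form of E_{Y ~ N(m, s2)}[exp(-(Y - y)^2 / (2 * width2))]
        v = width2 + s2
        return np.sqrt(width2 / v) * np.exp(-(m - y) ** 2 / (2.0 * v))

    g2 = gamma ** 2
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            mi, si, yi = means[i], stds[i], targets[i]
            mj, sj, yj = means[j], stds[j], targets[j]
            # scalar-valued kernel between the two predictive Gaussians,
            # here an RBF of their 2-Wasserstein distance (illustrative choice)
            kp = rbf((mi - mj) ** 2 + (si - sj) ** 2, lam ** 2)
            # h-statistic: the target kernel minus its expectations under
            # one or both predictive distributions
            h = (
                rbf((yi - yj) ** 2, g2)
                - smoothed_rbf(mi, si ** 2, yj, g2)
                - smoothed_rbf(mj, sj ** 2, yi, g2)
                + smoothed_rbf(mi, si ** 2 + sj ** 2, mj, g2)
            )
            total += kp * h

    # average over all unordered pairs -> unbiased estimate of the SKCE
    return 2.0 * total / (n * (n - 1))

A quick check of the expected behaviour: when the targets are actually drawn from the predicted distributions, the pairwise terms have zero mean and the estimate fluctuates around zero; an overconfident model yields clearly positive values, which can serve as the test statistic in the calibration tests discussed in the paper.

rng = np.random.default_rng(0)
m = rng.normal(size=200)
y = rng.normal(m, 1.0)                          # targets drawn from the predictions: calibrated
print(skce_unbiased(m, np.ones(200), y))        # close to zero
print(skce_unbiased(m, 0.2 * np.ones(200), y))  # overconfident model: clearly positive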
