Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited

08/07/2020
by Timo Dimitriadis, et al.

A probability forecast or probabilistic classifier is reliable or calibrated if the predicted probabilities are matched by ex post observed frequencies, as examined visually in reliability diagrams. The classical binning and counting approach to plotting reliability diagrams has been hampered by a lack of stability under unavoidable, ad hoc implementation decisions. Here we introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way. CORP is based on non-parametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm; essentially, the CORP reliability diagram shows the graph of the PAV-(re)calibrated forecast probabilities. The CORP approach allows for uncertainty quantification via either resampling techniques or asymptotic theory, furnishes a new numerical measure of miscalibration, and provides a CORP-based Brier score decomposition that generalizes to any proper scoring rule. We anticipate that judicious uses of the PAV algorithm yield improved tools for diagnostics and inference for a very wide range of statistical and machine learning methods.
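
To make the idea concrete, the following is a minimal Python sketch of PAV recalibration and a Brier score decomposition built on it. It is an illustration under stated assumptions, not the authors' implementation: the function names (pav_recalibrate, brier, corp_decomposition) are hypothetical, and the component labels MCB (miscalibration), DSC (discrimination), and UNC (uncertainty), together with the identity score = MCB - DSC + UNC, are taken from our reading of the full paper rather than from the abstract above. Plotting the original forecasts against the values returned by pav_recalibrate would give a CORP-style reliability diagram.

```python
from typing import Dict, List, Sequence


def pav_recalibrate(probs: Sequence[float], outcomes: Sequence[int]) -> List[float]:
    """Isotonic (PAV) recalibration of forecast probabilities for binary outcomes.

    Returns the recalibrated probabilities in the original order.
    """
    # Pool tied forecast values first so equal forecasts get equal output.
    totals: Dict[float, List[float]] = {}
    for p, y in zip(probs, outcomes):
        entry = totals.setdefault(p, [0.0, 0])
        entry[0] += y
        entry[1] += 1

    # Each block holds [sum of outcomes, count, forecast values it covers].
    blocks: List[list] = []
    for p in sorted(totals):
        s, n = totals[p]
        blocks.append([s, n, [p]])
        # Merge adjacent blocks while their means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s2, n2, vals2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2].extend(vals2)

    mapping = {p: s / n for s, n, vals in blocks for p in vals}
    return [mapping[p] for p in probs]


def brier(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean Brier score of the forecasts."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def corp_decomposition(probs: Sequence[float], outcomes: Sequence[int]) -> Dict[str, float]:
    """Brier score split into MCB, DSC, UNC (labels assumed from the full paper)."""
    recal = pav_recalibrate(probs, outcomes)
    base_rate = sum(outcomes) / len(outcomes)
    s_x = brier(probs, outcomes)                             # original forecast
    s_c = brier(recal, outcomes)                             # PAV-recalibrated forecast
    s_r = brier([base_rate] * len(outcomes), outcomes)       # constant reference forecast
    return {"score": s_x, "MCB": s_x - s_c, "DSC": s_r - s_c, "UNC": s_r}


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    probs = [0.1, 0.2, 0.2, 0.7, 0.9, 0.4]
    outcomes = [0, 1, 0, 0, 1, 1]
    print(pav_recalibrate(probs, outcomes))  # [0.0, 0.5, 0.5, 0.5, 1.0, 0.5]
    print(corp_decomposition(probs, outcomes))
```

In this sketch the bins are never chosen by hand: they are the blocks that PAV merges, which is the sense in which the binning is automated. For real data, an established isotonic regression routine would be a more robust choice than this hand-rolled loop.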

research
01/25/2023

Evaluating Probabilistic Classifiers: The Triptych

Probability forecasts for binary outcomes, often referred to as probabil...
research
08/06/2021

Regression Diagnostics meets Forecast Evaluation: Conditional Calibration, Reliability Diagrams, and Coefficient of Determination

Model diagnostics and forecast evaluation are two sides of the same coin...
research
09/21/2023

Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing

Calibration measures and reliability diagrams are two fundamental tools ...
research
02/19/2019

Evaluating model calibration in classification

Probabilistic classifiers output a probability distribution on target cl...
research
07/27/2022

Calibrate: Interactive Analysis of Probabilistic Model Output

Analyzing classification model performance is a crucial task for machine...
research
06/28/2021

More on verification of probability forecasts for football outcomes: score decompositions, reliability, and discrimination analyses

Forecast of football outcomes in terms of Home Win, Draw and Away Win re...
research
04/25/2023

Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

In this study, toward addressing the over-confident outputs of existing ...
