Plots of the cumulative differences between observed and expected values of ordered Bernoulli variates

06/03/2020
by Mark Tygert, et al.

Many predictions are probabilistic in nature; for example, a prediction could be for precipitation tomorrow, but with only a 30 percent chance. Given both the predictions and the actual outcomes, "reliability diagrams" (also known as "calibration plots") help detect and diagnose statistically significant discrepancies between the predictions and the outcomes. The canonical reliability diagrams are based on histogramming the observed and expected values of the predictions; several variants of the standard reliability diagrams propose to replace the hard histogram binning with soft kernel density estimation using smooth convolutional kernels of widths similar to the widths of the bins. In all cases, an important question naturally arises: which widths are best (or are multiple plots with different widths better)? Rather than answering this question, plots of the cumulative differences between the observed and expected values largely avoid the question, by displaying miscalibration directly as the slopes of secant lines for the graphs. Slope is easy to perceive with quantitative precision even when the constant offsets of the secant lines are irrelevant. There is no need to bin or perform kernel density estimation with a somewhat arbitrary kernel.
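
As a rough illustration of the idea in the abstract, the sketch below orders the predictions by their predicted probability, accumulates the differences between observed outcomes and predicted probabilities, and reads off miscalibration as the slope of a secant line. The function names `cumulative_differences` and `secant_slope`, the normalization by the number of observations, and the equispaced horizontal axis are assumptions made for this example; they are one plausible reading of the abstract, not necessarily the authors' exact construction.

```python
import numpy as np


def cumulative_differences(probs, outcomes):
    """Return (abscissa, cumulative) for a cumulative-difference plot.

    probs    : predicted probabilities, one per observation
    outcomes : observed Bernoulli outcomes (0 or 1), same length as probs

    Assumption for this sketch: differences are divided by n and the
    horizontal axis is k / n for the k-th smallest predicted probability.
    """
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(probs)

    # Order the observations by predicted probability.
    order = np.argsort(probs, kind="stable")

    # Observed minus expected, scaled by 1/n, then accumulated.
    diffs = (outcomes[order] - probs[order]) / n
    cumulative = np.concatenate(([0.0], np.cumsum(diffs)))

    # Equispaced abscissa from 0 to 1, with a leading point at the origin.
    abscissa = np.arange(n + 1) / n
    return abscissa, cumulative


def secant_slope(abscissa, cumulative, i, j):
    """Slope of the secant line between points i and j (i < j).

    With the scaling above, this equals the mean of (outcome - probability)
    over the ordered observations i+1, ..., j, so a constant vertical
    offset of the curve does not affect it.
    """
    return (cumulative[j] - cumulative[i]) / (abscissa[j] - abscissa[i])


# Toy data (made up for illustration): low predictions never occur,
# high predictions always occur.
probs = [0.1, 0.2, 0.3, 0.4]
outcomes = [0, 0, 1, 1]
x, c = cumulative_differences(probs, outcomes)
low_slope = secant_slope(x, c, 0, 2)   # over-prediction in the lower half
high_slope = secant_slope(x, c, 2, 4)  # under-prediction in the upper half
```

In this toy example the secant slope over the lower half is negative (outcomes occur less often than predicted) and the slope over the upper half is positive, and no bin width or kernel width had to be chosen to see that.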

research
05/19/2022

Metrics of calibration for probabilistic predictions

Predictions are often probabilities; e.g., a prediction could be for pre...
research
08/04/2020

Plotting the cumulative deviation of a subgroup from the full population as a function of score

Assessing whether a subgroup of a full population is getting treated equ...
research
09/21/2023

Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing

Calibration measures and reliability diagrams are two fundamental tools ...
research
05/07/2020

Fast multivariate empirical cumulative distribution function with connection to kernel density estimation

This paper revisits the problem of computing empirical cumulative distri...
research
03/03/2022

Kernel Density Estimation by Genetic Algorithm

This study proposes a data condensation method for multivariate kernel d...
research
02/10/2022

Multiclass histogram-based thresholding using kernel density estimation and scale-space representations

We present a new method for multiclass thresholding of a histogram which...
research
08/05/2021

Cumulative differences between subpopulations

Comparing the differences in outcomes (that is, in "dependent variables"...
