Distribution-free binary classification: prediction sets, confidence intervals and calibration

06/18/2020
by Chirag Gupta, et al.

We study three notions of uncertainty quantification (calibration, confidence intervals and prediction sets) for binary classification in the distribution-free setting, that is, without making any distributional assumptions on the data. With a focus on calibration, we establish a 'tripod' of theorems that connect these three notions for score-based classifiers. A direct implication is that distribution-free calibration is only possible, even asymptotically, using a scoring function whose level sets partition the feature space into at most countably many sets. Parametric calibration schemes such as variants of Platt scaling do not satisfy this requirement, while nonparametric schemes based on binning do. To close the loop, we derive distribution-free confidence intervals for binned probabilities for both fixed-width and uniform-mass binning. As a consequence of our 'tripod' theorems, these confidence intervals for binned probabilities lead to distribution-free calibration. We also derive extensions to settings with streaming data and covariate shift.
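To make the idea of confidence intervals for binned probabilities concrete, here is a minimal sketch of uniform-mass binning in Python. It is an illustration under stated assumptions, not the paper's construction: the function names (`uniform_mass_bins`, `bin_index`, `binned_intervals`) are hypothetical, the intervals use a plain Hoeffding bound with a union bound over bins rather than whatever bound the paper derives, and the bin edges are assumed to come from a data split that is independent of the split used to estimate the per-bin frequencies.

```python
import math


def uniform_mass_bins(scores, n_bins):
    """Interior bin edges chosen so each bin holds roughly the same number of scores.

    Assumption: these scores come from a split that is NOT reused for the intervals.
    """
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def bin_index(score, edges):
    """Index of the bin containing `score` (the number of edges at or below it)."""
    return sum(1 for e in edges if score >= e)


def binned_intervals(scores, labels, edges, alpha=0.1):
    """Per-bin label frequency with a Hoeffding interval.

    Returns a list of (estimate, lower, upper), one tuple per bin. The level
    alpha is split evenly across bins (union bound), so all intervals are meant
    to hold simultaneously with probability at least 1 - alpha.
    """
    n_bins = len(edges) + 1
    counts = [0] * n_bins
    positives = [0] * n_bins
    for s, y in zip(scores, labels):
        b = bin_index(s, edges)
        counts[b] += 1
        positives[b] += y

    results = []
    for c, p in zip(counts, positives):
        if c == 0:
            # No calibration points in this bin: only the trivial interval is valid.
            results.append((0.5, 0.0, 1.0))
            continue
        mean = p / c
        half_width = math.sqrt(math.log(2 * n_bins / alpha) / (2 * c))
        results.append((mean, max(0.0, mean - half_width), min(1.0, mean + half_width)))
    return results


# Usage: edges from one split of classifier scores, intervals from another.
edge_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
cal_scores = [0.15, 0.25, 0.35, 0.45, 0.65, 0.75, 0.85, 0.95]
cal_labels = [0, 0, 1, 0, 1, 1, 0, 1]
edges = uniform_mass_bins(edge_scores, n_bins=2)
for estimate, lower, upper in binned_intervals(cal_scores, cal_labels, edges):
    print(f"estimate={estimate:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

The calibrated prediction for a new point is the estimate of the bin its score falls into, so the recalibrated scoring function takes only finitely many values, which is consistent with the countable level-set requirement stated above. With so few points per bin the intervals are very wide; the half-width shrinks at rate one over the square root of the bin count.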


research
07/15/2021

A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

Black-box machine learning methods are now routinely used in hi...
research
04/20/2020

Is distribution-free inference possible for binary regression?

For a regression problem with a binary label response, we examine the pr...
research
03/15/2017

Online Learning for Distribution-Free Prediction

We develop an online learning method for prediction, which is important ...
research
07/09/2020

Predictive Value Generalization Bounds

In this paper, we study a bi-criterion framework for assessing scoring f...
research
03/11/2022

Distribution-free Prediction Sets Adaptive to Unknown Covariate Shift

Predicting sets of outcomes – instead of unique outcomes – is a promisin...
research
12/12/2019

Calibrated model-based evidential clustering using bootstrapping

Evidential clustering is an approach to clustering in which cluster-memb...
research
08/02/2023

Beta-trees: Multivariate histograms with confidence statements

Multivariate histograms are difficult to construct due to the curse of d...
