Calibrated Learning to Defer with One-vs-All Classifiers

02/08/2022
by Rajeev Verma, et al.

The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input, the system can defer the decision to a human if the human is more likely than the model to take the correct action. We study the calibration of L2D systems, investigating whether the probabilities they output are sound. We find that Mozannar and Sontag's (2020) multiclass framework is not calibrated with respect to expert correctness. Moreover, it is not even guaranteed to produce valid probabilities, because its parameterization is degenerate for this purpose. We propose an L2D system based on one-vs-all classifiers that is able to produce calibrated probabilities of expert correctness. Furthermore, our loss function, like Mozannar and Sontag's (2020), is a consistent surrogate for multiclass L2D. Our experiments verify that our system is not only calibrated but also that this benefit comes at no cost to accuracy: our model's accuracy is always comparable to (and often better than) Mozannar and Sontag's (2020) in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions.
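To make the one-vs-all idea concrete, the sketch below shows how an OvA-style L2D system could turn per-class logits plus an extra "defer" logit into a decision. Each sigmoid is read as an independent probability estimate, which is what allows the expert-correctness head to be calibrated on its own. This is an illustrative sketch under our own naming assumptions (`class_logits`, `expert_logit`, `ova_l2d_decision`), not the paper's exact formulation or code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ova_l2d_decision(class_logits, expert_logit):
    """Decide whether to predict a class or defer to the human expert.

    class_logits: OvA logits g_1..g_K, one per class.
    expert_logit: logit of the extra 'defer' head; sigmoid of it is
                  interpreted as P(expert is correct | x).
    Returns ("defer", p_expert) or ("predict", class_index, p_class).
    """
    p_classes = sigmoid(np.asarray(class_logits, dtype=float))
    p_expert = float(sigmoid(expert_logit))
    # Defer when the expert is estimated to be more likely correct
    # than the model's most confident class.
    if p_expert > p_classes.max():
        return ("defer", p_expert)
    k = int(p_classes.argmax())
    return ("predict", k, float(p_classes[k]))
```

For example, confident class logits with a weak defer logit yield a model prediction, while weak class logits with a strong defer logit hand the input to the expert.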


