DeepAI AI Chat
Log In Sign Up

Estimating Classification Confidence Using Kernel Densities

by   Peter Salamon, et al.

This paper investigates the post-hoc calibration of confidence for "exploratory" machine learning classification problems. The difficulty in these problems stems from the continuing desire to push the boundaries of which categories have enough examples to generalize from when curating datasets, and confusion regarding the validity of those categories. We argue that for such problems the "one-versus-all" approach (top-label calibration) must be used rather than the "calibrate-the-full-response-matrix" approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation. Chief among these methods is the use of kernel density ratios for confidence calibration including a novel, bulletproof algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, should be based only on the test dataset, and should be sanity-checked visually.


page 4

page 16

page 18

page 20

page 23

page 27

page 34

page 35


Post-hoc Uncertainty Calibration for Domain Drift Scenarios

We address the problem of uncertainty calibration. While standard deep n...

Calibrating Deep Neural Network Classifiers on Out-of-Distribution Datasets

To increase the trustworthiness of deep neural network (DNN) classifiers...

On the Usefulness of the Fit-on-the-Test View on Evaluating Calibration of Classifiers

Every uncalibrated classifier has a corresponding true calibration map t...

Post-hoc Calibration of Neural Networks

Calibration of neural networks is a critical aspect to consider when inc...

Top-label calibration

We study the problem of post-hoc calibration for multiclass classificati...

Heterogeneous Calibration: A post-hoc model-agnostic framework for improved generalization

We introduce the notion of heterogeneous calibration that applies a post...

Post-hoc Models for Performance Estimation of Machine Learning Inference

Estimating how well a machine learning model performs during inference i...